OpenAI: GPT-5.4

by openai

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling high-context reasoning, coding, and multimodal analysis within the same workflow. The model delivers improved performance in coding, document understanding, tool use, and instruction following. It is designed as a strong default for both general-purpose tasks and software engineering, capable of generating production-quality code, synthesizing information across multiple sources, and executing complex multi-step workflows with fewer iterations and greater token efficiency.

975 claims submitted by 57 reviewers

Monitored by HumanJudge · Endpoint registered, 64 traces logged

Maintained by HumanJudge Admin

Enrolled in: Humans Evaluation Benchmark for AI Marketing and Content Generation, 日本文化のヒーロー | Japanese Culture Hero

Performance

Humans Evaluation Benchmark for AI Marketing and Content Generation 94%

855 votes · 54 flags · 55 reviewers

日本文化のヒーロー | Japanese Culture Hero 99%

120 votes · 1 flag · 4 reviewers
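The page does not say how the benchmark percentages are derived. One reading that matches both figures is that each score is the share of votes that were not flags, rounded to the nearest whole percent. The sketch below checks that assumption against the counts listed above. The `pass_rate` helper and the formula are hypothetical, not a documented HumanJudge method.

```python
def pass_rate(votes: int, flags: int) -> float:
    """Percentage of votes that were not flags (assumed scoring formula)."""
    if votes <= 0:
        raise ValueError("votes must be positive")
    return 100.0 * (votes - flags) / votes


# (votes, flags) pairs copied from the Performance section.
benchmarks = {
    "Humans Evaluation Benchmark for AI Marketing and Content Generation": (855, 54),
    "Japanese Culture Hero": (120, 1),
}

for name, (votes, flags) in benchmarks.items():
    # Round to a whole percent, as the page displays scores.
    print(f"{name}: {pass_rate(votes, flags):.0f}%")

# The per-benchmark vote counts also sum to the total number of claims submitted.
total_votes = sum(votes for votes, _ in benchmarks.values())
print(f"Total votes: {total_votes}")
```

Under this assumption the two benchmarks come out at 94% and 99%, and the vote counts sum to the 975 claims reported at the top of the page. The per-benchmark reviewer counts (55 and 4) add up to more than the 57 reviewers listed, which would be consistent with some reviewers taking part in both benchmarks.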

Independent Claims

flag AI Marketing & Content Generation 7/22/2026

According to the prompt, the "What you'll learn" bullet point should be included in that 2-sentence summary. The respo...

— Zehong Hu

flag AI Marketing & Content Generation 7/20/2026

"The AI response includes unnecessary conversational filler at both the beginning ('Absolutely — here’s a...') and the e...

— Thuy Hang Vo

pass AI Marketing & Content Generation 7/20/2026

"The AI response satisfies all constraints of the prompt flawlessly. It delivers a witty, relatable, and painfully funny...

— Thuy Hang Vo

pass AI Marketing & Content Generation 7/20/2026

The AI response is exceptionally well-written and fits the prompt's requirements perfectly. It effectively utilizes shor...

— Thuy Hang Vo

flag AI Marketing & Content Generation 7/18/2026

AI refuses to engage on this issue.

— Bécaye Guindo

This evaluation was conducted independently. OpenAI: GPT-5.4 did not participate in or pay for this evaluation. All verdicts come from double-blind evaluation — reviewers did not know which AI produced each response.

We help people define what trustworthy AI looks like — publicly, transparently, together.