OpenAI unveils GPT-5.4 with Pro and Thinking model variants

OpenAI introduces GPT-5.4 with Pro and Thinking versions, offering stronger reasoning, improved coding ability, and advanced AI performance for developers and businesses.

Shivangi Yadav

Mar 8, 2026 - 12:48

On Thursday, OpenAI introduced GPT-5.4, a new foundation model that the company describes as “our most capable and efficient frontier model for professional work.” Alongside the standard release, OpenAI is also offering GPT-5.4 in a reasoning-focused version, GPT-5.4 Thinking, and a high-performance variant, GPT-5.4 Pro.

In the API, the model will be available with context windows of up to 1 million tokens, by far the largest context window OpenAI has offered so far.

OpenAI also highlighted improved token efficiency, noting that GPT-5.4 completed the same tasks with significantly fewer tokens than its predecessor.

The new release delivers notably stronger benchmark performance, including record-setting results on the OSWorld-Verified and WebArena Verified computer-use benchmarks. GPT-5.4 also achieved a record score of 83% on OpenAI’s GDPval benchmark, which measures performance on knowledge-work tasks.

GPT-5.4 also moved into the top position on Mercor’s APEX-Agents benchmark, which is intended to evaluate professional capabilities in fields such as law and finance, according to a statement from Mercor CEO Brendan Foody.

“[GPT-5.4] excels at creating long-horizon deliverables such as slide decks, financial models, and legal analysis,” Foody said in the statement, “delivering top performance while running faster and at a lower cost than competitive frontier models.”

GPT-5.4 continues OpenAI’s broader push to reduce hallucinations and factual mistakes. The company said the new model was 33% less likely to make errors in individual claims than GPT-5.2, and complete responses were 18% less likely to contain errors overall.

As part of the rollout, OpenAI has redesigned how the GPT-5.4 API handles tool calling, introducing a new system called Tool Search. Previously, system prompts had to include definitions for every available tool when the model was invoked, a setup that could consume a large number of tokens as the tool count increased. Under the new approach, the model can retrieve tool definitions only when it needs them, making requests faster and cheaper in systems with many tools.
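
The announcement as reported here does not say how Tool Search is implemented or exposed in the API. Purely as an illustration of the general idea, the sketch below compares sending every tool definition up front with looking definitions up on demand. The `ToolRegistry` class, its `search` method, the sample tools, and the word-count token estimate are all hypothetical and are not part of OpenAI's API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolRegistry:
    """Hypothetical in-memory store of tool definitions (not OpenAI's API)."""
    tools: dict = field(default_factory=dict)

    def register(self, name, description, parameters):
        self.tools[name] = {
            "name": name,
            "description": description,
            "parameters": parameters,
        }

    def search(self, query, limit=2):
        """Return definitions whose name or description shares a word with the query."""
        words = set(query.lower().split())
        scored = []
        for tool in self.tools.values():
            text = (tool["name"].replace("_", " ") + " " + tool["description"]).lower()
            overlap = len(words & set(text.split()))
            if overlap:
                scored.append((overlap, tool["name"], tool))
        # Highest overlap first; ties broken alphabetically by tool name.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [tool for _, _, tool in scored[:limit]]


def rough_token_count(definitions):
    """Crude stand-in for a tokenizer: count whitespace-separated pieces of the JSON."""
    return len(json.dumps(definitions).split())


registry = ToolRegistry()
registry.register("get_weather", "Look up the current weather for a city", {"city": "string"})
registry.register("send_email", "Send an email message to a recipient", {"to": "string", "body": "string"})
registry.register("create_invoice", "Create an invoice for a customer", {"customer": "string", "amount": "number"})

# Old approach: every definition travels with every request.
all_definitions = list(registry.tools.values())
# Deferred approach: only definitions relevant to the request are retrieved.
needed = registry.search("what is the weather in Paris")

print("upfront:", rough_token_count(all_definitions))
print("on demand:", rough_token_count(needed))
```

With only three tools the gap is small. The saving described in the article would matter most as the number of registered tools grows, since the upfront cost scales with the whole registry while the on-demand cost scales only with the tools a request actually needs.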

OpenAI has also added a new safety evaluation that tests the models' chain of thought, the running commentary that reasoning models generate to show how they work through multi-step tasks. AI safety researchers have long been concerned that reasoning models might misrepresent their chain of thought, and testing has shown that this can occur under certain conditions.

According to OpenAI, the new evaluation shows that deception is less likely in the Thinking version of GPT-5.4, “suggesting that the model cannot hide its reasoning and that CoT monitoring remains an effective safety tool.”

Shivangi Yadav reports on startups, technology policy, and other significant technology-focused developments in India for TechAmerica.Ai. She previously worked as a research intern at ORF.