ChatGPT Images 2.0 shows strong improvement in rendering accurate text

ChatGPT’s Images 2.0 model delivers notable improvements in generating clear, readable text within images, enhancing design, content, and visual workflows.

Shivangi Yadav

Apr 25, 2026 - 18:32

It used to be fairly easy to tell the difference between human-made and AI-generated visuals. Just a couple of years ago, image generation tools struggled with text so badly that even simple designs came out distorted. A Mexican restaurant menu, for example, might show garbled versions of items like "enchilada," "churros," "burrito," or "margarita" instead of correctly spelt dishes.

That limitation has now been significantly reduced with the launch of OpenAI's new ChatGPT Images 2.0 system. In recent tests, the model generated a Mexican food menu that looked realistic enough to use in a real restaurant, with only minor details, such as unusual prices like $13.50 for ceviche, hinting it was AI-generated.

For comparison, earlier models such as DALL·E 3 produced noticeably weaker text accuracy, particularly when integrated into ChatGPT, which did not natively generate images at the time.

AI image systems have long struggled to render text accurately because of the way diffusion models work. These systems typically reconstruct images from noisy input, focusing more on visual patterns than on precise lettering. As explained by Asmelash Teka Hadgu, founder and CEO of Lesan AI, in 2024, written content occupies only a small portion of an image, making it harder for models to prioritise accurate spelling during generation.
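Hadgu's point about text occupying only a small portion of an image can be illustrated with simple arithmetic. The sketch below is an illustration only, not a description of how any real model is trained: it assumes a hypothetical loss that weights every pixel equally, and the function name `text_loss_share` and the image and label sizes are invented for the example.

```python
def text_loss_share(image_w: int, image_h: int, text_w: int, text_h: int) -> float:
    """Return the fraction of image pixels covered by a rectangular text region.

    Under the simplifying assumption of a uniform per-pixel loss, this is also
    the share of the training signal that comes from the lettering.
    """
    total_pixels = image_w * image_h
    text_pixels = text_w * text_h
    return text_pixels / total_pixels


# Hypothetical sizes: a 1024x1024 image containing a 256x64 text label.
share = text_loss_share(1024, 1024, 256, 64)
print(f"Text covers {share:.4f} of the image")
```

With these assumed sizes the lettering covers under 2% of the pixels, so a model could misspell the label while still matching almost all of the image.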

Researchers have since explored alternative methods, including autoregressive approaches, which predict images in a way more similar to how large language models generate text.

OpenAI did not clarify during a recent press briefing what exact architecture powers Images 2.0. However, the company said the model includes "thinking capabilities," allowing it to refine outputs, generate multiple image variations from a single prompt, and even verify its own results. These enhancements enable more complex outputs, such as marketing materials in different formats and multi-panel comic strips.

The system is also designed to handle non-Latin scripts better, improving rendering accuracy for languages such as Japanese, Korean, Hindi, and Bengali. However, its knowledge cutoff of December 2025 may limit accuracy in highly recent contexts.

According to OpenAI, "Images 2.0 brings an unprecedented level of specificity and fidelity to image creation. It can not only conceptualise more sophisticated images, but it actually brings that vision to life effectively, able to follow instructions, preserve requested details, and render the fine-grained elements that often break image models: small text, iconography, UI elements, dense compositions, and subtle stylistic constraints, all at up to 2K resolution."

The added capability comes at some cost in speed: image generation under the new system takes slightly longer than a standard text response, with complex outputs such as comics requiring a few minutes to generate.

All ChatGPT and Codex users will gain access to Images 2.0 starting Tuesday, while paid users will receive expanded capabilities. The company is also releasing the gpt-image-2 API, with pricing based on output quality and resolution.

Shivangi Yadav reports on startups, technology policy, and other significant technology-focused developments in India for TechAmerica.Ai. She previously worked as a research intern at ORF.