Why AI Chatbots Hallucinate, According to OpenAI Researchers
OpenAI researchers argue that hallucinations in large language models arise because models are trained and evaluated in ways that reward guessing over admitting uncertainty. The proposed fix: redesign evaluation metrics so they stop rewarding confident wrong answers.
OpenAI researchers have identified a key reason why large language models (LLMs) "hallucinate," generating inaccurate information and presenting it as fact. These hallucinations remain a major challenge for AI chatbots, affecting even the most advanced models, including OpenAI's GPT-5 and Anthropic's Claude.
According to a paper OpenAI released on Thursday, the underlying cause of hallucinations lies in how LLMs are trained: they are optimized to guess when uncertain rather than admit uncertainty. This "fake it till you make it" approach helps models score well on tests, but it leads them to state false information as fact when they should instead say they don't know.
A hallucination occurs when a chatbot confidently generates incorrect information and presents it as truth. The problem has persisted because language models are trained primarily to score well on evaluation metrics that reward producing an answer, even when that answer is uncertain or inaccurate.
OpenAI's researchers emphasized that evaluations used to assess language models tend to penalize models for expressing uncertainty. As a result, models like GPT-5 are incentivized to guess when they don’t have enough information, which leads to the creation of hallucinated content. “Language models are optimized to be good test-takers, and guessing when uncertain improves test performance,” the researchers wrote.
Lack of Uncertainty in AI Models
The problem is compounded by the fact that LLMs are always in "test-taking mode," essentially viewing every question as a binary choice — right or wrong, black or white. This contrasts with the real world, where uncertainty and ambiguity are much more common. Humans often learn how to express uncertainty over time, but AI chatbots are primarily trained using tests that penalize them for doing so.
As the OpenAI researchers pointed out, the issue lies with the evaluation systems used to grade these models: current evaluation methods encourage guessing even when the model lacks the information it needs. "The root problem is the abundance of evaluations that are not aligned," the researchers explained, arguing that the primary evaluations must be redesigned so they no longer push models toward guessing when they're uncertain.
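To make the incentive concrete, here is a minimal sketch of why guessing wins under plain accuracy scoring (an illustration with an arbitrary success probability, not code or numbers from the paper): even a long-shot guess has some chance of earning credit, while answering "I don't know" is guaranteed to earn none.

```python
# Toy illustration (not from OpenAI's paper): under accuracy-only grading,
# a wrong answer costs nothing, so guessing always matches or beats abstaining.
p_correct = 0.25  # hypothetical chance that a blind guess happens to be right

expected_if_guessing = p_correct * 1 + (1 - p_correct) * 0  # wrong answers simply score 0
expected_if_abstaining = 0.0                                # "I don't know" also scores 0

print(expected_if_guessing > expected_if_abstaining)  # True whenever p_correct > 0
```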
A Potential Fix: Redesigning Evaluation Metrics
The good news, according to OpenAI's researchers, is that there is a fix: redesigning evaluation metrics. OpenAI proposes updating the widely used accuracy-based evaluations so that they discourage guessing. As long as evaluations keep rewarding lucky guesses, models will keep learning to guess rather than admit uncertainty.
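As a rough sketch of what such a redesign could look like (a hypothetical scoring rule, not OpenAI's actual benchmark code), the snippet below penalizes confident wrong answers while giving abstentions a neutral score, which flips the expected payoff in favor of admitting uncertainty.

```python
# Hypothetical scoring rule for illustration: correct answers earn 1 point,
# abstentions earn 0, and wrong answers are penalized, so blind guessing
# no longer has a positive expected payoff.
def score(outcome: str, wrong_penalty: float = 1.0) -> float:
    """outcome is 'correct', 'wrong', or 'abstain'."""
    if outcome == "abstain":
        return 0.0
    return 1.0 if outcome == "correct" else -wrong_penalty

p_correct = 0.25  # same hypothetical guess-success rate as above
expected_if_guessing = p_correct * score("correct") + (1 - p_correct) * score("wrong")
expected_if_abstaining = score("abstain")

print(expected_if_guessing, expected_if_abstaining)  # -0.5 vs 0.0: abstaining now wins
```

Under this kind of scheme, the guessing strategy that maximizes plain accuracy becomes a losing one, which is the behavioral change the researchers want evaluations to reward.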
OpenAI's Next Steps
The path to fewer hallucinations, the researchers argue, runs through better training methods and evaluations aligned with how AI models should behave in real-world situations. OpenAI says it is working on redesigning its evaluation processes to discourage guessing and improve the overall reliability of AI chatbots.
While OpenAI has made strides on this front, hallucinations remain a significant challenge for AI systems. Moving forward, the company aims to update its evaluation systems to better reflect how models should perform in real-world use.