AI Startup Probably Secures $9M to Improve AI Accuracy and Reduce Hallucinations

AI startup Probably has raised $9 million in seed funding to develop technology that reduces AI hallucinations and improves response accuracy with advanced validation systems and transparent audit trails.

Shivangi Yadav

Jun 28, 2026 - 14:12

As large language models continue to become more capable, one challenge has remained difficult to eliminate: hallucinations. Even the most advanced AI systems still generate incorrect or fabricated information. While developers have introduced methods to reduce these mistakes, the industry is still searching for a dependable long-term solution.

AI startup Probably believes it has found a more rigorous way to address the problem. The company has raised $9 million in seed funding led by Andreessen Horowitz to build technology designed to prevent AI-generated errors before they ever reach users.

According to founder Peter Elias, the company’s objective is to prevent hallucinations and factual inaccuracies in AI responses while achieving reliability levels closer to the 99.99% accuracy expected of traditional deterministic software systems. Reaching that standard, he says, requires rethinking many of the assumptions behind today’s AI development.

Probably’s first product is a data science platform built to generate fast insights from complex datasets. Every response includes both source citations and a complete audit trail explaining how each conclusion was produced, a capability that has become increasingly important for enterprise AI applications.

To ensure accuracy, the company built what Elias describes as a “data science mech suit” around the language model. Instead of relying solely on the AI’s first response, each answer is immediately checked by a deterministic validation system. If the output does not match the underlying dataset, it is rejected and corrected before being delivered. The language model itself has also been trained alongside this validation layer, allowing the overall system to prioritise both speed and precision.
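The article does not describe how Probably implements this check, so the following is only an illustrative sketch of the general pattern: a model proposes a figure, deterministic code recomputes it from the dataset, and a mismatch is rejected and corrected with an audit trail. Every name here (`Claim`, `Verdict`, `validate`, the `METRICS` table) is hypothetical and not taken from the company's product.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable

# Hypothetical structure for a numeric statement produced by a language model.
@dataclass
class Claim:
    metric: str    # name of the aggregate, e.g. "mean"
    column: str    # dataset column the claim refers to
    value: float   # the number the model asserted

# Hypothetical result of validation, including a human-readable audit trail.
@dataclass
class Verdict:
    accepted: bool
    value: float
    audit: list[str] = field(default_factory=list)

# Deterministic functions used to recompute each supported metric.
METRICS: dict[str, Callable[[list[float]], float]] = {
    "mean": mean,
    "sum": sum,
    "min": min,
    "max": max,
}

def validate(claim: Claim, rows: list[dict[str, float]], tol: float = 1e-9) -> Verdict:
    """Recompute the claimed metric from the data and accept or correct the claim."""
    values = [row[claim.column] for row in rows]
    expected = METRICS[claim.metric](values)
    audit = [f"recomputed {claim.metric}({claim.column}) over {len(values)} rows = {expected}"]
    if abs(expected - claim.value) <= tol:
        audit.append("model value matches dataset; accepted")
        return Verdict(True, claim.value, audit)
    # Mismatch: reject the model's number and deliver the recomputed one instead.
    audit.append(f"model value {claim.value} rejected; replaced with {expected}")
    return Verdict(False, expected, audit)

rows = [{"revenue": 100.0}, {"revenue": 200.0}, {"revenue": 300.0}]
verdict = validate(Claim("mean", "revenue", 250.0), rows)
print(verdict.accepted, verdict.value)
for step in verdict.audit:
    print(step)
```

In this toy version the validator, not the model, has the final say on the number that is delivered, and the `audit` list records how that number was derived. A real system would need to handle far more than simple aggregates.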

“What we discovered is that the stronger your validation framework becomes, the less powerful the AI model actually needs to be,” Elias said. “When you provide clear context and remove ambiguity, the model doesn’t have to work nearly as hard to produce the correct answer.”

That approach allows Probably’s software to operate using significantly smaller AI models than many competitors. Elias says the current platform runs on a model that is “four classes weaker than frontier models,” making it efficient enough to run on local desktop hardware instead of relying on large cloud-based data centres. As a result, customers can significantly reduce token costs for running AI workloads.

The strategy comes as many businesses are looking for ways to control rising AI operating expenses while improving reliability.

Although the company’s first application focuses on data science, Elias believes the same validation framework can eventually support other precision-critical industries, including accounting, healthcare, financial analysis, and other fields where factual accuracy is essential.

“I think it’s interesting that the major AI labs haven’t seriously pursued this direction,” Elias said. “They’re not necessarily incentivised to reduce corrections because repeated interactions generate more usage. Our focus is on making sure the answer is right the first time.”

Shivangi Yadav reports on startups, technology policy, and other significant technology-focused developments in India for TechAmerica.Ai. She previously worked as a research intern at ORF.