OpenAI's evaluation shows rival Anthropic's Claude models hallucinate less

29 Aug 2025, 11:43 AM

However, in jailbreaking evaluations, Claude models performed less well than OpenAI o3 and OpenAI o4-mini.

Team Head&Tale

Rival AI companies OpenAI and Anthropic came together to evaluate each other's publicly released models, and one of the findings shows that Anthropic's Claude models hallucinated less than OpenAI's models.

The two companies said the goal of this mutual external evaluation is to demonstrate how labs can collaborate on issues of safety and alignment.

"We believe this approach supports accountable and transparent evaluation, helping to ensure that each lab’s models continue to be tested against new and challenging scenarios," said OpenAI in a blog.

On hallucination evaluations, Claude models had an extremely high refusal rate, as much as 70%. This indicates that the Claude models are aware of their uncertainty and often avoid making inaccurate statements.

"However, the high refusal rate limits utility, and the overall accuracy rate for the examples in these evaluations where the models did choose to answer is still low," it noted.

By contrast, OpenAI o3 and OpenAI o4-mini showed lower refusal rates but higher hallucination rates in a challenging setting that restricted tool use such as browsing.

In jailbreaking evaluations, which focus on the general robustness of trained-in safeguards, Claude models performed less well than OpenAI o3 and OpenAI o4-mini, it noted.