Meta, Google, and OpenAI allegedly exploited undisclosed private testing on Chatbot Arena to secure top rankings, raising concerns about fairness and transparency in AI model benchmarking. A handful ...
A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI benchmark Chatbot Arena, of helping a select group of AI companies achieve ...
In 2025, frontier AI model performance converged dramatically, with benchmarks like LMSYS’s Chatbot Arena revealing near-parity among top systems—a seismic shift suggesting excellence ...
The rapid proliferation of AI chatbots has made it difficult to know which models are actually improving and which are falling behind. Traditional academic benchmarks only tell you so much, which has ...