Ars Technica

Technology

New study accuses LM Arena of gaming its popular AI benchmark

Summary
Nutrition label

84% Informative

LM Arena was created in 2023 as a research project at UC Berkeley.

Users feed a prompt into two AI models in the "Chatbot Arena", compare the outputs, and vote for the one they prefer.

These votes are aggregated into a leaderboard that shows which models people like the most.

Google's Gemini 2.5 Pro debuted at the top of the LM Arena leaderboard this year.

VR Score: 88
Informative language: 90
Neutral language: 50
Article tone: formal
Language: English
Language complexity: 52
Offensive language: not offensive
Hate speech: not hateful
Attention-grabbing headline: not detected
Known propaganda techniques: not detected
Time-value: medium-lived
Affiliate links: no affiliate links