# AI Model Leaderboards

Explore various benchmarks and see how different AI models perform against each other.
## Intelligence Index by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 68.95 | OpenAI |
| 2 | GPT-5 (medium) | 67.53 | OpenAI |
| 3 | Grok 4 | 67.52 | xAI |
| 4 | o3 | 67.07 | OpenAI |
| 5 | o3-pro | 67.07 | OpenAI |
| 6 | o4-mini (high) | 65.05 | OpenAI |
| 7 | Gemini 2.5 Pro | 64.63 | Google |
| 8 | Gemini 2.5 Pro (05-06) | 64.63 | Google |
| 9 | Gemini 2.5 Pro (03-25) | 64.63 | Google |
| 10 | GPT-5 mini | 63.70 | OpenAI |
## Coding Index by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | Grok 4 | 63.81 | xAI |
| 2 | o4-mini (high) | 63.48 | OpenAI |
| 3 | Gemini 2.5 Pro | 61.46 | Google |
| 4 | Gemini 2.5 Pro (05-06) | 61.46 | Google |
| 5 | Gemini 2.5 Pro (03-25) | 61.46 | Google |
| 6 | Qwen3 235B 2507 (Reasoning) | 60.60 | Alibaba |
| 7 | o3-pro | 59.69 | OpenAI |
| 8 | o3 | 59.69 | OpenAI |
| 9 | Gemini 2.5 Pro (May '25) | 59.29 | Google |
| 10 | DeepSeek R1 0528 | 58.66 | DeepSeek |
## Math Index by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 97.53 | OpenAI |
| 2 | Grok 4 | 96.67 | xAI |
| 3 | o4-mini (high) | 96.43 | OpenAI |
| 4 | Grok 3 mini Reasoning (high) | 96.27 | xAI |
| 5 | Qwen3 235B 2507 (Reasoning) | 96.20 | Alibaba |
| 6 | GPT-5 (medium) | 95.40 | OpenAI |
| 7 | o3 | 94.77 | OpenAI |
| 8 | o3-pro | 94.77 | OpenAI |
| 9 | DeepSeek R1 0528 | 93.80 | DeepSeek |
| 10 | Gemini 2.5 Pro (05-06) | 92.70 | Google |
## LCR (Long Context Reasoning) by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 0.76 | OpenAI |
| 2 | GPT-5 (medium) | 0.73 | OpenAI |
| 3 | o3 | 0.69 | OpenAI |
| 4 | o3-mini | 0.69 | OpenAI |
| 5 | o3-pro | 0.69 | OpenAI |
| 6 | o3-mini (high) | 0.69 | OpenAI |
| 7 | Grok 4 | 0.68 | xAI |
| 8 | Qwen3 235B 2507 (Reasoning) | 0.67 | Alibaba |
| 9 | Gemini 2.5 Pro | 0.66 | Google |
| 10 | Gemini 2.5 Pro (05-06) | 0.66 | Google |
## Intelligence vs. Price by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 68.95 | OpenAI |
| 2 | GPT-5 (medium) | 67.53 | OpenAI |
| 3 | Grok 4 | 67.52 | xAI |
| 4 | o3-pro | 67.07 | OpenAI |
| 5 | o3-mini | 67.07 | OpenAI |
| 6 | o3 | 67.07 | OpenAI |
| 7 | o4-mini (high) | 65.05 | OpenAI |
| 8 | Gemini 2.5 Pro (05-06) | 64.63 | Google |
| 9 | Gemini 2.5 Pro | 64.63 | Google |
| 10 | Gemini 2.5 Pro (03-25) | 64.63 | Google |
## Intelligence vs. Output Speed by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 69.00 | OpenAI |
| 2 | GPT-5 (medium) | 67.50 | OpenAI |
| 3 | Grok 4 | 67.50 | xAI |
| 4 | o3 | 67.10 | OpenAI |
| 5 | o3-mini | 67.10 | OpenAI |
| 6 | o3-pro | 67.10 | OpenAI |
| 7 | o4-mini (high) | 65.00 | OpenAI |
| 8 | Gemini 2.5 Pro | 64.60 | Google |
| 9 | Gemini 2.5 Pro (05-06) | 64.60 | Google |
| 10 | Gemini 2.5 Pro (03-25) | 64.60 | Google |
## Intelligence vs. Seconds to Output 500 Tokens (including reasoning model 'thinking' time) by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 69.00 | OpenAI |
| 2 | Grok 4 | 67.50 | xAI |
| 3 | GPT-5 (medium) | 67.50 | OpenAI |
| 4 | o3-pro | 67.10 | OpenAI |
| 5 | o3-mini | 67.10 | OpenAI |
| 6 | o3 | 67.10 | OpenAI |
| 7 | o4-mini (high) | 65.00 | OpenAI |
| 8 | Gemini 2.5 Pro (05-06) | 64.60 | Google |
| 9 | Gemini 2.5 Pro | 64.60 | Google |
| 10 | Gemini 2.5 Pro (03-25) | 64.60 | Google |
## Tokens used to run all evaluations in the Artificial Analysis Intelligence Index by Gemini Endpoint

No scores available for this leaderboard yet.
## Intelligence vs. Intelligence by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 69.00 | OpenAI |
| 2 | Grok 4 | 67.50 | xAI |
| 3 | GPT-5 (medium) | 67.50 | OpenAI |
| 4 | o3-mini | 67.10 | OpenAI |
| 5 | o3 | 67.10 | OpenAI |
| 6 | o3-pro | 67.10 | OpenAI |
| 7 | o4-mini (high) | 65.00 | OpenAI |
| 8 | Gemini 2.5 Pro | 64.60 | Google |
| 9 | Gemini 2.5 Pro (05-06) | 64.60 | Google |
| 10 | Gemini 2.5 Pro (03-25) | 64.60 | Google |
## Intelligence vs. Context Window by Gemini Endpoint

| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 69.00 | OpenAI |
| 2 | GPT-5 (medium) | 67.50 | OpenAI |
| 3 | Grok 4 | 67.50 | xAI |
| 4 | o3-pro | 67.10 | OpenAI |
| 5 | o3-mini | 67.10 | OpenAI |
| 6 | o3 | 67.10 | OpenAI |
| 7 | o4-mini (high) | 65.00 | OpenAI |
| 8 | Gemini 2.5 Pro | 64.60 | Google |
| 9 | Gemini 2.5 Pro (05-06) | 64.60 | Google |
| 10 | Gemini 2.5 Pro (03-25) | 64.60 | Google |
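All of the leaderboards above share the same four-column markdown table layout, which makes them easy to load programmatically for cross-benchmark comparison. A minimal sketch in plain Python (no third-party libraries; the sample rows are copied from the Intelligence Index table, and the `parse_leaderboard` helper name is our own):

```python
# Sketch: parse one of the markdown leaderboard tables above into
# structured rows. Sample data is from the Intelligence Index table.

TABLE = """\
| Rank | Model | Score | Organization |
|---|---|---|---|
| 1 | GPT-5 (high) | 68.95 | OpenAI |
| 2 | GPT-5 (medium) | 67.53 | OpenAI |
| 3 | Grok 4 | 67.52 | xAI |
"""

def parse_leaderboard(md: str) -> list[dict]:
    """Turn a markdown table into a list of row dicts, typing Rank and Score."""
    lines = [l for l in md.strip().splitlines() if l.strip()]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # lines[1] is the |---| separator row
        cells = [c.strip() for c in line.strip("|").split("|")]
        row = dict(zip(header, cells))
        row["Rank"] = int(row["Rank"])
        row["Score"] = float(row["Score"])
        rows.append(row)
    return rows

rows = parse_leaderboard(TABLE)
print(rows[0]["Model"], rows[0]["Score"])  # GPT-5 (high) 68.95
```

Once each table is parsed this way, the per-benchmark lists can be joined on the `Model` column to compare a model's rank across indices.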