AIME 2025 Benchmark Leaderboard

This benchmark comprises all 30 problems from the 2025 American Invitational Mathematics Examination (AIME I and AIME II). The problems test olympiad-level mathematical reasoning, and every answer is an integer from 000 to 999. It is used as an AI benchmark to evaluate how well large language models solve complex mathematical problems that require multi-step logical deduction and structured symbolic reasoning.
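
Because every answer is a single integer, a run can be scored by exact match. The sketch below is one plausible way to turn model outputs into a percentage; the function names, the answer normalization, and the averaging over repeated runs are illustrative assumptions, since this page does not state how the listed evaluations were scored.

```python
from typing import Optional, Sequence


def normalize_answer(raw: str) -> Optional[int]:
    """Parse a submitted answer into an integer in [0, 999], or None if invalid."""
    text = raw.strip()
    # Accept only plain ASCII digits, so "007" is valid but "-1" or "7.0" is not.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    return value if 0 <= value <= 999 else None


def accuracy(predictions: Sequence[str], reference: Sequence[int]) -> float:
    """Percentage of problems whose normalized prediction equals the reference."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have the same length")
    correct = sum(normalize_answer(p) == r for p, r in zip(predictions, reference))
    return 100.0 * correct / len(reference)


def mean_accuracy(runs: Sequence[Sequence[str]], reference: Sequence[int]) -> float:
    """Average the per-run accuracy over several independent runs (assumed protocol)."""
    return sum(accuracy(run, reference) for run in runs) / len(runs)
```

With 30 problems, a single run can only produce multiples of 1/30 (about 3.33 percentage points), so averaging over several runs is one way a finer-grained percentage could arise.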

Leaderboard

Top 50 models on the AIME 2025 benchmark, with scores taken from public evaluations.

  1. Grok-4 Heavy: 100.0% (rank 1)
  2. GPT-5.2: 100.0% (rank 1)
  3. Kimi K2-Thinking-0905: 100.0% (rank 1)
  4. GPT-5.2 Pro: 100.0% (rank 1)
  5. Gemini 3 Pro: 100.0% (rank 1)
  6. Claude Opus 4.6: 99.8% (rank 6)
  7. Gemini 3 Flash: 99.7% (rank 7)
  8. LongCat-Flash-Thinking-2601: 99.6% (rank 8)
  9. GPT-5.1 High: 99.6% (rank 8)
  10. Nemotron 3 Nano (30B A3B): 99.2% (rank 10)
  11. GPT OSS 20B High: 98.7% (rank 11)
  12. GPT-5.1 Medium: 98.4% (rank 12)
  13. Seed 2.0 Pro: 98.3% (rank 13)
  14. Step-3.5-Flash: 97.3% (rank 14)
  15. Sarvam-30B: 96.7% (rank 15)
  16. GPT-5.1 Codex High: 96.7% (rank 15)
  17. Sarvam-105B: 96.7% (rank 15)
  18. Kimi K2.5: 96.1% (rank 18)
  19. DeepSeek-V3.2-Speciale: 96.0% (rank 19)
  20. GLM-4.7: 95.7% (rank 20)
  21. GPT-5: 94.6% (rank 21)
  22. GPT-5 High: 94.6% (rank 21)
  23. MiMo-V2-Flash: 94.1% (rank 23)
  24. GPT-5.1 Thinking: 94.0% (rank 24)
  25. GPT-5.1: 94.0% (rank 24)
  26. GPT-5.1 Instant: 94.0% (rank 24)
  27. GLM-4.6: 93.9% (rank 27)
  28. Grok-3: 93.3% (rank 28)
  29. DeepSeek-V3.2 (Thinking): 93.1% (rank 29)
  30. DeepSeek-V3.2: 93.1% (rank 29)
  31. Seed 2.0 Lite: 93.0% (rank 31)
  32. K-EXAONE-236B-A23B: 92.8% (rank 32)
  33. o4-mini: 92.7% (rank 33)
  34. GPT OSS 120B High: 92.5% (rank 34)
  35. Qwen3-235B-A22B-Thinking-2507: 92.3% (rank 35)
  36. Grok 4 Fast: 92.0% (rank 36)
  37. Grok-4: 91.7% (rank 37)
  38. GLM-4.7-Flash: 91.6% (rank 38)
  39. Mercury 2: 91.1% (rank 39)
  40. GPT-5 mini: 91.1% (rank 39)
  41. Grok-3 Mini: 90.8% (rank 41)
  42. LongCat-Flash-Thinking: 90.6% (rank 42)
  43. Nemotron 3 Super (120B A12B): 90.2% (rank 43)
  44. Qwen3 VL 235B A22B Thinking: 89.7% (rank 44)
  45. DeepSeek-V3.2-Exp: 89.3% (rank 45)
  46. GPT-5 Medium: 88.9% (rank 46)
  47. Gemini 2.5 Pro Preview 06-05: 88.0% (rank 47)
  48. Qwen3-Next-80B-A3B-Thinking: 87.8% (rank 48)
  49. Step3-VL-10B: 87.7% (rank 49)
  50. DeepSeek-R1-0528: 87.5% (rank 50)
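
Tied scores in the list share a rank and the next rank skips ahead (five models at rank 1, then rank 6), which matches standard competition ranking. The sketch below shows how such ranks can be derived from scores; `competition_rank` is a hypothetical helper written for illustration, not the leaderboard's actual code.

```python
from typing import List, Tuple


def competition_rank(scores: List[Tuple[str, float]]) -> List[Tuple[int, str, float]]:
    """Assign "1224"-style ranks: ties share a rank, later ranks skip ahead."""
    # sorted() is stable, so tied models keep their input order.
    ordered = sorted(scores, key=lambda item: item[1], reverse=True)
    ranked: List[Tuple[int, str, float]] = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and score == ranked[-1][2]:
            rank = ranked[-1][0]  # same score as the previous entry: reuse its rank
        else:
            rank = position  # otherwise the rank is the 1-based position
        ranked.append((rank, name, score))
    return ranked


# Four entries taken from the list above, ranked within this small sample.
sample = [
    ("Gemini 3 Flash", 99.7),
    ("LongCat-Flash-Thinking-2601", 99.6),
    ("GPT-5.1 High", 99.6),
    ("Nemotron 3 Nano (30B A3B)", 99.2),
]
for rank, name, score in competition_rank(sample):
    print(f"{rank:>2}  {name}: {score:.1f}%")
```

Within this sample the ranks come out as 1, 2, 2, 4, the same tie pattern as ranks 7, 8, 8, 10 in the full list. Exact float equality is adequate here only because the scores are one-decimal values copied verbatim; computed scores would call for rounding before comparison.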

Models tracked

Models with aime-2025 in their evaluation profile.
