LiveCodeBench Benchmark Leaderboard

LiveCodeBench is a holistic, contamination-free benchmark for evaluating large language models on code. It continuously collects new problems from programming contests (LeetCode, AtCoder, CodeForces) and evaluates models in four scenarios: code generation, self-repair, code execution, and test output prediction. Each problem is annotated with its release date, so a model can be evaluated only on problems released after its training cutoff.
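
The release-date filtering described above can be sketched in a few lines. This is a minimal illustration, not LiveCodeBench's own code: the `Problem` record, its field names, the sample data, and the cutoff date are all hypothetical, and the score is a plain fraction of solved problems.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Problem:
    # Hypothetical record; field names are illustrative, not LiveCodeBench's schema.
    problem_id: str
    release_date: date
    solved: bool  # whether the model's single attempt passed all tests


def unseen_problems(problems, training_cutoff):
    """Keep only problems released strictly after the model's training cutoff."""
    return [p for p in problems if p.release_date > training_cutoff]


def pass_rate(problems):
    """Fraction of problems solved; returns None when the window is empty."""
    if not problems:
        return None
    return sum(p.solved for p in problems) / len(problems)


# Made-up sample data, for illustration only.
problems = [
    Problem("a", date(2024, 1, 10), True),
    Problem("b", date(2024, 6, 2), False),
    Problem("c", date(2024, 9, 15), True),
    Problem("d", date(2024, 11, 30), True),
]
cutoff = date(2024, 5, 1)  # assumed training cutoff

window = unseen_problems(problems, cutoff)
score = pass_rate(window)
print(f"{len(window)} unseen problems, pass rate {score:.1%}")
# prints "3 unseen problems, pass rate 66.7%"
```

Problem "a" predates the assumed cutoff, so it is excluded even though the model solved it. That exclusion is what keeps possibly memorized problems out of the score.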

Leaderboard

Top 50 models on the LiveCodeBench Benchmark Leaderboard (scores from public evaluations).

  1. DeepSeek-V4-Pro-Max: 93.5%
  2. DeepSeek-V4-Flash-Max: 91.6%
  3. DeepSeek-V3.2 (Thinking): 83.3%
  3. DeepSeek-V3.2: 83.3%
  5. MiniMax M2: 83.0%
  6. LongCat-Flash-Thinking-2601: 82.8%
  7. Nemotron 3 Super (120B A12B): 81.2%
  8. Grok-3 Mini: 80.4%
  9. Grok 4 Fast: 80.0%
  10. Grok-3: 79.4%
  10. Grok-4 Heavy: 79.4%
  10. LongCat-Flash-Thinking: 79.4%
  13. Grok-4: 79.0%
  14. MiniMax M2.1: 78.0%
  15. DeepSeek-V3.2-Exp: 74.1%
  16. DeepSeek-R1-0528: 73.3%
  17. GLM-4.5: 72.9%
  18. Nemotron Nano 9B v2: 71.1%
  19. Qwen3 235B A22B: 70.7%
  19. GLM-4.5-Air: 70.7%
  21. Gemini 2.5 Pro Preview 06-05: 69.0%
  22. Mercury 2: 67.0%
  23. Llama 3.1 Nemotron Ultra 253B v1: 66.3%
  24. Qwen3 32B: 65.7%
  25. MiniMax M1 80K: 65.0%
  26. Ministral 3 (14B Reasoning 2512): 64.6%
  27. Mistral Small 4: 63.6%
  28. QwQ-32B: 63.4%
  29. Qwen3 30B A3B: 62.6%
  30. MiniMax M1 40K: 62.3%
  31. Ministral 3 (8B Reasoning 2512): 61.6%
  32. DeepSeek R1 Distill Llama 70B: 57.5%
  33. DeepSeek R1 Distill Qwen 32B: 57.2%
  34. DeepSeek-V3.1: 56.4%
  35. Qwen2.5 72B Instruct: 55.5%
  36. Ministral 3 (3B Reasoning 2512): 54.8%
  37. Phi 4 Reasoning: 53.8%
  38. Kimi K2-Instruct-0905: 53.7%
  39. Phi 4 Reasoning Plus: 53.1%
  39. DeepSeek R1 Distill Qwen 14B: 53.1%
  41. Magistral Small 2506: 51.3%
  42. Magistral Medium: 50.3%
  43. QwQ-32B-Preview: 50.0%
  43. DeepSeek R1 Zero: 50.0%
  45. DeepSeek-V3 0324: 49.2%
  46. LongCat-Flash-Chat: 48.0%
  47. Llama 4 Maverick: 43.4%
  48. DeepSeek R1 Distill Llama 8B: 39.6%
  49. DeepSeek-V3: 37.6%
  49. DeepSeek R1 Distill Qwen 7B: 37.6%
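
The rank numbers in the list above are consistent with standard competition ranking: models with equal scores share a rank, and the next distinct score takes its position in the list (3, 3, 5 rather than 3, 3, 4). The page does not document its method, so this is a sketch based on that observation. The function name `competition_ranks` is illustrative.

```python
def competition_ranks(scores):
    """Assign 1-based ranks to scores already sorted in descending order.

    Tied scores share the rank of the first tied entry; the next distinct
    score is ranked by its position, so ranks skip after a tie.
    """
    ranks = []
    for position, score in enumerate(scores, start=1):
        if ranks and score == scores[position - 2]:
            ranks.append(ranks[-1])  # same score as the previous entry: reuse its rank
        else:
            ranks.append(position)
    return ranks


# First five scores from the leaderboard above.
top_scores = [93.5, 91.6, 83.3, 83.3, 83.0]
print(competition_ranks(top_scores))  # prints [1, 2, 3, 3, 5]
```

The comparison uses the scores as displayed (one decimal place). Two models whose unrounded scores differ slightly would still tie under this scheme.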
