ARC-AGI Benchmark Leaderboard

The Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) is a benchmark designed to test general intelligence and abstract reasoning through visual, grid-based transformation tasks. Each task provides 2-5 demonstration pairs, each showing an input grid transformed into an output grid according to the same underlying rule; the test-taker must infer that rule and apply it to novel test inputs. Grids are at most 30x30 cells, and each cell holds one of 10 discrete colors (symbols). The benchmark is intended to measure human-like general fluid intelligence and skill-acquisition efficiency with minimal prior knowledge.
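To make the task structure concrete, here is a minimal Python sketch of one task and of the "infer the rule from the demonstrations, then apply it" loop. The "train"/"test" dictionary layout follows the JSON format used by the public ARC task files; everything else is an assumption made for illustration: the toy task itself, the two candidate rules, and the helper names is_valid_grid and fits_demonstrations are hypothetical, and a real solver would have to search a far larger space of rules.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each cell is an integer 0-9, one per color

# Hypothetical toy task (not from the real corpus): the hidden rule
# mirrors every grid left to right.
task = {
    "train": [
        {"input": [[1, 0, 0], [2, 0, 0]], "output": [[0, 0, 1], [0, 0, 2]]},
        {"input": [[3, 4], [0, 5]], "output": [[4, 3], [5, 0]]},
    ],
    "test": [{"input": [[6, 0, 7]]}],
}


def is_valid_grid(grid: Grid) -> bool:
    """Check the stated constraints: rectangular, at most 30x30, cells in 0-9."""
    if not 1 <= len(grid) <= 30:
        return False
    width = len(grid[0])
    if not 1 <= width <= 30:
        return False
    return all(
        len(row) == width and all(0 <= cell <= 9 for cell in row)
        for row in grid
    )


def flip_horizontal(grid: Grid) -> Grid:
    """Candidate rule: reverse each row."""
    return [list(reversed(row)) for row in grid]


def transpose(grid: Grid) -> Grid:
    """Candidate rule: swap rows and columns."""
    return [list(column) for column in zip(*grid)]


def fits_demonstrations(rule: Callable[[Grid], Grid], task: dict) -> bool:
    """A rule is kept only if it reproduces every demonstration output exactly."""
    return all(rule(pair["input"]) == pair["output"] for pair in task["train"])


# Hypothetical, deliberately tiny hypothesis space.
candidates: Dict[str, Callable[[Grid], Grid]] = {
    "flip_horizontal": flip_horizontal,
    "transpose": transpose,
}

# Keep the rules consistent with all demonstration pairs.
consistent = [
    name for name, rule in candidates.items() if fits_demonstrations(rule, task)
]

# Apply the first surviving rule to each novel test input.
predictions = [candidates[consistent[0]](item["input"]) for item in task["test"]]
```

In this sketch only flip_horizontal survives the check against both demonstration pairs, so it is the rule applied to the test input. Comparing whole grids with exact equality is a simplifying choice here: a prediction counts only if every cell matches.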

Leaderboard

Top 7 models on the ARC-AGI Benchmark Leaderboard (scores from public evaluations).

Models tracked

Models with arc-agi in their evaluation profile.

  • No models linked yet.
