MCP Atlas Benchmark Leaderboard
MCP Atlas is a benchmark that evaluates AI models on tool use at scale, measuring how well they coordinate and use multiple tools across complex, multi-step tasks.
Leaderboard
Top 18 models on the MCP Atlas benchmark, ranked by score (scores from public evaluations).
- 1. Gemini 3.5 Flash: 83.6%
- 2. Claude Opus 4.7: 77.3%
- 3. GPT-5.5: 75.3%
- 4. Qwen3.6 Plus: 74.1%
- 5. DeepSeek-V4-Pro-Max: 73.6%
- 6. GLM-5.1: 71.8%
- 7. Gemini 3.1 Pro: 69.2%
- 8. DeepSeek-V4-Flash-Max: 69.0%
- 9. GLM-5: 67.8%
- 10. GPT-5.4: 67.2%
- 11. Qwen3.6-35B-A3B: 62.8%
- 12. Claude Opus 4.6: 62.7%
- 13. Claude Opus 4.5: 62.3%
- 14. Claude Sonnet 4.6: 61.3%
- 15. GPT-5.2: 60.6%
- 16. GPT-5.4 mini: 57.7%
- 17. Gemini 3 Flash: 57.4%
- 18. GPT-5.4 nano: 56.1%