MiniMax M2.7

MiniMax M2.7 features model self-improvement aimed at driving productivity gains. It can independently build complex agent harnesses to accomplish highly complex productivity tasks. M2.7 performs well in real-world software engineering, including end-to-end project delivery, log analysis, code security, and ML tasks; on SWE-Pro it scores 56.22%, nearly matching Opus. In professional office domains it achieves the highest Elo among open-source models on GDPval-AA (1495), with significant improvement in complex editing for the Office Suite. M2.7 also maintains 97% skill adherence across 40 complex skill cases.

Context —

Benchmarks

GPQA
MMLU
MMLU-Pro
AIME 2025
MATH
HumanEval
MMMU
LiveCodeBench
SWE-Bench Verified
