Magistral Medium

Trained solely with reinforcement learning on top of Mistral Medium 3, Magistral Medium is a reasoning model that achieves strong performance on complex math and code tasks without relying on distillation from existing reasoning models. Training uses a reinforcement learning with verifiable rewards (RLVR) framework built on a modified version of Group Relative Policy Optimization (GRPO), which improves reasoning ability and multilingual consistency.
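As a rough illustration of the idea behind GRPO with verifiable rewards, the sketch below scores a group of sampled answers against a reference and normalizes the rewards within that group. It shows the textbook group-relative advantage only, not Magistral's modified variant, whose details are not given here. The function names, the exact-match verifier, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def verifiable_reward(answer: str, reference: str) -> float:
    """Hypothetical verifier: 1.0 if the answer matches the reference exactly, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Textbook GRPO-style advantage: center and scale rewards within one sampled group.

    Assumption: this is the unmodified formulation, not the variant used for Magistral.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative group of completions sampled for one prompt (made-up data).
samples = ["42", "41", "42 ", "7"]
rewards = [verifiable_reward(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)
```

Answers that beat the group average receive a positive advantage and the rest a negative one, so no separate learned value model is needed to provide a baseline. When every answer in a group earns the same reward, all advantages are zero and that group contributes no learning signal.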

Context —

Benchmarks

GPQA
MMLU
MMLU-Pro
AIME 2025
MATH
HumanEval
MMMU
LiveCodeBench
SWE-Bench Verified
