Sarvam-105B

Sarvam-105B is Sarvam AI's flagship open-source Mixture-of-Experts (MoE) model, built for complex reasoning, coding, and agentic workflows. It uses 128 sparse experts and Multi-head Latent Attention (MLA) for efficient long-context inference, and was pre-trained on 12 trillion tokens spanning code, mathematics, multilingual, and web data.
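
To illustrate what "sparse experts" means in practice, here is a minimal sketch of generic top-k softmax routing over 128 experts. It is not Sarvam-105B's actual router: the number of experts active per token is not stated here, so `TOP_K = 8` is an assumed value, and `route_token` is a hypothetical helper name used only for illustration.

```python
import math
import random

NUM_EXPERTS = 128  # expert count from the model description
TOP_K = 8          # assumption: the active-expert count is not stated in the source


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    if top_k > len(router_logits):
        raise ValueError("top_k cannot exceed the number of experts")
    # Indices of the top_k largest router logits.
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    # Softmax over only the chosen logits, so the mixing weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Random logits stand in for a learned router's output for one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)
print(len(assignment), "of", NUM_EXPERTS, "experts active for this token")
```

Under these assumptions, each token runs through only `TOP_K` expert networks rather than all 128, which is where the inference savings of a sparse MoE layer come from.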

Context

Benchmarks