Kimi K2 Base

Kimi K2 base model is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained on 15.5 trillion tokens with the MuonClip optimizer, this is the foundation model before instruction tuning. It demonstrates strong performance on knowledge, reasoning, and coding benchmarks while being optimized for agentic capabilities.

Context

Benchmarks