DeepSeek-V4-Pro-Max

DeepSeek-V4-Pro-Max is the maximum-reasoning-effort mode of DeepSeek-V4-Pro, a 1.6T-parameter mixture-of-experts (MoE) model with 49B activated parameters and a 1M-token context window. It introduces a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to improve long-context efficiency: at a 1M-token context it requires only 27% of the single-token inference FLOPs and 10% of the KV cache of DeepSeek-V3.2. The model also incorporates Manifol
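To put the quoted figures in perspective, the following is a minimal arithmetic sketch in Python. It uses only the numbers stated in the description (1.6T total parameters, 49B activated, 27% of FLOPs, 10% of KV cache relative to DeepSeek-V3.2 at 1M tokens). The constant and function names are hypothetical and chosen for illustration; nothing here reflects an official API or measurement method.

```python
# Figures quoted in the description above; names are illustrative only.
TOTAL_PARAMS = 1.6e12      # 1.6T total parameters
ACTIVATED_PARAMS = 49e9    # 49B activated parameters
FLOPS_RATIO = 0.27         # single-token inference FLOPs vs. DeepSeek-V3.2 at 1M tokens
KV_CACHE_RATIO = 0.10      # KV cache size vs. DeepSeek-V3.2 at 1M tokens


def summarize_efficiency():
    """Derive simple ratios from the quoted figures (no new data)."""
    return {
        # Share of the total parameters counted as activated.
        "activated_fraction": ACTIVATED_PARAMS / TOTAL_PARAMS,
        # How many times fewer FLOPs than the baseline.
        "flops_reduction_factor": 1.0 / FLOPS_RATIO,
        # How many times smaller the KV cache is than the baseline.
        "kv_cache_reduction_factor": 1.0 / KV_CACHE_RATIO,
    }


if __name__ == "__main__":
    for name, value in summarize_efficiency().items():
        print(f"{name}: {value:.4f}")
```

Read this way, roughly 3% of the model's parameters are activated, and the quoted ratios correspond to about 3.7 times fewer single-token inference FLOPs and a 10 times smaller KV cache than DeepSeek-V3.2 at a 1M-token context.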

LLMs
Free tier

Intelligence

Popularity: 49/100
Monthly visits
Growth
Updated: 2026-05-21

Features

GPQA
MMLU
MMLU-Pro
AIME 2025
MATH
HumanEval

Pros

Cons

Use cases

API inference · Fine-tuning · Benchmarking

AI models used

DeepSeek-V4-Pro-Max

FAQ

How much does DeepSeek-V4-Pro-Max cost?

Free tier

Does DeepSeek-V4-Pro-Max have a free plan?

Limited or no free tier

Is there an API?

Yes