DeepSeek R1 Zero
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning performance. Through RL, numerous powerful and interesting reasoning behaviors emerged naturally in DeepSeek-R1-Zero. However, the model suffers from endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, …
Use cases
API inference · Fine-tuning · Benchmarking
AI models used
DeepSeek R1 Zero
FAQ
How much does DeepSeek R1 Zero cost?
Free tier
Does DeepSeek R1 Zero have a free plan?
Limited or no free tier
Is there an API?
Yes