deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Qwen-32B is a 32.8-billion-parameter language model developed by DeepSeek-AI, distilled from the larger DeepSeek-R1 model and built on the Qwen2.5 architecture. It is fine-tuned on reasoning data generated by DeepSeek-R1 and excels at complex reasoning, mathematics, and coding tasks, with a context length of 131,072 tokens. The model demonstrates strong performance across a range of benchmarks, often outperforming substantially larger models thanks to its specialized distillation process.
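A minimal usage sketch, assuming the model is hosted on the Hugging Face Hub under the ID above and loaded with the `transformers` library; the prompt, dtype, and generation settings are illustrative, not official recommendations:

```python
# Minimal sketch: load the model and run a short reasoning prompt.
# Assumes a GPU setup with enough memory for a 32B model
# (adjust torch_dtype / device_map for your hardware).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory footprint
    device_map="auto",           # spread layers across available devices
)

# Build a chat-formatted prompt using the tokenizer's built-in template.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and print only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model is trained to emit long chain-of-thought reasoning before its final answer, a generous `max_new_tokens` budget is typically needed for reasoning-heavy prompts.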