unsloth/DeepSeek-R1-Distill-Qwen-32B

The DeepSeek-R1-Distill-Qwen-32B model, developed by DeepSeek AI, is a 32 billion parameter language model distilled from the larger DeepSeek-R1 reasoning model and based on the Qwen2.5 architecture. It is specifically optimized for complex reasoning, mathematical, and coding tasks, demonstrating strong performance across various benchmarks. This model leverages advanced distillation techniques to transfer the reasoning capabilities of a larger model into a more compact form, making it suitable for applications requiring high-level cognitive abilities.

Cold
Public
32B
FP8
32768
License: apache-2.0
Hugging Face