unsloth/DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Qwen-32B, developed by DeepSeek AI, is a 32-billion-parameter language model based on the Qwen2.5 architecture and distilled from the larger DeepSeek-R1 reasoning model. It is optimized for multi-step reasoning, mathematics, and coding tasks, and performs strongly across a range of benchmarks. Distillation transfers the reasoning behavior of the larger model into a more compact one, which makes this model a practical choice for applications that need strong reasoning from a smaller model.

Cold

Public

Model Size: 32B

Quantization: FP8

Context length: 32,768 tokens

License: apache-2.0
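
As a rough illustration of what the size and quantization figures above imply for hardware, the sketch below estimates weight storage alone. It assumes a nominal 32e9 parameters and 1 byte per parameter for FP8; the FP16 and INT4 rows are included only for comparison and are not offered by this listing. The helper `weight_memory_gb` is a hypothetical name, and the estimate ignores the KV cache, activations, and runtime overhead, so real memory use will be higher.

```python
# Approximate bytes per parameter for common weight formats (weights only).
# Only FP8 comes from the card; the other formats are shown for comparison.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, quant: str) -> float:
    """Estimate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and framework overhead.
    """
    return num_params * BYTES_PER_PARAM[quant] / 1e9


# Nominal "32B" from the card; the exact parameter count may differ.
NUM_PARAMS = 32e9

if __name__ == "__main__":
    for quant in BYTES_PER_PARAM:
        print(f"{quant}: ~{weight_memory_gb(NUM_PARAMS, quant):.0f} GB")
```

Under these assumptions, the FP8 weights come to roughly 32 GB, about half of what the same model would need in FP16.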
