simplescaling/s1-32B

simplescaling/s1-32B is a 32-billion-parameter reasoning model fine-tuned from Qwen2.5-32B-Instruct by simplescaling. It is notable for reasoning performance reported to match o1-preview despite being fine-tuned on only 1,000 examples. The model demonstrates test-time scaling through a technique called budget forcing, which controls how much computation is spent on reasoning at inference time, making it suited to complex problem-solving tasks.
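To make the idea of budget forcing concrete, here is a minimal sketch of the control loop as it is commonly described: cap the reasoning at a maximum number of tokens, and when the model tries to stop too early, suppress the stop and append "Wait" so it keeps reasoning. This is an illustration under stated assumptions, not the s1 implementation. The delimiter string, the function names `budget_force` and `toy_step`, and the toy stand-in for a model are all hypothetical.

```python
from typing import Callable, List

# Hypothetical end-of-thinking delimiter; the real model uses its own token.
END_OF_THINKING = "<end_of_thinking>"
WAIT = "Wait"


def budget_force(
    step: Callable[[List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Collect thinking tokens under a minimum and maximum budget.

    `step` is a stand-in for one decoding step: it takes the context so far
    and returns the next token.
    """
    thinking: List[str] = []
    while len(thinking) < max_tokens:
        token = step(prompt + thinking)
        if token == END_OF_THINKING:
            if len(thinking) >= min_tokens:
                break  # the model stopped and the minimum budget is met
            # Too early: suppress the stop and nudge the model to continue.
            thinking.append(WAIT)
            continue
        thinking.append(token)
    # Close the thinking phase, whether the model stopped or the cap was hit.
    return thinking + [END_OF_THINKING]


def toy_step(context: List[str]) -> str:
    """Toy 'model': tries to stop after two fresh tokens since the prompt or last Wait."""
    fresh = 0
    for tok in reversed(context):
        if tok in (WAIT, "Q"):
            break
        fresh += 1
    return END_OF_THINKING if fresh >= 2 else f"t{len(context)}"


if __name__ == "__main__":
    # Extending: the toy model wants to stop after 2 tokens, but 5 are required.
    print(budget_force(toy_step, ["Q"], min_tokens=5, max_tokens=8))
    # Truncating: reasoning is cut off after a single token.
    print(budget_force(toy_step, ["Q"], min_tokens=0, max_tokens=1))
```

With a real model, `step` would be replaced by an actual decoding call, and the budgets would be chosen per task. Raising the minimum budget spends more test-time compute on a problem, and lowering the maximum caps it.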

Status: Warm
Visibility: Public
Parameters: 32B
Quantization: FP8
Context length: 32768 tokens
License: apache-2.0
Hugging Face
