simplescaling/s1-32B
simplescaling/s1-32B is a 32-billion-parameter reasoning model from simplescaling, fine-tuned from Qwen2.5-32B-Instruct. It is notable for reasoning performance reported to match o1-preview despite being fine-tuned on only 1,000 examples. The model demonstrates test-time scaling through a technique called budget forcing, which controls how long the model reasons before it answers, making it suited to complex problem-solving tasks.
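As a rough illustration of budget forcing, the sketch below shows the general idea at the decoding loop: if the model tries to stop thinking before a minimum budget is reached, the stop is suppressed and "Wait" is appended to nudge it onward; once a maximum budget is reached, thinking is cut off. This is a minimal sketch and not the model's actual inference code. The names `budget_force`, `next_token`, and the `END_OF_THINKING` marker are hypothetical, and a toy function stands in for the real model.

```python
from typing import Callable, List

# Hypothetical end-of-thinking marker; the real delimiter depends on the model's template.
END_OF_THINKING = "<end_of_thinking>"


def budget_force(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Return a reasoning trace whose length is kept within [min_tokens, max_tokens].

    `next_token` is an assumed stand-in for one decoding step of a model:
    it takes the tokens so far and returns the next token.
    """
    trace: List[str] = []
    while len(trace) < max_tokens:
        token = next_token(prompt + trace)
        if token == END_OF_THINKING:
            if len(trace) >= min_tokens:
                break  # the model stopped on its own, within budget
            token = "Wait"  # suppress the early stop and encourage more reasoning
        trace.append(token)
    # Close the thinking phase, whether the model stopped or the cap was hit.
    return trace + [END_OF_THINKING]


def toy_model(context: List[str]) -> str:
    """Toy stand-in: tries to stop whenever the context length is a multiple of 3."""
    return END_OF_THINKING if len(context) % 3 == 0 else "step"


if __name__ == "__main__":
    print(budget_force(toy_model, ["question"], min_tokens=4, max_tokens=10))
```

Raising `min_tokens` spends more compute on reasoning at inference time, while lowering `max_tokens` caps it. That trade-off is what test-time scaling refers to here.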