agentica-org/DeepScaleR-1.5B-Preview

DeepScaleR-1.5B-Preview is a 1.5-billion-parameter language model from agentica-org, fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B using distributed reinforcement learning. It is optimized for mathematical reasoning and problem solving, reaching 43.1% Pass@1 accuracy on AIME 2024. On mathematical benchmarks it surpasses much larger models, including OpenAI's o1-preview, despite its far smaller parameter count.

Status: Warm
Visibility: Public
Parameters: 1.5B
Precision: BF16
Context length: 131,072 tokens
License: MIT
Source: Hugging Face
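
Since the checkpoint is published on Hugging Face, it can be loaded with the standard transformers API. The snippet below is a minimal sketch, not an official usage example: it assumes the transformers and torch packages are installed, that the model inherits a chat template from its Qwen base, and the math prompt is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "agentica-org/DeepScaleR-1.5B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is distributed in BF16
    device_map="auto",
)

# Illustrative math prompt; any reasoning problem works here.
messages = [{"role": "user", "content": "What is the sum of the first 50 positive odd integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Reasoning models of this kind emit a long chain of thought before the final answer, so a generous max_new_tokens budget is advisable.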
