rstar2-reproduce/rStar2-Agent-14B

rstar2-reproduce/rStar2-Agent-14B is a 14-billion-parameter math reasoning model trained with agentic reinforcement learning; it is reported to reach performance comparable to far larger models such as the 671B DeepSeek-R1. Developed as part of the rStar2-Agent research, it plans, reasons, and autonomously uses coding tools to explore candidate solutions, verify intermediate results, and reflect on them when solving complex problems. Its primary use case is advanced mathematical reasoning and problem-solving.
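The "uses coding tools" behaviour described above can be pictured as a loop in which the model proposes code, a tool runs it, and the output is fed back for the next turn. The following is only a minimal sketch of that idea, not the model's actual tool-call protocol: the `<code>`/`<output>` tags, the `solve` and `run_python` helpers, and the `stub_generate` stand-in for the model are all hypothetical names chosen for illustration.

```python
import contextlib
import io
import re

# Hypothetical convention: the model wraps code it wants executed in <code>...</code>.
CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_python(source: str) -> str:
    """Run a snippet and return what it printed (no sandboxing; illustration only)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})
    except Exception as exc:
        return f"error: {exc!r}"
    return buffer.getvalue().strip()


def solve(question: str, generate, max_turns: int = 4) -> str:
    """Alternate between model turns and tool executions until no code is requested."""
    transcript = question
    reply = ""
    for _ in range(max_turns):
        reply = generate(transcript)
        transcript += "\n" + reply
        match = CODE_RE.search(reply)
        if match is None:
            return reply  # no tool call: treat the reply as the final answer
        # Feed the tool result back so the next turn can verify and reflect on it.
        transcript += "\n<output>" + run_python(match.group(1)) + "</output>"
    return reply


def stub_generate(transcript: str) -> str:
    """Stand-in for the model: first asks for a computation, then reads the result."""
    if "<output>" not in transcript:
        return "Let me check: <code>print(sum(range(1, 101)))</code>"
    result = re.findall(r"<output>(.*?)</output>", transcript, re.DOTALL)[-1]
    return f"The answer is {result}."


answer = solve("What is 1 + 2 + ... + 100?", stub_generate)
print(answer)
```

In a real deployment, `stub_generate` would be replaced by a call to the served model and `run_python` by an isolated execution environment, since running model-written code with `exec` in the host process is unsafe.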

Status: Warm
Visibility: Public
Parameters: 14B
Quantization: FP8
Context length: 32768 tokens
License: MIT
Source: Hugging Face
