yongchao98/R1-Code-Interpreter-14B

yongchao98/R1-Code-Interpreter-14B is a 14-billion-parameter model based on Qwen2.5, fine-tuned for step-by-step code reasoning through multi-turn supervised fine-tuning followed by reinforcement learning. It autonomously decides when and how to invoke code for reasoning and planning tasks, and exhibits emergent self-checking behavior via code generation. With a 32,768-token context length, it is suited to complex problem-solving that requires code interpretation.
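Using the model as a code interpreter requires a harness that extracts the code it generates and feeds execution results back into the conversation. The card does not document that harness, so the following is a minimal sketch of the extraction-and-execution step only; the example response text, the helper names, and the assumption that the model emits fenced ```python blocks are all illustrative, not part of the model's documented interface.

```python
import contextlib
import io
import re


def extract_code_blocks(text: str) -> list[str]:
    """Return the contents of all fenced ```python code blocks in a model response."""
    return re.findall(r"```(?:python)?\n(.*?)```", text, re.DOTALL)


def run_block(code: str) -> str:
    """Execute one code block and capture its stdout.

    No sandboxing here -- a real harness should isolate execution.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()


# Hypothetical model response containing a self-check via code generation.
response = (
    "To verify, I will run a quick check:\n"
    "```python\nprint(sum(range(10)))\n```\n"
)

blocks = extract_code_blocks(response)
result = run_block(blocks[0]).strip()
print(result)  # -> 45
```

In a full loop, `result` would be appended to the chat history as a tool/observation turn and the model would be queried again, repeating until it produces a final answer with no code block.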

Status: Warm
Visibility: Public
Parameters: 14B
Precision: FP8
Context length: 32,768 tokens
License: MIT
Source: Hugging Face