hamishivi/OpenThinker3-1.5B-RLVE

OpenThinker3-1.5B-RLVE is a 1.5-billion-parameter language model developed by hamishivi, fine-tuned from OpenThinker3-1.5B using Reinforcement Learning with Verifiable Environments (RLVE). It improves on its base model across reasoning and problem-solving benchmarks, including AIME, OMEGA-500, OlympiadBench, and LiveCodeBench, and is optimized for complex reasoning tasks and competitive programming.

Cold
Public
Parameters: 1.5B
Precision: BF16
Context length: 131072
License: apache-2.0
Hugging Face
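
As a rough sizing aid, the listed figures can be turned into a back-of-the-envelope estimate of the memory needed for the weights alone. This is only a sketch under stated assumptions: it uses the nominal 1.5B parameter count (the exact count may differ), assumes every weight is stored in BF16 at 2 bytes each, and reads 131072 as the context length in tokens. It ignores activations, the KV cache, and framework overhead, so real usage will be higher. The helper name `weight_memory_bytes` is hypothetical.

```python
# Nominal figures from the listing; treated as assumptions, not exact values.
NOMINAL_PARAMS = 1.5e9      # "1.5B" (the exact parameter count may differ)
BYTES_PER_PARAM_BF16 = 2    # bfloat16 stores each value in 16 bits
CONTEXT_TOKENS = 131072     # assumed to be the context length in tokens


def weight_memory_bytes(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone (hypothetical helper)."""
    return num_params * bytes_per_param


weights_bytes = weight_memory_bytes(NOMINAL_PARAMS, BYTES_PER_PARAM_BF16)
weights_gb = weights_bytes / 1e9      # decimal gigabytes
weights_gib = weights_bytes / 2**30   # binary gibibytes

print(f"Weights: about {weights_gb:.1f} GB ({weights_gib:.2f} GiB)")
print(f"Context window: {CONTEXT_TOKENS} tokens = {CONTEXT_TOKENS // 1024}K")
```

Under these assumptions the weights occupy about 3 GB, and 131072 tokens corresponds to a 128K context window.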
