hamishivi/OpenThinker3-1.5B-RLVE

OpenThinker3-1.5B-RLVE is a 1.5 billion parameter language model developed by hamishivi, fine-tuned from OpenThinker3 1.5B using Reinforcement Learning with Verifiable Environments (RLVE). This model demonstrates enhanced performance across various reasoning and problem-solving benchmarks, including AIME, OMEGA-500, OlympiadBench, and LiveCodeBench. It is specifically optimized for complex reasoning tasks and competitive programming challenges, showing significant improvements over its base model.

Cold

Public

Model Size: 1.5B

Quant: BF16

Ctx length: 131072

License: apache-2.0

Hugging Face

No reviews yet. Be the first to review!