brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla
brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla is a 0.5-billion-parameter instruction-tuned language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct using GRPO (Group Relative Policy Optimization), a reinforcement-learning method designed to enhance mathematical reasoning in open language models. Its 131,072-token context length supports processing extensive inputs for complex problem-solving.
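A minimal quick-start sketch using the Hugging Face `transformers` library, assuming the library is installed and the checkpoint is available on the Hub. The `build_messages` helper and the system prompt wording are illustrative choices, not part of the model's documented interface.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla"


def build_messages(problem: str) -> list:
    """Wrap a math problem in the chat format an instruct model expects.

    The system prompt here is a placeholder; adjust it to your task.
    """
    return [
        {"role": "system", "content": "You are a helpful math assistant."},
        {"role": "user", "content": problem},
    ]


def solve(problem: str, max_new_tokens: int = 256) -> str:
    """Generate an answer; downloads the checkpoint on first call."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    # apply_chat_template renders the messages into the model's prompt format
    # and appends the generation prompt for the assistant turn.
    inputs = tokenizer.apply_chat_template(
        build_messages(problem), add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated answer is returned.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `solve("What is 17 * 24?")` triggers the checkpoint download and runs generation; sampling parameters (temperature, top-p) can be passed through to `model.generate` as needed.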