brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla
brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla is a 0.5-billion-parameter instruction-tuned language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct using GRPO (Group Relative Policy Optimization), a reinforcement-learning method designed to enhance mathematical reasoning in open language models. Its 131,072-token context length supports processing extensive inputs for complex problem-solving.
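A minimal quick-start sketch using the Hugging Face `transformers` library, assuming the library is installed and the checkpoint is available on the Hub. The `build_messages` helper and the system prompt wording are illustrative choices, not part of the model's documented interface.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "brebis/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-feathered_webbed_chinchilla"


def build_messages(problem: str) -> list:
    """Wrap a math problem in the chat format an instruct model expects.

    The system prompt here is a placeholder; adjust it to your task.
    """
    return [
        {"role": "system", "content": "You are a helpful math assistant."},
        {"role": "user", "content": problem},
    ]


def solve(problem: str, max_new_tokens: int = 256) -> str:
    """Generate an answer; downloads the checkpoint on first call."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    # apply_chat_template renders the messages into the model's prompt format
    # and appends the generation prompt for the assistant turn.
    inputs = tokenizer.apply_chat_template(
        build_messages(problem), add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated answer is returned.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `solve("What is 17 * 24?")` triggers the checkpoint download and runs generation; sampling parameters (temperature, top-p) can be passed through to `model.generate` as needed.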