Qwen/Qwen2.5-0.5B

Qwen2.5-0.5B is a 0.49-billion-parameter causal language model from the Qwen team, with a 32,768-token context length. This base model, part of the Qwen2.5 series, uses a transformer architecture with RoPE, SwiGLU, and RMSNorm. It offers significant improvements in knowledge, coding, mathematics, and multilingual support (covering 29 languages), and serves as a foundation for further fine-tuning.
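
As a base causal language model, it can be loaded for plain text completion. A minimal sketch using the Hugging Face `transformers` library (assumes `transformers` and `torch` are installed; the prompt is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"

# Load the tokenizer and model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Base models do continuation, not chat: give a plain text prompt
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy-ish generation of a short continuation
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that as a base model, it has no chat template applied; for instruction following, the Qwen2.5-0.5B-Instruct variant is the usual choice.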

Visibility: Public
Parameters: 0.5B
Tensor type: BF16
Context length: 32,768
License: apache-2.0