Qwen/Qwen2.5-0.5B

Qwen2.5-0.5B is a 0.49-billion-parameter causal language model from the Qwen team, with a 32,768-token context length. This base model, part of the Qwen2.5 series, uses a transformer architecture with RoPE, SwiGLU, and RMSNorm. It offers significant improvements in knowledge, coding, mathematics, and multilingual support (covering 29 languages), and serves as a foundation for further fine-tuning.
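
As a base causal language model, it can be loaded for plain text completion. A minimal sketch using the Hugging Face `transformers` library (assumes `transformers` and `torch` are installed; the prompt is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"

# Load the tokenizer and model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Base models do continuation, not chat: give a plain text prompt
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy-ish generation of a short continuation
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that as a base model, it has no chat template applied; for instruction following, the Qwen2.5-0.5B-Instruct variant is the usual choice.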

Visibility: Public
Parameters: 0.5B
Tensor type: BF16
Context length: 32,768
License: apache-2.0