Qwen/Qwen2-0.5B
Qwen2-0.5B is a 0.5-billion-parameter base language model in the Qwen2 series from the Qwen team. The Transformer-based architecture uses the SwiGLU activation, attention QKV bias, and grouped-query attention, together with an improved tokenizer that adapts to multiple natural languages and code. It is a foundation model intended for further post-training (SFT, RLHF, or continued pretraining) rather than for direct text generation.
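As a quick sanity check that the checkpoint loads, here is a minimal sketch using the Hugging Face transformers library (assuming a recent version with Qwen2 support, roughly 4.37 or later). Since this is a base model rather than an instruction-tuned one, the example is plain next-token completion, not chat; the prompt string is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-0.5B"

# Load tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Base model: plain completion, no chat template applied.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For instruction following or chat, the corresponding post-trained variant (e.g. Qwen2-0.5B-Instruct) is the more appropriate starting point.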