context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16

The Llama 3.2-3B-Instruct-FP16 model, developed by Meta, is a 3.21-billion-parameter instruction-tuned multilingual large language model with a 32,768-token context length. Optimized for multilingual dialogue, it is well suited to agentic retrieval, summarization, and chat applications. The model uses an optimized transformer architecture with Grouped-Query Attention (GQA) and is fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) for helpfulness and safety, outperforming many open-source and closed chat models on common industry benchmarks.
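The sketch below shows one way to run the model for chat-style generation with the Hugging Face `transformers` library. It assumes the repository id `context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16`, a recent `transformers` release with chat-template support, and a PyTorch backend; dtype and device placement are illustrative choices, not requirements.

```python
# Minimal usage sketch, assuming `transformers` and PyTorch are installed and the
# model is available under the repo id below (an assumption from this listing).
import torch
from transformers import pipeline

model_id = "context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16"

# Build a text-generation pipeline; bfloat16 and automatic device placement
# are illustrative defaults for a 3B-parameter model.
chat = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Summarize what Grouped-Query Attention does in two sentences."},
]

# The pipeline applies the model's chat template to the message list before generating.
output = chat(messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])
```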

Status: Warm
Visibility: Public
Parameters: 3.2B
Tensor type: BF16
Context length: 32768 tokens
License: llama3.2
Hosted on: Hugging Face