TheBloke/Llama-2-7B-Chat-fp16

TheBloke/Llama-2-7B-Chat-fp16 is a 7 billion parameter generative text model developed by Meta, fine-tuned for dialogue use cases. This model utilizes an optimized transformer architecture and is specifically designed for assistant-like chat in English. It outperforms many open-source chat models on various benchmarks and offers a 4096-token context length, making it suitable for interactive conversational AI applications.

Warm
Public
7B
FP8
4096
Hugging Face