princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2

princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2 is an 8-billion-parameter instruction-tuned language model from princeton-nlp, built on the Llama 3 architecture. It is aligned with KTO (Kahneman-Tversky Optimization), a preference-optimization method that learns from unpaired binary feedback rather than paired preference data, and was released as a baseline in the SimPO preprint ("Simple Preference Optimization with a Reference-Free Reward"). The model targets general instruction-following tasks and supports an 8192-token context length.

Status: Warm
Visibility: Public
Parameters: 8B
Precision: FP8
Context length: 8192 tokens
Source: Hugging Face
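
Because the checkpoint is a standard Llama 3 Instruct model hosted on the Hugging Face Hub, it can be loaded with the transformers library. The snippet below is a minimal sketch, assuming bf16 weights on a GPU; the prompt and generation parameters are illustrative only (the FP8 figure above refers to the hosted deployment, not these locally loaded weights).

```python
# Minimal sketch: chat-style inference with transformers (assumes a CUDA-capable GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 local inference; adjust to your hardware
    device_map="auto",
)

# Build a prompt using the Llama 3 Instruct chat template.
messages = [
    {"role": "user", "content": "Explain what preference alignment means in one paragraph."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response; total prompt + output must stay within the 8192-token context window.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```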
