princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2

princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2 is an 8-billion-parameter instruction-tuned language model from princeton-nlp, built on the Llama 3 architecture. It is aligned with KTO (Kahneman-Tversky Optimization), a preference-optimization method that learns from unpaired binary feedback rather than paired preference data, and was released as a baseline in the SimPO preprint ("Simple Preference Optimization with a Reference-Free Reward"). The model targets general instruction-following tasks and supports an 8192-token context length.

Status: Warm
Visibility: Public
Parameters: 8B
Precision: FP8
Context length: 8192 tokens
Source: Hugging Face
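
Because the checkpoint is a standard Llama 3 Instruct model hosted on the Hugging Face Hub, it can be loaded with the transformers library. The snippet below is a minimal sketch, assuming bf16 weights on a GPU; the prompt and generation parameters are illustrative only (the FP8 figure above refers to the hosted deployment, not these locally loaded weights).

```python
# Minimal sketch: chat-style inference with transformers (assumes a CUDA-capable GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-KTO-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 local inference; adjust to your hardware
    device_map="auto",
)

# Build a prompt using the Llama 3 Instruct chat template.
messages = [
    {"role": "user", "content": "Explain what preference alignment means in one paragraph."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response; total prompt + output must stay within the 8192-token context window.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```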
