princeton-nlp/Llama-3-Instruct-8B-CPO-v0.2
princeton-nlp/Llama-3-Instruct-8B-CPO-v0.2 is an 8-billion-parameter instruction-tuned causal language model released by Princeton NLP. It is fine-tuned from the Llama 3 8B Instruct model with CPO (Contrastive Preference Optimization), a preference-optimization objective that does not require a reference model, and was released as one of the baseline models in the SimPO (Simple Preference Optimization) project. The model is intended for general instruction-following tasks and supports the 8,192-token context length of the Llama 3 architecture.
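
The sketch below shows one way to run the model for instruction following with the Hugging Face transformers library. It assumes the checkpoint ships the standard Llama 3 chat template and a bfloat16-capable GPU; the prompt and generation settings are illustrative, not a recommended configuration from the model authors.

```python
# Minimal usage sketch (assumes the default Llama 3 chat template;
# sampling parameters here are illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-CPO-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt from a single user turn.
messages = [
    {"role": "user", "content": "Explain preference optimization in one paragraph."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate a response and decode only the newly generated tokens.
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```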