princeton-nlp/Llama-3-Instruct-8B-IPO
Llama-3-Instruct-8B-IPO is an 8-billion-parameter instruction-tuned language model released by princeton-nlp. Starting from meta-llama/Meta-Llama-3-8B-Instruct, it is further fine-tuned with IPO (Identity Preference Optimization), a preference optimization method that regresses the gap between the policy-to-reference log-likelihood ratios of preferred and dispreferred responses toward a fixed margin, rather than optimizing a Bradley-Terry sigmoid objective as DPO does. The checkpoint was released as one of the baselines accompanying the SimPO project's comparison of preference optimization methods. It is designed for general instruction following and supports an 8192-token context length.
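For reference, below is a minimal sketch of the IPO training objective as it is commonly implemented in preference-optimization libraries (e.g., TRL's DPOTrainer with `loss_type="ipo"`). The function name, tensor arguments, and the `beta` default are illustrative assumptions, not this checkpoint's exact training code:

```python
import torch

def ipo_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(y_w | x), typically length-averaged
    policy_rejected_logps: torch.Tensor,  # log p_theta(y_l | x), typically length-averaged
    ref_chosen_logps: torch.Tensor,       # log p_ref(y_w | x)
    ref_rejected_logps: torch.Tensor,     # log p_ref(y_l | x)
    beta: float = 0.1,                    # illustrative value, not this checkpoint's setting
) -> torch.Tensor:
    # Log-ratios of the policy against the frozen reference model.
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # IPO regresses the preference margin toward the constant 1 / (2 * beta)
    # instead of pushing it through a sigmoid as DPO does.
    margin = chosen_logratios - rejected_logratios
    return ((margin - 1.0 / (2.0 * beta)) ** 2).mean()
```

Note that, unlike SimPO, this objective keeps a frozen reference model in the loop: the margin is computed on reference-normalized log-likelihoods.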
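A minimal inference sketch with Hugging Face transformers, assuming the checkpoint ships the standard Llama 3 chat template (generation settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Instruct-8B-IPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a prompt with the model's chat template.
messages = [
    {"role": "user", "content": "Explain preference optimization in one paragraph."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```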