princeton-nlp/Llama-3-Base-8B-SFT-KTO
princeton-nlp/Llama-3-Base-8B-SFT-KTO is an 8-billion-parameter language model from princeton-nlp, built on the Llama-3 architecture. As the name indicates, it starts from a supervised fine-tuned (SFT) checkpoint and is then aligned with KTO (Kahneman-Tversky Optimization), a preference optimization method that learns from binary desirable/undesirable feedback on individual responses rather than from paired preference data. This makes it suited to alignment settings where only unpaired human feedback is available, offering a distinct alternative to pairwise methods such as DPO.
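
For context, KTO optimizes a prospect-theoretic objective rather than a pairwise ranking loss. The sketch below roughly follows the notation of the KTO paper (Ethayarajh et al., 2024); details such as the exact batch estimator for the reference point z_0 are simplified here, so treat it as illustrative rather than the exact training objective used for this checkpoint.

```latex
% Rough sketch of the KTO objective (simplified; see the KTO paper for details).
\[
r_\theta(x, y) = \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
z_0 \approx \mathrm{KL}\!\left(\pi_\theta \,\|\, \pi_{\mathrm{ref}}\right)
\]
\[
v(x, y) =
\begin{cases}
\lambda_D \, \sigma\!\bigl(\beta \, (r_\theta(x, y) - z_0)\bigr) & \text{if } y \text{ is desirable} \\
\lambda_U \, \sigma\!\bigl(\beta \, (z_0 - r_\theta(x, y))\bigr) & \text{if } y \text{ is undesirable}
\end{cases}
\]
\[
\mathcal{L}_{\mathrm{KTO}}(\pi_\theta; \pi_{\mathrm{ref}})
= \mathbb{E}_{(x, y) \sim \mathcal{D}}\bigl[\lambda_y - v(x, y)\bigr]
\]
```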
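
A minimal usage sketch with the Hugging Face transformers library. It assumes transformers, torch, and accelerate are installed and a GPU with enough memory for an 8B model in bfloat16 is available; the prompt and generation settings are purely illustrative, and no particular chat template is assumed.

```python
# Minimal usage sketch (assumes transformers, torch, and accelerate are installed
# and a GPU with enough memory for an 8B model in bfloat16 is available).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Llama-3-Base-8B-SFT-KTO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 8B weights on one GPU
    device_map="auto",           # let accelerate place the weights automatically
)

# Illustrative prompt; adapt the formatting (e.g. a chat template) to your use case.
prompt = "Explain the difference between paired and unpaired preference data."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```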