AALF/gemma-2-27b-it-SimPO-37K-100steps

AALF/gemma-2-27b-it-SimPO-37K-100steps is a 27-billion-parameter instruction-tuned model based on Google's Gemma-2 architecture. It was fine-tuned with the SimPO framework on a curated set of 37,040 preference pairs derived from UltraFeedback, with responses scored by ArmoRM-Llama3-8B-v0.1. The model is optimized to produce high-quality, preferred responses, achieving a 77.09% win rate on AlpacaEval 2.0.
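For intuition, SimPO trains on preference pairs without a reference model, using the length-normalized log-likelihood of each response as an implicit reward and enforcing a target margin between the chosen and rejected response. A minimal sketch of the per-pair objective (the `beta` and `gamma` values here are illustrative, not the ones used for this checkpoint):

```python
import math

def simpo_loss(sum_logp_w, len_w, sum_logp_l, len_l, beta=10.0, gamma=0.5):
    """SimPO objective for a single preference pair (sketch only).

    sum_logp_w / len_w: summed token log-probs and length of the chosen response
    sum_logp_l / len_l: same for the rejected response
    beta, gamma: reward scale and target reward margin (hypothetical values)
    """
    reward_w = beta * sum_logp_w / len_w      # length-normalized implicit reward, chosen
    reward_l = beta * sum_logp_l / len_l      # length-normalized implicit reward, rejected
    margin = reward_w - reward_l - gamma      # margin the model is pushed to exceed
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# The loss shrinks as the chosen response becomes relatively more likely:
print(simpo_loss(-10.0, 10, -30.0, 10))  # large margin -> small loss
print(simpo_loss(-25.0, 10, -30.0, 10))  # small margin -> larger loss
```

Because the reward is normalized by response length, the objective does not favor longer responses the way raw sequence log-likelihood would.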

Visibility: Public
Parameters: 27B
Precision: FP8
Context length: 32768
License: gemma