electroglyph/gemma-3-4b-it-unslop-GSPO
electroglyph/gemma-3-4b-it-unslop-GSPO is a 4.3-billion-parameter instruction-tuned variant of Google's Gemma-3-4b-it, fine-tuned with GSPO (Group Sequence Policy Optimization). It has a 32,768-token context length and is an experimental iteration of the 'unslop' fine-tuning approach, exploring how lower rank settings during GSPO training change the model's behavior compared to previous finetunes.
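For readers who want to try the model, a minimal loading sketch using the standard Hugging Face `transformers` API is shown below. This assumes the checkpoint is published on the Hub under the repository ID above and follows the usual Gemma chat template; the prompt text is purely illustrative.

```python
# Hypothetical usage sketch: loads the checkpoint from the Hugging Face Hub
# with the generic AutoModel/AutoTokenizer APIs and runs a single chat turn.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "electroglyph/gemma-3-4b-it-unslop-GSPO"  # repo ID from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-formatted prompt (prompt content is an example, not from the card).
messages = [{"role": "user", "content": "Write one sentence about autumn."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that generating from a 4.3B-parameter model is practical on a single consumer GPU; `device_map="auto"` lets `accelerate` place the weights automatically.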