grimjim/gemma-3-12b-it-MPOAdd-v1

grimjim/gemma-3-12b-it-MPOAdd-v1 is a 12-billion-parameter instruction-tuned Gemma model derived from google/gemma-3-12b-it. It uses Magnitude-Preserving Orthogonal Addition (MPOAdd) to strengthen refusal behavior against perceived harms, so the model enforces safety concerns more strictly than the base model. MPOAdd adjusts the model's layers geometrically, amplifying the directional component associated with refusal while preserving layer norms, with minimal perplexity loss compared to the baseline.
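
The idea of amplifying one direction while keeping norms fixed can be illustrated with a minimal sketch. This is not the author's implementation: the helper names (`amplify_direction`, `amplify_rows`), the `scale` parameter, the toy weights, and the choice to operate row by row are all assumptions made for illustration. The sketch boosts each vector's component along a given direction and then rescales the vector back to its original length.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def amplify_direction(vec, direction, scale):
    """Boost vec's component along `direction`, then restore vec's original norm.

    Hypothetical helper: `scale` controls how much extra weight the
    directional component receives before renormalization.
    """
    d_norm = norm(direction)
    unit = [x / d_norm for x in direction]
    proj = dot(vec, unit)  # signed length of vec along the direction
    # The component along the direction becomes (1 + scale) * proj.
    boosted = [v + scale * proj * u for v, u in zip(vec, unit)]
    b_norm = norm(boosted)
    if b_norm == 0.0:
        return list(vec)  # nothing to rescale; leave the vector unchanged
    target = norm(vec)
    # Rescale so the result has the same magnitude as the input.
    return [x * target / b_norm for x in boosted]


def amplify_rows(matrix, direction, scale):
    """Apply the norm-preserving amplification to every row (assumed layout)."""
    return [amplify_direction(row, direction, scale) for row in matrix]


# Toy data, not real model weights.
weights = [[3.0, 4.0], [1.0, 0.0]]
refusal_dir = [0.0, 1.0]
adjusted = amplify_rows(weights, refusal_dir, scale=0.5)
```

In this toy case the first row keeps its length of 5 while its share along the chosen direction grows, and the second row, which has no component along that direction, is left unchanged.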

Public · Vision · 12B · FP8 · 32768 · License: gemma