nvidia/Qwen3-Nemotron-32B-GenRM-Principle
nvidia/Qwen3-Nemotron-32B-GenRM-Principle is a 32-billion-parameter large language model built on the Qwen3 foundation and fine-tuned as a Generative Reward Model. Given a user-specified principle, it assigns a reward score indicating how well an LLM-generated response satisfies that principle. The model achieves top performance on both JudgeBench (81.4%) and RM-Bench (86.2%), making it well suited for evaluating LLM response quality against defined criteria.