nvidia/Qwen3-Nemotron-14B-BRRM

The nvidia/Qwen3-Nemotron-14B-BRRM is a 14 billion parameter Branch-and-Rethink Reasoning Reward Model developed by NVIDIA. This model implements a novel two-turn reasoning framework for evaluating LLM-generated responses, performing adaptive branching and branch-conditioned rethinking to focus on critical evaluation dimensions. It achieves state-of-the-art performance on major reward modeling benchmarks, making it suitable for integrating into RLHF pipelines to improve LLM response quality.

Warm

Public

Model Size: 14B

Quant: FP8

Ctx length: 32768

License: other

Hugging Face

No reviews yet. Be the first to review!