nvidia/Llama-3.3-Nemotron-70B-Feedback

The nvidia/Llama-3.3-Nemotron-70B-Feedback model is a 70 billion parameter large language model developed by NVIDIA, built upon the Meta-Llama-3.3-70B-Instruct foundation. It is specifically fine-tuned using Supervised Finetuning to provide feedback on the helpfulness of LLM-generated responses to user queries. This model is designed to improve performance in general-domain, open-ended tasks through Inference-Time-Scaling, particularly as part of a system for augmenting models on leaderboards like Arena Hard.

Warm

Public

Model Size: 70B

Quant: FP8

Ctx length: 32768

License: nvidia-open-model-license

Hugging Face