ContextualAI/LMUnit-qwen2.5-72b
ContextualAI's LMUnit-qwen2.5-72b is a 72-billion-parameter language model, fine-tuned from Qwen2.5-72B and optimized for fine-grained evaluation with natural language unit tests. It takes a prompt, a response, and a unit test as input and produces a continuous score from 1 to 5 indicating how well the response satisfies the test criteria. It is designed for evaluating language model outputs, with leading performance reported across preference, direct scoring, and fine-grained unit test evaluation tasks.
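The input/output contract described above (prompt, response, and unit test in; a score between 1 and 5 out) can be illustrated with a small sketch. Everything named here is an assumption made for illustration: `EvalItem`, `format_item`, `mean_score`, and `dummy_scorer` are hypothetical helpers, the text layout in `format_item` is not the model's documented template, and `dummy_scorer` is a stand-in for a real model call. Averaging over several unit tests is one possible way to combine scores, not something the description prescribes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalItem:
    """One evaluation request: the three inputs the description lists."""
    prompt: str
    response: str
    unit_test: str


def format_item(item: EvalItem) -> str:
    # Hypothetical layout; the real model may expect a different template.
    return (
        f"Query: {item.prompt}\n\n"
        f"Response: {item.response}\n\n"
        f"Unit Test: {item.unit_test}"
    )


def mean_score(items: Sequence[EvalItem], scorer: Callable[[str], float]) -> float:
    """Average per-test scores, clamping each one to the 1-5 range."""
    if not items:
        raise ValueError("at least one unit test is required")
    scores = [min(5.0, max(1.0, scorer(format_item(it)))) for it in items]
    return sum(scores) / len(scores)


def dummy_scorer(text: str) -> float:
    # Stand-in for a real model call: a trivial keyword check, not LMUnit.
    return 5.0 if "Paris" in text else 1.0


prompt = "What is the capital of France?"
response = "The capital of France is Paris."
items = [
    EvalItem(prompt, response, "Does the response name the correct city?"),
    EvalItem(prompt, response, "Is the response a single sentence?"),
]
print(mean_score(items, dummy_scorer))  # prints 5.0 with the dummy scorer
```

In practice `dummy_scorer` would be replaced by a function that sends the formatted text to the model and parses the returned score; the clamping step only guards against a scorer that returns values outside the stated range.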