KAKA22/CodeRM-8B

KAKA22/CodeRM-8B is an 8-billion-parameter instruction-tuned model fine-tuned from Llama3.1-8B-Instruct, designed to generate high-quality Python unit tests. It was trained on 60,000 synthetic unit tests derived from the CodeFeedback-Filtered-Instruction and TACO datasets. In code reward modeling tasks, where generated unit tests are used to score candidate code solutions, it performs comparably to much larger models such as Llama3.1-70B-Instruct.
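A minimal sketch of the reward-modeling workflow described above: model-generated unit tests score competing candidate solutions, and the highest-scoring candidate is selected. The candidate functions and test strings here are illustrative stand-ins, not actual CodeRM-8B output.

```python
def run_tests(solution_src: str, tests: list[str]) -> int:
    """Count how many generated unit tests a candidate solution passes."""
    passed = 0
    for test_src in tests:
        scope = {}
        try:
            exec(solution_src, scope)  # define the candidate function
            exec(test_src, scope)      # run one generated assertion
            passed += 1
        except Exception:
            pass  # a failing or crashing test contributes no reward
    return passed

# Two candidate solutions for "return the maximum of a list" (second is buggy).
candidates = [
    "def list_max(xs):\n    return sorted(xs)[-1]",
    "def list_max(xs):\n    return xs[0]",  # bug: ignores later elements
]

# Stand-ins for unit tests the model might generate for this task.
generated_tests = [
    "assert list_max([1, 3, 2]) == 3",
    "assert list_max([-5, -1]) == -1",
    "assert list_max([7]) == 7",
]

# Best-of-n selection: keep the candidate that passes the most tests.
scores = [run_tests(c, generated_tests) for c in candidates]
best = candidates[scores.index(max(scores))]
```

Here the correct candidate passes all three tests while the buggy one passes only the single-element case, so best-of-n selection recovers the correct solution.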

- Status: Warm
- Visibility: Public
- Parameters: 8B
- Quantization: FP8
- Context length: 32768
- License: apache-2.0
- Source: Hugging Face