TachyHealth/Gazal-R1-32B-GRPO-preview
Gazal-R1-32B-GRPO-preview is a 32.8-billion-parameter causal language model developed by TachyHealth and built on Qwen 3 32B. It is fine-tuned for medical reasoning and clinical decision-making using a two-stage training pipeline that includes Group Relative Policy Optimization (GRPO). The model targets diagnostic reasoning, treatment planning, and prognostic assessment, with state-of-the-art performance reported on medical benchmarks such as MedQA and MMLU Pro (Medical).
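For readers unfamiliar with GRPO, the core idea can be sketched roughly as follows: several answers are sampled for the same prompt, each is scored, and every score is normalized against its own group instead of against a learned value model. The snippet is a minimal illustration only, not TachyHealth's training code; the function name, the reward values, and the choice of population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar score per answer sampled for a single prompt.
    `eps` avoids division by zero when every answer receives the same score.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for four sampled answers to one medical question
# (1.0 = judged correct, 0.0 = judged incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Answers that score above their group's average get a positive advantage and are reinforced; those below get a negative one. When all answers in a group score the same, every advantage is zero and that group contributes no learning signal.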