akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO

akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO is a 1.7-billion-parameter Qwen3-based language model developed by akshayballal. It was fine-tuned from unsloth/qwen3-1.7b-unsloth-bnb-4bit using Unsloth together with Hugging Face's TRL library, which enabled roughly 2x faster training. The model is optimized for biomedical text, making it suitable for applications that require understanding and generating content from sources such as PubMed.
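The card does not include a usage snippet; the following is a minimal, hypothetical sketch of loading the model with the standard Hugging Face transformers API. The prompt text, generation parameters, and function name are illustrative assumptions, not taken from the card:

```python
# Hypothetical usage sketch for akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO.
# Requires: pip install transformers accelerate (plus network access on first run).

MODEL_ID = "akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO"

def generate_answer(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a reply to a single user prompt."""
    # Imported inside the function so this file can be read without
    # transformers installed; move to module level in real code.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # load in the checkpoint's native precision
        device_map="auto",    # place layers on GPU/CPU automatically
    )
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Running `generate_answer("Summarize the role of metformin in type 2 diabetes.")` would download the checkpoint on first use and return the model's reply as a string.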

Visibility: Public
Model size: 2B params (1.7B per the model name)
Tensor type: BF16
Context length: 32,768 tokens
License: apache-2.0
