akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO

akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO is a 1.7-billion-parameter Qwen3-based language model developed by akshayballal. It was fine-tuned from unsloth/qwen3-1.7b-unsloth-bnb-4bit using Unsloth together with Hugging Face's TRL library, which enabled roughly 2x faster training. The model is optimized for biomedical text, making it suitable for applications that require understanding and generating content from sources such as PubMed.
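The card does not include a usage snippet; the following is a minimal, hypothetical sketch of loading the model with the standard Hugging Face transformers API. The prompt text, generation parameters, and function name are illustrative assumptions, not taken from the card:

```python
# Hypothetical usage sketch for akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO.
# Requires: pip install transformers accelerate (plus network access on first run).

MODEL_ID = "akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO"

def generate_answer(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a reply to a single user prompt."""
    # Imported inside the function so this file can be read without
    # transformers installed; move to module level in real code.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # load in the checkpoint's native precision
        device_map="auto",    # place layers on GPU/CPU automatically
    )
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Running `generate_answer("Summarize the role of metformin in type 2 diabetes.")` would download the checkpoint on first use and return the model's reply as a string.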

Visibility: Public
Model size: 2B params (1.7B per the model name)
Tensor type: BF16
Context length: 32,768 tokens
License: apache-2.0
