rl-research/DR-Tulu-SFT-8B

DR-Tulu-SFT-8B by rl-research is an 8 billion parameter SFT (Supervised Fine-Tuning) checkpoint of DR Tulu, an open deep research agent built on Qwen3-8B. This model is specifically trained for tool-use within the dr-agent-lib framework, excelling in complex research-oriented question answering and information retrieval tasks. It demonstrates significant performance improvements over its base model in benchmarks like SQAv2, HealthBench, and DeepResearch Bench, making it suitable for advanced research applications requiring agentic capabilities.

Cold
Public
8B
FP8
32768
License: apache-2.0
Hugging Face

No reviews yet. Be the first to review!