rl-research/DR-Tulu-8B
DR-Tulu-8B is an 8-billion-parameter deep research agent developed by rl-research and built on the DR-Tulu-SFT-8B base model. It has been trained with Reinforcement Learning (RL) specifically for tool use within the dr-agent-lib framework. The model targets complex, research-oriented tasks and outperforms both its SFT counterpart and other 8B models on benchmarks such as SQAv2, HealthBench, and DeepResearch Bench.