rl-research/DR-Tulu-8B

DR-Tulu-8B is an 8-billion-parameter deep research agent developed by rl-research, built on the DR-Tulu-SFT-8B base model. It was further trained with Reinforcement Learning (RL) for tool use within the dr-agent-lib framework. It is designed for complex research-oriented tasks and outperforms its SFT counterpart and other 8B models on benchmarks such as SQAv2, HealthBench, and DeepResearch Bench.

Model Size: 8B

Quant: FP8

Ctx length: 32768

License: apache-2.0
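
As a rough sizing aid, the figures above can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only: it assumes the nominal 8B parameter count, one byte per weight for FP8, and that prompt, tool outputs, and generated tokens all share the 32768-token window. The helper names `weight_memory_gib` and `max_new_tokens` are hypothetical and not part of dr-agent-lib or any other library. It ignores KV cache, activations, and runtime overhead.

```python
# Values copied from the listing above.
PARAMS = 8_000_000_000       # "Model Size: 8B" (nominal, not an exact count)
BYTES_PER_PARAM_FP8 = 1      # FP8 stores each weight in 8 bits = 1 byte
CTX_LENGTH = 32_768          # "Ctx length: 32768"


def weight_memory_gib(params: int = PARAMS,
                      bytes_per_param: int = BYTES_PER_PARAM_FP8) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return params * bytes_per_param / 2**30


def max_new_tokens(prompt_tokens: int, ctx_length: int = CTX_LENGTH) -> int:
    """Tokens left in the context window after the prompt (hypothetical helper)."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(ctx_length - prompt_tokens, 0)


if __name__ == "__main__":
    print(f"Approx. FP8 weight memory: {weight_memory_gib():.2f} GiB")
    print(f"Room left after a 30,000-token prompt: {max_new_tokens(30_000)} tokens")
```

Under these assumptions the weights alone come to roughly 7.45 GiB, and a 30,000-token prompt (including any accumulated tool output) leaves 2,768 tokens for generation.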
