dr-tulu  by rlresearch

Reinforcement learning agents for deep research tasks

Created 1 month ago
508 stars

Top 61.5% on SourcePulse

Summary

DR Tulu addresses the need for open-source models capable of long-form Deep Research (DR) tasks. It provides the DR Tulu-8B model, which is reported to match OpenAI's DR on long-form DR benchmarks. The project targets researchers and developers who want a high-performing, open alternative for complex, extended research tasks.

How It Works

The project comprises three core components:

  • agent/: a library (dr-agent-lib) featuring an MCP-based tool backend, high-concurrency asynchronous request management, and a flexible prompting interface.
  • rl/: code for training agents using GRPO and evolving rubrics, built upon Open-Instruct.
  • sft/: code for supervised fine-tuning, leveraging LLaMA-Factory.

This modular design supports both foundational training and agent development for deep research applications.
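To illustrate what "high-concurrency asynchronous request management" can look like in practice, here is a minimal sketch using only Python's standard asyncio module. It is not taken from dr-agent-lib: the names call_tool and run_bounded, the simulated latency, and the concurrency limit are all hypothetical, chosen only to show the general pattern of capping in-flight requests with a semaphore.

```python
import asyncio


async def call_tool(query: str) -> str:
    """Hypothetical stand-in for one tool request (e.g. a search call)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"result for {query}"


async def run_bounded(queries, max_concurrency=4):
    """Issue all queries concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query):
        # Each task waits here until one of the semaphore slots is free.
        async with semaphore:
            return await call_tool(query)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(q) for q in queries))


results = asyncio.run(run_bounded([f"q{i}" for i in range(10)]))
```

The semaphore keeps many requests in flight without overwhelming a rate-limited backend; the library's actual mechanism may differ, so consult the agent/ README for the real interface.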

Quick Start & Requirements

This is an initial code release with ongoing efforts to clean and expand instructions. Detailed setup and usage guides are available within each subdirectory's README files. Links to official resources include the Paper, Data & Models, Blogpost, Video, and a Static Demo. A live demo is forthcoming. Specific installation commands and prerequisites are not detailed in this overview.

Highlighted Details

  • DR Tulu-8B is presented as the first open Deep Research (DR) model specifically trained for long-form DR tasks.
  • The model achieves performance parity with OpenAI's DR on established long-form DR benchmarks.
  • The agent library incorporates an MCP-based tool backend and advanced asynchronous request management for efficient operation.
  • Training methodologies include Reinforcement Learning with GRPO and evolving rubrics, alongside supervised fine-tuning.
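As a rough illustration of the GRPO idea mentioned above, the sketch below computes group-relative advantages: several responses are sampled for the same prompt, each is scored, and each score is normalized against the group's mean and standard deviation. This is a generic sketch of that one step, not the project's training code. The function name, the example rewards, the use of population standard deviation, and the eps constant are assumptions; how the evolving rubrics produce the rewards is not shown.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    rewards: scores for several responses sampled from the same prompt.
    eps: small constant (an assumption) to avoid division by zero when
         all rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rubric scores for four sampled responses to one prompt.
advantages = group_relative_advantages([0.2, 0.5, 0.9, 0.4])
```

Responses scoring above the group mean receive positive advantages and those below receive negative ones, so no separate learned value model is needed for the baseline.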

Maintenance & Community

DR Tulu is a project from The Allen Institute for Artificial Intelligence (Ai2), developed in collaboration with researchers from the University of Washington, Carnegie Mellon University, and MIT. Direct contact is available via Rulin Shao, Akari Asai, Shannon Shen, and Hamish Ivison, with GitHub issues also supported. No specific community channels like Discord or Slack are listed in this README.

Licensing & Compatibility

The license type and any compatibility notes for commercial or closed-source use are not specified in the provided README content.

Limitations & Caveats

This is an initial code release, and work to refine and document the codebase is ongoing. The live demo is not yet available. Users should expect instructions and code to change as development progresses.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 6
  • Issues (30d): 0

Star History

51 stars in the last 30 days
