arbor by Ziems

Framework for optimizing DSPy programs with RL

Created 8 months ago
264 stars

Top 96.7% on SourcePulse

Project Summary

Arbor is a framework designed to optimize DSPy programs using Reinforcement Learning (RL). It targets developers seeking to enhance the performance and efficiency of their language model programs by automating the fine-tuning process. The primary benefit is achieving superior program outputs through advanced RL techniques.

How It Works

Arbor employs Group Relative Policy Optimization (GRPO) to fine-tune the language models behind DSPy programs. It integrates with DSPy via a custom ArborProvider, allowing an RL loop to iteratively improve program prompts and model weights. A user-defined reward function guides the optimization, with the aim of discovering more effective program configurations than manual tuning or standard fine-tuning delivers. Parameter-efficient fine-tuning via LoRA is supported.
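The "group relative" part of GRPO can be illustrated with a minimal sketch: rather than training a separate value network as a baseline, each sampled rollout is scored against the mean reward of its own group. The helper below is illustrative only, not Arbor's implementation:

```python
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO's core idea: normalize each rollout's reward against its
    group's mean and standard deviation, instead of a learned baseline."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 1.0
    sigma = sigma or 1.0  # avoid division by zero when all rewards are equal
    return [(r - mu) / sigma for r in rewards]

# Rewards for four rollouts of the same prompt: successes get positive
# advantages, failures negative, and the advantages sum to zero.
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

These normalized advantages then weight the policy-gradient update toward higher-reward completions.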

Quick Start & Requirements

  • Installation: Install via uv pip install -U arbor-ai or pip install -U arbor-ai. For the latest DSPy features, install from source: uv pip install -U git+https://github.com/stanfordnlp/dspy.git@main.
  • Prerequisites: Requires a multi-GPU setup for training (e.g., 3-4 GPUs recommended). CUDA and nvcc must be installed. flash-attn can be optionally installed for accelerated inference, but its installation may take over 15 minutes.
  • Resources: Training is resource-intensive, necessitating significant GPU compute.
  • Community: Join the Arbor Discord or DSPy Discord for support and discussions.

Highlighted Details

  • Leverages GRPO for RL-based optimization of DSPy programs.
  • Supports LoRA for efficient fine-tuning of large language models.
  • Integrates flash-attn for potential inference speedups.
  • Provides a clear Python API for defining tasks, reward functions, and initiating the compilation/optimization process.
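The reward function the API expects can be as simple as an exact-match check. The sketch below is a toy illustration; the example/prediction shapes and the way the metric is passed to Arbor's compiler are assumptions, not Arbor's documented API:

```python
def exact_match_reward(example: dict, prediction: dict) -> float:
    """Toy reward: 1.0 if the predicted answer matches the gold answer
    (case- and whitespace-insensitive), else 0.0. GRPO uses such scalar
    rewards to score sampled rollouts of a DSPy program."""
    gold = example["answer"].strip().lower()
    pred = prediction["answer"].strip().lower()
    return 1.0 if pred == gold else 0.0
```

In practice a metric like this would be handed to the optimizer during compilation, which samples groups of program rollouts, scores each one, and updates the model toward higher-reward outputs.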

Maintenance & Community

The project acknowledges contributions from Will Brown's Verifiers library and the Hugging Face TRL library. Community support is available via dedicated Discord servers for Arbor and DSPy. No specific maintainer information, sponsorship details, or roadmap links are provided in the README.

The following research papers are cited as foundational work:

@article{ziems2025multi,
  title={Multi-module GRPO: Composing policy gradients and prompt optimization for language model programs},
  author={Ziems, Noah and Soylu, Dilara and Agrawal, Lakshya A and Miller, Isaac and Lai, Liheng and Qian, Chen and Song, Kaiqiang and Jiang, Meng and Klein, Dan and Zaharia, Matei and others},
  journal={arXiv preprint arXiv:2508.04660},
  year={2025}
}
@article{agrawal2025gepa,
  title={GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning},
  author={Agrawal, Lakshya A and Tan, Shangyin and Soylu, Dilara and Ziems, Noah and Khare, Rishi and Opsahl-Ong, Krista and Singhvi, Arnav and Shandilya, Herumb and Ryan, Michael J and Jiang, Meng and others},
  journal={arXiv preprint arXiv:2507.19457},
  year={2025}
}

Licensing & Compatibility

The license type is not specified in the provided README content. Compatibility is primarily with DSPy and requires specific hardware (multi-GPU) and software (CUDA, nvcc) configurations.

Limitations & Caveats

On certain GPU setups, NCCL errors may require setting NCCL_P2P_DISABLE=1 and NCCL_IB_DISABLE=1 for stability. Installing optional dependencies such as flash-attn can be time-consuming. Training is inherently resource-intensive due to its multi-GPU requirements.
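The NCCL workaround amounts to setting two environment variables before any distributed training starts, for example from Python (setting them in the shell before launch works equally well):

```python
import os

# Disable NCCL peer-to-peer transport and InfiniBand; the README notes
# these flags may be needed for stability on certain multi-GPU setups.
os.environ["NCCL_P2P_DISABLE"] = "1"
os.environ["NCCL_IB_DISABLE"] = "1"
```

Note these must be set before NCCL is initialized; setting them after training has started has no effect.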

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 33
  • Issues (30d): 7
  • Star History: 80 stars in the last 30 days

Explore Similar Projects

gpt-neox by EleutherAI

Framework for training large-scale autoregressive language models
Top 0.1% · 7k stars · Created 4 years ago · Updated 1 month ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 25 more.

ColossalAI by hpcaitech

AI system for large-scale parallel training
Top 0.0% · 41k stars · Created 4 years ago · Updated 3 weeks ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 27 more.