Ziems/arbor: Framework for optimizing DSPy programs with RL
Summary
Arbor is a framework designed to optimize DSPy programs using Reinforcement Learning (RL). It targets developers seeking to enhance the performance and efficiency of their language model programs by automating the fine-tuning process. The primary benefit is achieving superior program outputs through advanced RL techniques.
How It Works
Arbor uses Group Relative Policy Optimization (GRPO) to fine-tune the language models underlying DSPy programs. It integrates with DSPy via a custom ArborProvider, allowing the RL loop to iteratively improve a program's prompts and parameters. A user-defined reward function guides the optimization, with the goal of discovering more effective program configurations than manual tuning or standard fine-tuning. Parameter-efficient fine-tuning via LoRA is supported.
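The reward-driven loop can be sketched roughly as follows. This is a hedged sketch, not Arbor's documented API: the GRPO import path, its keyword arguments, and the toy exact-match reward are assumptions based on DSPy's general optimizer conventions and may differ between versions.

```python
import dspy
# Assumption: DSPy exposes an experimental GRPO optimizer at this path;
# the import path and keyword arguments may differ between DSPy versions.
from dspy.teleprompt.grpo import GRPO

def exact_match_reward(example, prediction, trace=None):
    # Reward function guiding the RL optimization:
    # 1.0 for an exact answer match, 0.0 otherwise (toy example).
    return float(prediction.answer == example.answer)

# A simple DSPy program whose prompts/weights the optimizer will improve.
program = dspy.ChainOfThought("question -> answer")

trainset = [
    dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question"),
]

# Assumes dspy.configure(lm=...) has already been pointed at a running Arbor
# server (see Quick Start below).
optimizer = GRPO(metric=exact_match_reward)  # assumption: metric kwarg
optimized_program = optimizer.compile(program, trainset=trainset)
```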
Quick Start & Requirements
Install with uv pip install -U arbor-ai or pip install -U arbor-ai. For the latest DSPy features, install DSPy from source: uv pip install -U git+https://github.com/stanfordnlp/dspy.git@main. A CUDA toolkit providing nvcc must be installed. flash-attn can optionally be installed for accelerated inference, but its installation may take over 15 minutes.
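Once the package is installed and an Arbor server is running locally, DSPy can be pointed at it roughly as follows. This is a hedged sketch based on the DSPy/Arbor integration pattern; the ArborProvider import path, the "openai/arbor:" model prefix, the port, and the model name are assumptions and may differ between versions.

```python
import dspy
# Assumption: import path and model prefix follow DSPy's local Arbor integration.
from dspy.clients.lm_local_arbor import ArborProvider

local_lm = dspy.LM(
    model="openai/arbor:Qwen/Qwen2.5-7B-Instruct",  # hypothetical local model choice
    provider=ArborProvider(),
    api_base="http://localhost:7453/v1/",  # hypothetical port of a locally running Arbor server
    api_key="arbor",
    temperature=0.7,
)
dspy.configure(lm=local_lm)
```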
Highlighted Details
flash-attn for potential inference speedups.
Maintenance & Community
The project acknowledges contributions from Will Brown's Verifiers library and the Hugging Face TRL library. Community support is available via dedicated Discord servers for Arbor and DSPy. No specific maintainer information, sponsorship details, or roadmap links are provided in the README.
The following research papers are cited as foundational work:
@article{ziems2025multi,
title={Multi-module GRPO: Composing policy gradients and prompt optimization for language model programs},
author={Ziems, Noah and Soylu, Dilara and Agrawal, Lakshya A and Miller, Isaac and Lai, Liheng and Qian, Chen and Song, Kaiqiang and Jiang, Meng and Klein, Dan and Zaharia, Matei and others},
journal={arXiv preprint arXiv:2508.04660},
year={2025}
}
@article{agrawal2025gepa,
title={GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning},
author={Agrawal, Lakshya A and Tan, Shangyin and Soylu, Dilara and Ziems, Noah and Khare, Rishi and Opsahl-Ong, Krista and Singhvi, Arnav and Shandilya, Herumb and Ryan, Michael J and Jiang, Meng and others},
journal={arXiv preprint arXiv:2507.19457},
year={2025}
}
Licensing & Compatibility
The license type is not specified in the provided README content. Compatibility is primarily with DSPy and requires specific hardware (multi-GPU) and software (CUDA, nvcc) configurations.
Limitations & Caveats
Potential NCCL errors may require specific environment variable configurations (NCCL_P2P_DISABLE=1, NCCL_IB_DISABLE=1) for stability on certain GPU setups. The installation of optional dependencies like flash-attn can be time-consuming. Training is inherently resource-intensive due to its multi-GPU requirements.
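If those NCCL errors appear, the workaround can be applied in the environment of the process that launches Arbor. A minimal sketch is shown below; the variables are standard NCCL settings and can equivalently be exported in the shell before starting the server.

```python
import os

# Disable NCCL peer-to-peer and InfiniBand transports, which can hang or error
# on some multi-GPU setups; set these before any CUDA/NCCL initialization.
os.environ["NCCL_P2P_DISABLE"] = "1"
os.environ["NCCL_IB_DISABLE"] = "1"
```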