Repo2RLEnv by huggingface

GitHub repos as verifiable RL environments for agent training

Created 1 month ago
415 stars

Top 70.2% on SourcePulse

Project Summary

Repo2RLEnv turns GitHub repositories into verifiable reinforcement learning (RL) environments for training and evaluating AI agents on code-related tasks. Aimed at ML engineers and researchers, it generates standardized, runnable datasets directly from source code, commits, and pull requests, which makes reproducible benchmarks easier to build.

How It Works

Synthesis pipelines read a repository and generate RL environments from it, extracting tasks that have concrete, solvable objectives and programmatic rewards. Two pipelines are stable: pr_diff mines merged pull-request diffs into text-only tasks verified by LLM judges, and pr_runtime evaluates solutions by running the repository's actual test suite inside a Docker sandbox. Outputs follow the Harbor format, so compatible RL runtimes and agent harnesses can consume them without additional glue code.
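To make the idea of a programmatic reward from a test suite concrete, here is a minimal sketch of a binary reward derived from a test command's exit status. It is not Repo2RLEnv's implementation: the function name reward_from_tests, its parameters, and the pass/fail scoring rule are assumptions for illustration, and the sketch runs the command directly rather than inside a Docker sandbox.

```python
import subprocess
import sys


def reward_from_tests(command, cwd=None, timeout_s=600):
    """Hypothetical helper: 1.0 if the test command exits with status 0, else 0.0."""
    try:
        result = subprocess.run(
            command,
            cwd=cwd,
            capture_output=True,  # keep test output out of the caller's terminal
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable test run is scored as a failure.
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    # Stand-ins for a real test command such as a repository's test runner.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(reward_from_tests(passing))
    print(reward_from_tests(failing))
```

A binary score is the simplest choice; a real pipeline could just as well report the fraction of tests passed, which this sketch does not attempt.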

Quick Start & Requirements

Install with uv add repo2rlenv or pip install repo2rlenv, or run it one-shot with uvx repo2rlenv --help. Authenticate with gh auth login and huggingface-cli login, or set the GITHUB_TOKEN and HF_TOKEN environment variables. Generation and agent execution need an LLM API key (e.g., ANTHROPIC_API_KEY or OPENAI_API_KEY). Sandbox-based pipelines require Docker. A full walkthrough is available at docs/quickstart.md.
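The prerequisites above can be checked before a first run with a small preflight script. This is a sketch, not part of Repo2RLEnv: the preflight function is hypothetical, it only inspects environment variables and PATH, and it cannot see credentials stored by gh auth login or huggingface-cli login, so a reported problem may be a false alarm if you authenticated that way.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Hypothetical check: return a list of likely-missing prerequisites."""
    env = os.environ if env is None else env
    problems = []

    # Tokens named in the quick start; a CLI login is an equally valid alternative.
    for name in ("GITHUB_TOKEN", "HF_TOKEN"):
        if not env.get(name):
            problems.append(f"{name} is not set (a CLI login may cover this)")

    # The source lists these two keys as examples; other providers may apply.
    if not any(env.get(key) for key in ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")):
        problems.append("no LLM API key found (ANTHROPIC_API_KEY or OPENAI_API_KEY)")

    # Docker is only needed for sandbox-based pipelines.
    if which("docker") is None:
        problems.append("docker not found on PATH (needed for sandbox pipelines)")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```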

Highlighted Details

  • Features two stable pipelines: pr_diff (text-based verification) and pr_runtime (test-suite execution).
  • Generates verifiable tasks using test_execution (repo tests) or diff_similarity (LLM-judged diff comparison) as reward signals.
  • Natively publishes datasets to the Hugging Face Hub in Harbor format for direct consumption.
  • Supports private repositories with automated authentication resolution.
  • Employs content-addressing for dataset artifacts, ensuring reproducibility.
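Content addressing means an artifact's identifier is derived from its bytes, so identical content always yields the same identifier and any change yields a different one. The sketch below illustrates the general technique with SHA-256 over canonical JSON. The summary does not describe the hashing scheme Repo2RLEnv actually uses, so the content_address function and the example record fields are assumptions.

```python
import hashlib
import json


def content_address(record):
    """Hypothetical helper: SHA-256 hex digest of a canonical JSON encoding."""
    # Sorted keys and fixed separators make the encoding independent of
    # dictionary insertion order and whitespace.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Illustrative task records; the field names are made up for this example.
    task_a = {"pipeline": "pr_diff", "instruction": "Fix the off-by-one error"}
    task_b = {"instruction": "Fix the off-by-one error", "pipeline": "pr_diff"}
    print(content_address(task_a) == content_address(task_b))
```

Because the identifier depends only on content, two runs that produce the same task yield the same address, which is what makes a dataset artifact checkable for reproducibility.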

Maintenance & Community

The provided README does not detail specific contributors, sponsorships, or community channels (e.g., Discord, Slack).

Licensing & Compatibility

The project is licensed under Apache 2.0, which is generally compatible with commercial use. The datasets redistribute public commits for ML research under fair use, but the original PR/commit contents remain under their respective source-repository licenses, so the Apache 2.0 terms do not extend to that content.

Limitations & Caveats

Four pipelines (commit_runtime, cve_patches, code_instruct, equivalence_tests) are designated experimental, so their interfaces and output quality may still change. Setting up the Docker environment for sandbox pipelines can be resource-intensive, and synthesis and verification steps incur LLM API costs.

Health Check

  • Last commit: 6 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 23
  • Issues (30d): 8
  • Star history: 409 stars in the last 30 days
