Repo2RLEnv (huggingface)
GitHub repos as verifiable RL environments for agent training
Repo2RLEnv transforms GitHub repositories into verifiable Reinforcement Learning (RL) environments, enabling automated training and evaluation of AI agents for code-related tasks. It targets ML engineers and researchers by generating standardized, runnable datasets directly from source code, commits, and pull requests, streamlining the creation of reproducible AI development benchmarks.
How It Works
The project employs synthesis pipelines to read repositories and generate RL environments. These pipelines extract tasks with concrete, solvable objectives and programmatic rewards. Stable pipelines include pr_diff, which mines merged pull-request diffs into text-only tasks verified by LLM judges, and pr_runtime, which uses the repository's actual test suite within a Docker sandbox for robust evaluation. Outputs adhere to the Harbor format, ensuring seamless integration with compatible RL runtimes and agent harnesses without additional glue code.
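As a rough illustration of what a programmatic, test-suite-based reward can look like, the sketch below maps a test command's exit status to a binary reward. It is not Repo2RLEnv's API: the function name `reward_from_tests`, its parameters, and the binary 0/1 scoring are assumptions made for this example, and the command runs directly on the host rather than inside a Docker sandbox as pr_runtime is described as doing.

```python
import subprocess
import sys
from typing import Sequence


def reward_from_tests(test_cmd: Sequence[str], cwd: str = ".",
                      timeout_s: float = 600.0) -> float:
    """Return 1.0 if the test command exits with status 0, otherwise 0.0.

    Hypothetical helper for illustration only; a real pipeline would run
    the command inside an isolated sandbox instead of on the host.
    """
    try:
        result = subprocess.run(
            list(test_cmd),
            cwd=cwd,
            capture_output=True,  # keep test output out of the caller's stdout
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a command that cannot be started counts as a failure.
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    # Stand-ins for a real test command such as ["pytest", "-q"].
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(reward_from_tests(passing))
    print(reward_from_tests(failing))
```

A binary signal is the simplest choice; a finer-grained variant could instead score the fraction of passing tests, at the cost of parsing the test runner's output.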
Quick Start & Requirements
Installation options include uv add repo2rlenv, uvx repo2rlenv --help (one-shot), or pip install repo2rlenv. Authentication requires gh auth login and huggingface-cli login, or setting GITHUB_TOKEN and HF_TOKEN environment variables. LLM API keys (e.g., ANTHROPIC_API_KEY, OPENAI_API_KEY) are necessary for generation and agent execution. Docker is a prerequisite for sandbox-based pipelines. A full walkthrough is available at docs/quickstart.md.
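Before running a pipeline, it can help to confirm that the prerequisites listed above are in place. The following is a minimal preflight sketch, not part of Repo2RLEnv: the function name `missing_prerequisites` and its messages are hypothetical, and it only inspects the environment-variable route, so a missing GITHUB_TOKEN or HF_TOKEN is harmless if you authenticated with gh auth login or huggingface-cli login instead.

```python
import os
import shutil
from typing import List, Mapping, Optional

# Environment variable names taken from the requirements described above.
TOKEN_VARS = ("GITHUB_TOKEN", "HF_TOKEN")
LLM_KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def missing_prerequisites(env: Optional[Mapping[str, str]] = None,
                          need_docker: bool = True) -> List[str]:
    """Return human-readable notes for prerequisites that look missing.

    Hypothetical helper: it checks environment variables and PATH only,
    and cannot see credentials stored by the gh or huggingface-cli tools.
    """
    env = os.environ if env is None else env
    problems: List[str] = []

    for name in TOKEN_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set (fine if you used the CLI login)")

    # At least one LLM provider key is assumed to be enough.
    if not any(env.get(name) for name in LLM_KEY_VARS):
        problems.append("no LLM API key set: " + ", ".join(LLM_KEY_VARS))

    # Sandbox-based pipelines need the docker executable on PATH.
    if need_docker and shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")

    return problems


if __name__ == "__main__":
    for note in missing_prerequisites():
        print("warning:", note)
```

Passing `need_docker=False` skips the Docker check for text-only pipelines that do not use a sandbox.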
Highlighted Details
Stable pipelines: pr_diff (text-based verification) and pr_runtime (test-suite execution).
Reward signals: test_execution (repo tests) or diff_similarity (LLM-judged diff comparison).
Maintenance & Community
The provided README does not detail specific contributors, sponsorships, or community channels (e.g., Discord, Slack).
Licensing & Compatibility
The project is licensed under Apache 2.0, which is generally compatible with commercial use. The generated datasets redistribute public commits for ML research under fair use, but the original PR and commit contents remain under their respective source-repository licenses.
Limitations & Caveats
Four pipelines (commit_runtime, cve_patches, code_instruct, equivalence_tests) are designated as experimental, with potentially evolving interfaces and output quality. Setting up the Docker environment for sandbox pipelines can be resource-intensive, and LLM API costs are incurred during synthesis or verification steps.