Environment for training software engineering agents
SWE-Gym provides an open environment for training and evaluating software engineering agents and verifiers, addressing the limitations of existing benchmarks by incorporating rigorous verification and real-world repository tasks. It is designed for researchers and developers working on AI for software development, enabling scalable improvements in agent performance.
How It Works
SWE-Gym integrates real-world Python tasks sourced from 11 repositories, each with an executable environment and test-based verification. This setup lets Large Language Models (LLMs) be trained as agents that interact with the environment, generate code, and receive feedback through test results. The environment supports self-improvement through rejection sampling fine-tuning, and it enables inference-time scaling via learned verifiers that select the most promising of several candidate solutions.
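The two mechanisms named above can be illustrated with a minimal sketch. It is not SWE-Gym's actual code: the Trajectory class, the rejection_sample and best_of_n helpers, and the toy score table are all hypothetical names introduced here for illustration. It assumes each agent attempt records whether the task's tests passed, and that a verifier is any function mapping an attempt to a score.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Trajectory:
    """One agent attempt at a task (hypothetical structure, not SWE-Gym's)."""
    task_id: str
    patch: str
    tests_passed: bool


def rejection_sample(trajectories: Sequence[Trajectory]) -> List[Trajectory]:
    """Keep only attempts whose patch passed the tests.

    The surviving attempts would serve as fine-tuning data.
    """
    return [t for t in trajectories if t.tests_passed]


def best_of_n(
    candidates: Sequence[Trajectory],
    verifier: Callable[[Trajectory], float],
) -> Optional[Trajectory]:
    """Return the candidate with the highest verifier score, or None if empty."""
    if not candidates:
        return None
    return max(candidates, key=verifier)


attempts = [
    Trajectory("task-1", "patch A", tests_passed=False),
    Trajectory("task-1", "patch B", tests_passed=True),
    Trajectory("task-2", "patch C", tests_passed=True),
]

# Training time: filter sampled attempts by test outcome.
sft_data = rejection_sample(attempts)

# Inference time: a toy score table stands in for a learned verifier.
toy_scores = {"patch A": 0.2, "patch B": 0.9}
best = best_of_n(attempts[:2], lambda t: toy_scores[t.patch])
```

In this sketch the test result is the only filter at training time, while the verifier is used only at inference time, when test outcomes for the target task are not available to the agent.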
Quick Start & Requirements
Images are available under the xingyaoww/sweb.eval.x86_64 prefix on Docker Hub.
Highlighted Details
Maintenance & Community
The project is associated with researchers from UC Berkeley, UIUC, CMU, and Apple. Further community engagement details are not explicitly provided in the README.
Licensing & Compatibility
The README does not explicitly state the license. The project is presented as open-source, but specific licensing terms for commercial use or closed-source linking are not detailed.
Limitations & Caveats
The current results are primarily bottlenecked by training and inference compute. Promising scaling trends are observed, but further improvements depend on additional computational resources. The project accompanies an ICML 2025 paper, which suggests it is a recent development.