Toolkit for training software engineering agents
SWE-smith is a toolkit for generating synthetic data and environments to train Software Engineering (SWE) agents. It enables users to transform GitHub repositories into "SWE-gyms," create diverse tasks like program repair, and fine-tune large language models for improved software development performance. The primary benefit is the ability to scale data generation for robust SWE agent training.
How It Works
SWE-smith uses Docker to create an isolated execution environment for each GitHub repository. It synthesizes task instances by identifying code changes that break existing unit tests and generating issue text that describes each resulting bug. This process produces large, diverse datasets and environments for training and evaluating SWE agents.
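The core idea, keeping only code changes that break tests that previously passed, can be illustrated with a small self-contained sketch. This is not SWE-smith's actual API: `clamp`, `CANDIDATE_EDITS`, `run_tests`, `TaskInstance`, and `synthesize_tasks` are hypothetical names invented for illustration. The sketch also runs everything in-process rather than in a Docker container, and uses a fixed template where a real pipeline would generate the issue text.

```python
# Hypothetical sketch of test-breaking task synthesis.
# None of these names come from SWE-smith; they are illustrative assumptions.
from dataclasses import dataclass

ORIGINAL_SOURCE = '''
def clamp(value, low, high):
    if value < low:
        return low
    if value > high:
        return high
    return value
'''

# Candidate edits as (old, new) text substitutions. They are hard-coded here
# only to keep the example small.
CANDIDATE_EDITS = [
    ("value < low", "value <= low"),   # harmless: every test still passes
    ("value > high", "value < high"),  # breaks the upper-bound check
    ("return low", "return high"),     # breaks the lower-bound result
]


def run_tests(source):
    """Execute the source and return the names of the unit tests that fail."""
    namespace = {}
    exec(source, namespace)
    clamp = namespace["clamp"]
    tests = {
        "test_inside": lambda: clamp(5, 0, 10) == 5,
        "test_below": lambda: clamp(-3, 0, 10) == 0,
        "test_above": lambda: clamp(42, 0, 10) == 10,
    }
    failed = []
    for name, check in tests.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a failing test
        if not ok:
            failed.append(name)
    return failed


@dataclass
class TaskInstance:
    buggy_source: str
    failing_tests: list
    issue_text: str


def synthesize_tasks(source, edits):
    """Keep only the edits that turn at least one passing test into a failure."""
    assert run_tests(source) == [], "baseline tests must pass before mutating"
    tasks = []
    for old, new in edits:
        mutated = source.replace(old, new, 1)
        failing = run_tests(mutated)
        if failing:
            # Placeholder issue text built from a template.
            issue = "clamp() returns unexpected values; failing: " + ", ".join(failing)
            tasks.append(TaskInstance(mutated, failing, issue))
    return tasks


tasks = synthesize_tasks(ORIGINAL_SOURCE, CANDIDATE_EDITS)
for task in tasks:
    print(task.issue_text)
```

The first candidate edit is discarded because it leaves every test passing. The other two become task instances whose failing tests define what a correct repair must fix.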
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats