aideml by WecoAI

ML engineering agent for automated AI R&D, surpassing human experts

created 1 year ago
969 stars

Top 38.8% on sourcepulse

Project Summary

AIDE is an LLM-powered agent that automates machine learning R&D: given a natural-language description of an ML task, it generates, tests, and refines Python code. It is aimed at ML engineers and data scientists who want to shorten experimentation cycles through automated code generation and evaluation.

How It Works

AIDE employs a "Solution Space Tree Search" methodology. It first generates several initial solution drafts, then iteratively refines them based on performance feedback. Three components drive each iteration: a Solution Generator that creates new code or modifies existing code, an Evaluator that runs each solution and parses its logs for metrics, and a Base Solution Selector that picks the most promising candidate to build on next. Each refinement becomes a child of the solution it was derived from, so the search forms a tree that AIDE explores toward progressively better-performing ML code.
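
The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not AIDE's actual implementation: the Node class, the tree_search function, and the generate/evaluate callables are hypothetical names, the greedy "pick the best-scoring node" rule is an assumed stand-in for the Base Solution Selector, and a higher metric is assumed to be better.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    code: str
    metric: Optional[float] = None  # None: not yet evaluated, or the run failed
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    generate: Callable[[Optional[Node]], str],
    evaluate: Callable[[str], Optional[float]],
    num_drafts: int = 3,
    steps: int = 10,
) -> Optional[Node]:
    """Greedy tree search over solutions; assumes higher metric is better.

    generate(None) drafts a fresh solution; generate(node) refines node.code.
    evaluate(code) runs the code and returns its metric, or None on failure.
    """
    nodes: List[Node] = []

    # Phase 1: create the initial drafts (roots of the tree).
    for _ in range(num_drafts):
        draft = Node(code=generate(None))
        draft.metric = evaluate(draft.code)
        nodes.append(draft)

    # Phase 2: repeatedly select a base solution and refine it.
    for _ in range(steps):
        scored = [n for n in nodes if n.metric is not None]
        if scored:
            # Assumed selection rule: build on the best-scoring node so far.
            base = max(scored, key=lambda n: n.metric)
            child = Node(code=generate(base), parent=base)
            base.children.append(child)
        else:
            # No working solution yet, so draft a new root instead.
            child = Node(code=generate(None))
        child.metric = evaluate(child.code)
        nodes.append(child)

    scored = [n for n in nodes if n.metric is not None]
    return max(scored, key=lambda n: n.metric) if scored else None
```

In a real agent, generate would call an LLM and evaluate would execute the script and parse its logs; here both are left as plain callables so the control flow can be exercised with toy stand-ins.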

Quick Start & Requirements

  • Install: pip install -U aideml
  • Prerequisites: Python >= 3.10, unzip utility, OpenAI or Anthropic API key.
  • Web UI: Navigate to aide/webui and run streamlit run app.py.
  • CLI: aide data_dir="<path>" goal="<description>" eval="<metric>"
  • Local LLMs: Supports local LLMs via OpenAI-compatible APIs (e.g., Ollama) by setting OPENAI_BASE_URL.
  • Docker: Build image with docker build -t aide . and run with mounted volumes for logs/workspaces.
  • Docs: Paper, Blog
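
For scripted runs, the CLI and local-LLM bullets above can be combined in a small wrapper. The sketch below is only an illustration: build_aide_command, build_env, and run_aide are hypothetical helper names, and the only things taken from the project are the key=value argument form and the OPENAI_BASE_URL variable. Arguments are passed as a list, so no shell quoting is needed.

```python
import os
import subprocess
from typing import Dict, List, Optional


def build_aide_command(data_dir: str, goal: str, eval_metric: str) -> List[str]:
    """Assemble the key=value arguments shown in the CLI bullet."""
    return ["aide", f"data_dir={data_dir}", f"goal={goal}", f"eval={eval_metric}"]


def build_env(base_url: Optional[str] = None) -> Dict[str, str]:
    """Copy the current environment, optionally pointing at a local endpoint."""
    env = dict(os.environ)
    if base_url is not None:
        # Used when targeting an OpenAI-compatible local server such as Ollama.
        env["OPENAI_BASE_URL"] = base_url
    return env


def run_aide(
    data_dir: str,
    goal: str,
    eval_metric: str,
    base_url: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run the aide CLI; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        build_aide_command(data_dir, goal, eval_metric),
        env=build_env(base_url),
        check=True,
    )
```

A call such as run_aide("data/house_prices", "Predict the sale price", "RMSE") assumes aideml is installed and an API key is configured; the directory, goal, and metric here are placeholder values.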

Highlighted Details

  • Outperforms 50% of Kaggle participants on average in a benchmark of 60 competitions.
  • Achieved four times as many medals as the runner-up agent in OpenAI's MLE-bench (75 Kaggle tasks).
  • Generalizes to AI R&D tasks like Triton kernel optimization and GPT-2 fine-tuning, surpassing human experts in METR's RE-Bench.
  • Provides iterative optimization, debugging, evaluation, and visualization of the solution tree.

Maintenance & Community

  • Project is actively developed by WecoAI.
  • Contribution guide is noted as "coming soon."

Licensing & Compatibility

  • The README does not explicitly state a license. The repository structure suggests Apache 2.0, but this is unverified; confirm against the repository itself before reuse.

Limitations & Caveats

  • The project cites a 2025 paper, so it may be a research prototype or may have changed significantly in recent releases.
  • No explicit mention of supported operating systems beyond general Python/Docker compatibility.

Health Check

  • Last commit: 6 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 5
  • Issues (30d): 5

Star History

  • 104 stars in the last 90 days
