ML-Master by sjtu-sai-agents

AI agent for AI development and benchmarking

Created 6 months ago
323 stars

Top 84.3% on SourcePulse

View on GitHub
Project Summary

ML-Master is an AI-for-AI (AI4AI) agent that integrates exploration and reasoning into a coherent iterative framework for automating ML engineering. It targets researchers and engineers working on AutoML and reports significant performance gains, with state-of-the-art results on the MLE-Bench leaderboard.

How It Works

The core AI4AI methodology interleaves exploration and reasoning in a single iterative loop. An adaptive memory mechanism selectively captures and summarizes insights from each iteration and feeds them back into subsequent reasoning, so the exploration and reasoning components reinforce each other rather than trading off. A hedged sketch of this loop appears below.
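For intuition only, the following Python sketch shows what an explore-reason loop with an adaptive, summarizing memory can look like. The class and function names are hypothetical and do not reflect ML-Master's actual implementation.

    from dataclasses import dataclass, field

    @dataclass
    class AdaptiveMemory:
        """Keeps a bounded, summarized record of past attempts (illustrative)."""
        max_entries: int = 5
        insights: list = field(default_factory=list)

        def add(self, insight):
            # Selectively retain only the most recent insights, not the full history.
            self.insights = (self.insights + [insight])[-self.max_entries:]

        def summary(self):
            return "\n".join(self.insights)

    def run_agent(task, iterations=3):
        """Hypothetical loop: reason over summarized memory, explore a candidate,
        evaluate it, and store the distilled insight for the next iteration."""
        memory = AdaptiveMemory()
        best_score, best_candidate = float("-inf"), None
        for step in range(iterations):
            plan = f"plan for '{task}' given insights:\n{memory.summary()}"  # reasoning
            candidate = f"candidate solution {step} based on: {plan}"        # exploration
            score = step  # stand-in for a real evaluation (e.g., a validation metric)
            memory.add(f"iteration {step} scored {score}")                   # summarized insight
            if score > best_score:
                best_score, best_candidate = score, candidate
        return best_candidate

    print(run_agent("train a tabular classifier"))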

Quick Start & Requirements

  • Primary Install/Run: Installation is available via Docker (sjtuagents/ml-master:latest) or by cloning the repo and setting up a Python 3.12 Conda environment.
  • Prerequisites: Requires the MLE-Bench environment, the >2TB MLE-Bench dataset, and configured API keys for the LLMs used (DeepSeek, GPT-4o). Run launch_server.sh before run.sh; a hedged preflight sketch follows this list.
  • Resource Footprint: The MLE-Bench dataset exceeds 2TB. Setup involves cloning, environment creation, dependency installation, and dataset preparation.
  • Links: Refer to MLE-Bench documentation for environment setup and dataset preparation.
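As a preflight illustration, the Python snippet below checks the prerequisites described above before launch_server.sh and run.sh are started. The environment-variable names (DEEPSEEK_API_KEY, OPENAI_API_KEY) and the dataset path are assumptions for this sketch, not names confirmed by the project.

    import os
    import shutil
    from pathlib import Path

    # Hypothetical names; adjust to match your actual key variables and dataset location.
    REQUIRED_KEYS = ["DEEPSEEK_API_KEY", "OPENAI_API_KEY"]
    DATASET_DIR = Path("mle-bench")
    MIN_FREE_TB = 2.0  # the prepared MLE-Bench dataset exceeds 2 TB

    def preflight():
        """Return a list of problems that would block launch_server.sh / run.sh."""
        problems = []
        for key in REQUIRED_KEYS:
            if not os.environ.get(key):
                problems.append(f"missing environment variable: {key}")
        if not DATASET_DIR.is_dir():
            problems.append(f"dataset directory not found: {DATASET_DIR.resolve()}")
        free_tb = shutil.disk_usage(".").free / 1024**4
        if free_tb < MIN_FREE_TB:
            problems.append(f"only {free_tb:.2f} TB free; MLE-Bench needs more than {MIN_FREE_TB} TB")
        return problems

    for issue in preflight():
        print("WARNING:", issue)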

Highlighted Details

  • ML-Master 2.0 achieved #1 on the MLE-Bench Leaderboard with 56.44% overall performance (+92.7% improvement).
  • Notable gains include +152.2% in Medium Complexity and +72.8% in High Complexity tasks.
  • Demonstrates strong runtime efficiency, reaching these results in 12 hours (50% of the time budget).

Maintenance & Community

Supported by SJTU SAI with infrastructure from EigenAI. Recent updates include ML-Master 2.0, a feature-dev branch, and a Docker image. Community interaction is encouraged via a WeChat group.

Licensing & Compatibility

No open-source license is specified in the README, so licensing should be clarified with the maintainers before commercial use or integration.

Limitations & Caveats

Using closed-source models as coding models may require disabling steerable reasoning (agent.steerable_reasoning=false), potentially degrading performance. Specific LLM API configurations are needed, including custom tag support. Handling the >2TB MLE-Bench dataset is a significant prerequisite.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 3

Star History

122 stars in the last 30 days
