ML-Master by sjtu-sai-agents

AI agent for AI development and benchmarking

Created 6 months ago
323 stars

Top 84.3% on SourcePulse

View on GitHub
Project Summary

ML-Master is an AI-for-AI (AI4AI) agent that integrates exploration and reasoning into a coherent iterative framework for automating ML engineering. It targets researchers and engineers working on AutoML and reports significant performance gains, with state-of-the-art results on the MLE-Bench leaderboard.

How It Works

The core AI4AI methodology interleaves exploration and reasoning in a single iterative loop. An adaptive memory mechanism selectively captures and summarizes insights from each iteration and feeds them back into subsequent reasoning, so the exploration and reasoning components reinforce each other rather than trading off. A hedged sketch of this loop appears below.
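For intuition only, the following Python sketch shows what an explore-reason loop with an adaptive, summarizing memory can look like. The class and function names are hypothetical and do not reflect ML-Master's actual implementation.

    from dataclasses import dataclass, field

    @dataclass
    class AdaptiveMemory:
        """Keeps a bounded, summarized record of past attempts (illustrative)."""
        max_entries: int = 5
        insights: list = field(default_factory=list)

        def add(self, insight):
            # Selectively retain only the most recent insights, not the full history.
            self.insights = (self.insights + [insight])[-self.max_entries:]

        def summary(self):
            return "\n".join(self.insights)

    def run_agent(task, iterations=3):
        """Hypothetical loop: reason over summarized memory, explore a candidate,
        evaluate it, and store the distilled insight for the next iteration."""
        memory = AdaptiveMemory()
        best_score, best_candidate = float("-inf"), None
        for step in range(iterations):
            plan = f"plan for '{task}' given insights:\n{memory.summary()}"  # reasoning
            candidate = f"candidate solution {step} based on: {plan}"        # exploration
            score = step  # stand-in for a real evaluation (e.g., a validation metric)
            memory.add(f"iteration {step} scored {score}")                   # summarized insight
            if score > best_score:
                best_score, best_candidate = score, candidate
        return best_candidate

    print(run_agent("train a tabular classifier"))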

Quick Start & Requirements

  • Primary Install/Run: Installation is available via Docker (sjtuagents/ml-master:latest) or by cloning the repo and setting up a Python 3.12 Conda environment.
  • Prerequisites: Requires the MLE-Bench environment, the >2TB MLE-Bench dataset, and configured API keys for the LLMs used (DeepSeek, GPT-4o). Run launch_server.sh before run.sh; a hedged preflight sketch follows this list.
  • Resource Footprint: The MLE-Bench dataset exceeds 2TB. Setup involves cloning, environment creation, dependency installation, and dataset preparation.
  • Links: Refer to MLE-Bench documentation for environment setup and dataset preparation.
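As a preflight illustration, the Python snippet below checks the prerequisites described above before launch_server.sh and run.sh are started. The environment-variable names (DEEPSEEK_API_KEY, OPENAI_API_KEY) and the dataset path are assumptions for this sketch, not names confirmed by the project.

    import os
    import shutil
    from pathlib import Path

    # Hypothetical names; adjust to match your actual key variables and dataset location.
    REQUIRED_KEYS = ["DEEPSEEK_API_KEY", "OPENAI_API_KEY"]
    DATASET_DIR = Path("mle-bench")
    MIN_FREE_TB = 2.0  # the prepared MLE-Bench dataset exceeds 2 TB

    def preflight():
        """Return a list of problems that would block launch_server.sh / run.sh."""
        problems = []
        for key in REQUIRED_KEYS:
            if not os.environ.get(key):
                problems.append(f"missing environment variable: {key}")
        if not DATASET_DIR.is_dir():
            problems.append(f"dataset directory not found: {DATASET_DIR.resolve()}")
        free_tb = shutil.disk_usage(".").free / 1024**4
        if free_tb < MIN_FREE_TB:
            problems.append(f"only {free_tb:.2f} TB free; MLE-Bench needs more than {MIN_FREE_TB} TB")
        return problems

    for issue in preflight():
        print("WARNING:", issue)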

Highlighted Details

  • ML-Master 2.0 achieved #1 on the MLE-Bench Leaderboard with 56.44% overall performance (+92.7% improvement).
  • Notable gains include +152.2% in Medium Complexity and +72.8% in High Complexity tasks.
  • Demonstrates strong runtime efficiency, reaching these results in 12 hours (50% of the time budget).

Maintenance & Community

Supported by SJTU SAI with infrastructure from EigenAI. Recent updates include ML-Master 2.0, a feature-dev branch, and a Docker image. Community interaction is encouraged via a WeChat group.

Licensing & Compatibility

No open-source license is specified in the README, so licensing should be clarified with the maintainers before commercial use or integration.

Limitations & Caveats

Using closed-source models as coding models may require disabling steerable reasoning (agent.steerable_reasoning=false), potentially degrading performance. Specific LLM API configurations are needed, including custom tag support. Handling the >2TB MLE-Bench dataset is a significant prerequisite.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 3

Star History

122 stars in the last 30 days
