autoresearch-mlx by trevin-creator

Autonomous AI research loops for Apple Silicon

Created 1 month ago
1,305 stars

Top 30.3% on SourcePulse

Project Summary

This project ports Andrej Karpathy's autonomous AI research loops to Apple Silicon using the MLX framework, eliminating the need for PyTorch or CUDA. It targets researchers and power users seeking automated, hardware-optimized model configuration discovery within strict time constraints, enabling efficient on-device AI experimentation.

How It Works

The system preserves Karpathy's core design: fixed 5-minute experiment cycles, a single mutable train.py script, one primary metric (val_bpb, validation bits per byte), and Git for experiment management (commit on improvement, revert otherwise). By leveraging MLX, it runs natively on Apple Silicon and uses unified memory for efficient computation, with no external dependencies such as PyTorch or CUDA. This enables rapid, iterative AI research directly on Mac hardware.
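The commit/revert loop described above can be sketched as follows. This is an illustrative assumption of how such a loop fits together, not the repository's actual implementation; `run_experiment`, the `git` wrapper, and the accept/reject rule are all hypothetical names:

```python
import subprocess
import time

BUDGET_SECONDS = 300  # fixed 5-minute wall-clock budget per experiment

def should_commit(elapsed_s, val_bpb, best_val_bpb, budget_s=BUDGET_SECONDS):
    # An experiment is kept only if it finished within the budget
    # AND improved the primary metric (lower val_bpb is better).
    return elapsed_s <= budget_s and val_bpb < best_val_bpb

def git(*args):
    # Thin wrapper around the git CLI; experiments live as commits.
    subprocess.run(["git", *args], check=True)

def research_loop(run_experiment, n_cycles, best_val_bpb=float("inf")):
    """Run n_cycles experiments; commit improvements, revert regressions.

    run_experiment() is expected to mutate train.py, train the model,
    and return the resulting val_bpb on the validation set.
    """
    for _ in range(n_cycles):
        start = time.monotonic()
        val_bpb = run_experiment()
        elapsed = time.monotonic() - start
        if should_commit(elapsed, val_bpb, best_val_bpb):
            best_val_bpb = val_bpb
            git("commit", "-am", f"val_bpb={val_bpb:.4f}")  # keep the change
        else:
            git("checkout", "--", "train.py")  # discard the mutation
    return best_val_bpb
```

An agent driving the loop only needs to supply `run_experiment`; the time budget and single-metric acceptance rule do the rest.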

Quick Start & Requirements

  • Requirements: Apple Silicon Mac (M1/M2/M3/M4), Python 3.10+, uv package manager.
  • Installation:
    curl -LsSf https://astral.sh/uv/install.sh | sh
    uv sync
    
  • Execution:
    uv run prepare.py  # Data prep, tokenizer (one-time)
    uv run train.py    # Run a single experiment (~7 min cycle)
    
  • Autonomous Research: Point an agent (e.g., Claude Code) at program.md.
  • Links: No explicit documentation or demo links provided.

Highlighted Details

  • Native Apple Silicon execution via MLX, removing PyTorch/CUDA dependencies.
  • Autonomous research loops with a strict 5-minute wall-clock budget per experiment.
  • Key finding: Apple Silicon's fixed-time throughput favors smaller, faster-training models (more optimizer steps) over larger parameter counts.
  • Demonstrates hardware-specific optimization, with different machines converging on distinct optimal configurations.
  • Includes an optional Muon optimizer, beneficial for memory-constrained hardware.
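The fixed-time-throughput finding above comes down to simple arithmetic: under a fixed wall-clock budget, total optimizer steps are inversely proportional to per-step latency, so a smaller, faster-stepping model sees far more updates. The per-step latencies below are illustrative assumptions, not measured values from the project:

```python
BUDGET_S = 300.0  # fixed wall-clock budget per experiment (5 minutes)

def optimizer_steps(step_time_s, budget_s=BUDGET_S):
    # Steps completed in the budget = budget / per-step latency,
    # rounded down to whole steps.
    return int(budget_s // step_time_s)

# Hypothetical per-step latencies on the same machine:
small_model_steps = optimizer_steps(0.25)  # fast steps -> 1200 updates
large_model_steps = optimizer_steps(1.0)   # slow steps -> 300 updates
```

With a 4x slower step, the larger model gets only a quarter of the optimizer updates, which is why fixed-time search tends to converge on smaller configurations per machine.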

Maintenance & Community

The project acknowledges Andrej Karpathy for the core concept and references related MLX projects. No specific community channels (e.g., Discord, Slack) or detailed maintenance information are provided in the README.

Licensing & Compatibility

The project is released under the MIT License, preserving the original copyright notice. This license generally permits commercial use and integration into closed-source projects.

Limitations & Caveats

The project relies exclusively on MLX, whose performance characteristics differ from those of CUDA/PyTorch. MFU (model FLOPs utilization) reporting is a placeholder, since Apple Silicon FLOPs figures are not directly comparable to H100 baselines. The evaluation token budget is reduced for faster iteration, which may affect full-scale evaluation fidelity. The Muon optimizer's effectiveness is hardware-dependent.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 13
  • Issues (30d): 3
  • Star History: 644 stars in the last 30 days
