multiautoresearch by burtenshaw

Autonomous AI research lab for automated experimentation

Created 1 month ago
290 stars

Top 90.7% on SourcePulse

Project Summary

This repository provides an autonomous, open-source AI lab that lets researchers and power users automate paper research, experiment management, and GPU execution. It streamlines AI development workflows with a multi-agent system powered by OpenCode and Hermes, enabling self-contained, repeatable research cycles with minimal manual intervention.

How It Works

The system is built around several agents (planner, experiment-worker, reviewer, etc.) orchestrated via OpenCode. The agents interpret instructions from AGENTS.md to execute research tasks. Experiments run through train.py, benchmark setup goes through prepare.py, and all runs and results are recorded in a local ledger (research/results.tsv). The workflow relies on Hugging Face infrastructure for GPU job execution and data/cache storage, and it automates experiment proposal, validation, execution, and result recording.
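To make the local ledger idea concrete, here is a minimal Python sketch of appending runs to a tab-separated file and reading them back. The column names (run_id, hypothesis, val_bpb, status) and the helper names append_run and read_runs are assumptions made for illustration; the actual layout of research/results.tsv may differ.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the real research/results.tsv may use other fields.
FIELDS = ["run_id", "hypothesis", "val_bpb", "status"]


def append_run(ledger: Path, row: dict) -> None:
    """Append one run to the TSV ledger, writing the header on first use."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    is_new = not ledger.exists() or ledger.stat().st_size == 0
    with ledger.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, delimiter="\t")
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def read_runs(ledger: Path) -> list[dict]:
    """Return every recorded run as a dict of strings keyed by column name."""
    with ledger.open(newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh, delimiter="\t"))
```

An append-only text ledger like this keeps every run, including discarded ones, so later agents can check what has already been tried.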

Quick Start & Requirements

Install dependencies with uv sync and authenticate with Hugging Face via hf auth login. Hermes setup is optional: set up the Hermes profiles with uv run scripts/setup_hermes_profile.py, then start the autolab agent with autolab chat. Alternatively, start the OpenCode environment with opencode. The system relies on Hugging Face Jobs for running experiments and Hugging Face buckets for storage.

Highlighted Details

  • Implements a fully autonomous research loop, from paper review and experiment proposal to execution and result logging.
  • Leverages a suite of specialized OpenCode agents (autolab, planner, experiment-worker, reviewer, memory-keeper, researcher, reporter) for task delegation and execution.
  • Integrates with Hugging Face Jobs for scalable GPU compute and Hugging Face buckets for persistent data and cache management.
  • Automates experiment promotion based on performance metrics (val_bpb) against the current local master.
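The promotion rule in the last bullet can be sketched as a single comparison. This is an illustrative sketch, not the repository's implementation: it assumes val_bpb (validation bits per byte) is a lower-is-better metric, and the function name should_promote and the min_delta threshold are hypothetical.

```python
import math


def should_promote(candidate_bpb: float, master_bpb: float, min_delta: float = 0.0) -> bool:
    """Return True if the candidate beats the current master on val_bpb.

    Assumes lower val_bpb is better. min_delta is a hypothetical margin that
    the improvement must exceed, to avoid promoting on noise.
    """
    # A diverged run (NaN or inf) should never replace the master.
    if not math.isfinite(candidate_bpb):
        return False
    return candidate_bpb < master_bpb - min_delta
```

For example, should_promote(0.98, 1.00) is True, while a tie such as should_promote(1.00, 1.00) leaves the current master in place.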

Maintenance & Community

The provided README does not contain specific details regarding maintainers, community channels (e.g., Discord, Slack), or a public roadmap.

Licensing & Compatibility

The README does not specify a software license. Therefore, compatibility for commercial use or closed-source linking cannot be determined without further clarification.

Limitations & Caveats

The operating model requires that hypothesis changes touch only train.py and that each run makes exactly one change. Because the system depends on Hugging Face infrastructure, it carries a risk of vendor lock-in and requires active management of Hugging Face accounts and resources. The absence of explicit licensing information is a significant adoption blocker for commercial or sensitive projects.
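The train.py-only rule lends itself to a simple pre-run check. The sketch below is an assumption about how such a guard could look, not something taken from the repository: the names ALLOWED and violations are hypothetical, and the list of changed files is expected to come from elsewhere (for example, the output of git diff --name-only). It covers only the file-scope rule, not the one-change-per-run rule.

```python
from pathlib import PurePosixPath

# Files a hypothesis change may touch, per the operating model described above.
ALLOWED = {"train.py"}


def violations(changed_files: list[str]) -> list[str]:
    """Return the changed paths that fall outside the allowed set, sorted."""
    # Normalize so that "./train.py" and "train.py" compare equal.
    normalized = {PurePosixPath(path).as_posix() for path in changed_files}
    return sorted(normalized - ALLOWED)
```

An empty result means the change set respects the rule; any returned path would be grounds for rejecting the run before it reaches a GPU job.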

Health Check

Last commit: 3 weeks ago
Responsiveness: Inactive
Pull requests (30d): 0
Issues (30d): 1
Star history: 148 stars in the last 30 days
