Auto-claude-code-research-in-sleep by wanshuiyin

AI-driven autonomous ML research framework

Created 1 month ago
6,066 stars

Top 8.3% on SourcePulse

Project Summary

ARIS ⚔️ (Auto-Research-In-Sleep) is an autonomous framework for machine learning research that automates complex workflows: literature review, idea generation, experiment execution, and paper writing. Aimed at researchers and power users, it runs these processes overnight to accelerate research cycles and improve output quality, notably the refinement of draft papers.

How It Works

ARIS orchestrates cross-model collaboration, leveraging Claude Code for task execution (writing code, running experiments) and an external LLM, typically GPT-5.4 via Codex MCP, for critical review. This adversarial approach, contrasting with single-model self-play, actively probes for weaknesses, leading to more rigorous outcomes. The system capitalizes on complementary strengths: Claude Code's speed and fluidity paired with Codex's deliberate and rigorous critique, creating a robust feedback loop for research refinement.
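The executor/reviewer feedback loop described above can be sketched as follows. This is a minimal illustration, not ARIS's actual implementation: the model calls are stubbed out, and the function names, scoring heuristic, and round budget are all hypothetical.

```python
# Hypothetical sketch of a cross-model review loop: an "executor" model
# drafts a revision, a separate "reviewer" model critiques and scores it,
# and the loop repeats until the score reaches a target or the round
# budget is exhausted. Real model calls are replaced with toy stubs.

def executor_revise(draft: str, critique: str) -> str:
    """Stand-in for the executor model (e.g. Claude Code) applying a critique."""
    return draft + " [revised per: " + critique + "]"

def reviewer_score(draft: str) -> tuple[float, str]:
    """Stand-in for the reviewer model (e.g. Codex via MCP): score plus critique."""
    # Toy heuristic: each applied revision raises the score a little.
    score = min(5.0 + 0.7 * draft.count("[revised"), 7.5)
    critique = "" if score >= 7.5 else "tighten the experimental section"
    return score, critique

def review_loop(draft: str, rounds: int = 4, target: float = 7.5) -> tuple[str, float]:
    score, critique = reviewer_score(draft)
    for _ in range(rounds):
        if score >= target or not critique:
            break
        draft = executor_revise(draft, critique)
        score, critique = reviewer_score(draft)
    return draft, score

final_draft, final_score = review_loop("initial paper draft")
print(final_score)  # → 7.5
```

With the toy scoring stub, four rounds carry the draft from 5.0 to the 7.5 target, mirroring the overnight score improvement the project reports; in practice the score and critique come from the reviewer model itself.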

Quick Start & Requirements

Installation involves cloning the repository and copying skills to ~/.claude/skills/. Key prerequisites include having Claude Code installed and the Codex CLI configured as an MCP server (npm install -g @openai/codex, claude mcp add codex -s user -- codex mcp-server). Depending on the chosen model combination, API keys for services like OpenAI or others may be necessary. A detailed setup guide is provided.
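The setup steps above can be collected into one sequence. This is a sketch based on the summary: the repository URL is assumed from the project and author names, and the skills directory layout inside the repo may differ.

```shell
# Clone the repository (URL assumed from the project name; verify before use).
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git
cd Auto-claude-code-research-in-sleep

# Copy the bundled skills into Claude Code's skills directory.
mkdir -p ~/.claude/skills
cp -r skills/* ~/.claude/skills/

# Install the Codex CLI and register it as an MCP server for Claude Code.
npm install -g @openai/codex
claude mcp add codex -s user -- codex mcp-server
```

Depending on the executor/reviewer combination you choose, the relevant API keys (e.g. OpenAI's) must also be configured in your environment.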

Highlighted Details

  • Automated Review Loop: Features a 4-round autonomous review process that demonstrably improves research paper scores overnight (e.g., 5.0/10 to 7.5/10).
  • Idea Discovery Pipeline: Automates literature surveying, idea brainstorming, novelty checks, and GPU-based pilot experiments.
  • Cross-Model Collaboration: Supports flexible combinations like Claude + GPT-5.4 (default), GLM + GPT, or GLM + MiniMax, reducing API dependency.
  • GPU Deployment: Enables automated experiment deployment, multi-GPU execution, and live monitoring.
  • Composable Skills: Offers 15 modular skills for comprehensive research pipelines, from idea generation to paper submission.
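The "composable skills" idea can be illustrated with a small sketch, assuming each skill is a function over a shared research state and a pipeline is an ordered composition of skills. The skill names and state keys here are hypothetical, not ARIS's actual API.

```python
# Hypothetical sketch of composable skills: each skill takes and returns
# a shared state dict, and a pipeline is a left-to-right composition.
from functools import reduce

def survey_literature(state: dict) -> dict:
    return {**state, "papers": ["related work A", "related work B"]}

def brainstorm_ideas(state: dict) -> dict:
    return {**state, "idea": f"extend {state['papers'][0]}"}

def run_pilot_experiment(state: dict) -> dict:
    return {**state, "result": f"pilot for '{state['idea']}' succeeded"}

def compose(*skills):
    """Chain skills into a single pipeline callable."""
    return lambda state: reduce(lambda s, skill: skill(s), skills, state)

pipeline = compose(survey_literature, brainstorm_ideas, run_pilot_experiment)
state = pipeline({"topic": "efficient fine-tuning"})
print(state["result"])  # → pilot for 'extend related work A' succeeded
```

Because each skill only reads and extends the state, skills can be reordered, dropped, or swapped to build different pipelines, from idea generation through paper submission.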

Maintenance & Community

Active development includes planned integrations for Feishu/Lark, W&B, Zotero, and Obsidian. A WeChat group facilitates community discussion on AI-driven research workflows.

Licensing & Compatibility

Released under the MIT license, it is permissive for commercial use and integration. The architecture supports various executor and reviewer model combinations, enhancing flexibility.

Limitations & Caveats

The system accelerates research but requires human critical oversight for final decisions. Auto-generated figures are limited to data plots; complex diagrams need manual creation. Alternative model configurations may require prompt tuning. Automated experiments necessitate GPU server setup.

Health Check
Last Commit

23 hours ago

Responsiveness

Inactive

Pull Requests (30d)
71
Issues (30d)
54
Star History
5,898 stars in the last 30 days
