AI-Scientist-v2  by SakanaAI

Agentic system for automated scientific discovery

created 3 months ago
1,468 stars

Top 28.5% on sourcepulse

Project Summary

The AI Scientist-v2 is an end-to-end agentic system designed for autonomous scientific discovery, capable of generating hypotheses, conducting experiments, analyzing data, and writing scientific manuscripts. It targets researchers and developers interested in AI-driven scientific exploration, offering a generalized approach that removes reliance on human-authored templates and supports multiple LLM backends.

How It Works

The system employs a progressive agentic tree search, guided by an experiment manager agent. This approach allows for autonomous exploration of research avenues, hypothesis generation, and experimental design. Unlike its predecessor, v2 is generalized across ML domains and does not rely on fixed templates, enabling more open-ended scientific inquiry. The core advantage lies in its ability to autonomously navigate the scientific process, from ideation to manuscript generation, through a structured search mechanism.
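To make the idea of a guided tree search over research directions concrete, here is a minimal best-first sketch. It is not the project's implementation: the `Node` class, the `tree_search` function, and the `propose` and `evaluate` callbacks are hypothetical names chosen for illustration. In a real system, an LLM would play the `propose` role and measured experiment results would drive `evaluate`.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class Node:
    # heapq is a min-heap, so the score is stored negated: the most
    # promising node (highest score) is popped first.
    neg_score: float
    plan: str = field(compare=False)
    depth: int = field(compare=False, default=0)
    parent: Optional["Node"] = field(compare=False, default=None)


def tree_search(
    root_plan: str,
    propose: Callable[[str], List[str]],   # hypothetical: suggests follow-up plans
    evaluate: Callable[[str], float],      # hypothetical: scores a plan, higher is better
    max_expansions: int = 10,
    branching: int = 2,
) -> Node:
    """Best-first search: repeatedly expand the highest-scoring open node."""
    root = Node(-evaluate(root_plan), root_plan)
    frontier = [root]
    best = root
    for _ in range(max_expansions):
        if not frontier:
            break
        node = heapq.heappop(frontier)
        # Keep at most `branching` children per expansion.
        for child_plan in propose(node.plan)[:branching]:
            child = Node(-evaluate(child_plan), child_plan, node.depth + 1, node)
            if child.neg_score < best.neg_score:
                best = child
            heapq.heappush(frontier, child)
    return best
```

With toy callbacks, for example `propose=lambda p: [p + "a", p + "b"]` and `evaluate=lambda p: p.count("a")`, the search follows the "a" branch at every step. Following the `parent` links from the returned node recovers the path that led to the best plan.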

Quick Start & Requirements

  • Installation: Requires a Conda environment with Python 3.11, PyTorch with CUDA 12.4 support, and PDF/LaTeX tools (Poppler, chktex). Install dependencies via pip install -r requirements.txt.
  • Models: Supports OpenAI, Gemini (via OpenAI API), and Claude models (via AWS Bedrock). Requires API keys for the chosen models (e.g., OPENAI_API_KEY, GEMINI_API_KEY, AWS credentials for Bedrock). A Semantic Scholar API key (S2_API_KEY) is optional and enables enhanced literature search.
  • Setup: Setup consists of creating the environment and installing the dependencies. Running experiments requires significant GPU resources, and the experimentation phase costs around $15-20 per run with Claude 3.5 Sonnet.
  • Resources: Paper, Blog Post, ICLR2025 Workshop Experiment
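Before launching a run, it can help to check that the credentials for the chosen backend are actually set. The following is a small sketch, not part of the project. `OPENAI_API_KEY`, `GEMINI_API_KEY`, and `S2_API_KEY` are the names listed above. The backend labels, the helper names, and the use of the standard AWS variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are assumptions for illustration, and AWS credentials may also come from a profile or role rather than environment variables.

```python
import os
from typing import Dict, List, Mapping

# Backend label -> environment variables it needs.
# The backend labels are hypothetical; the AWS variable names are the standard
# AWS SDK ones and are assumed here, not taken from the project's README.
REQUIRED_KEYS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "claude-bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
}
OPTIONAL_KEYS: List[str] = ["S2_API_KEY"]  # optional Semantic Scholar key


def missing_keys(backend: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables for `backend` that are unset or empty."""
    return [name for name in REQUIRED_KEYS[backend] if not env.get(name)]


def missing_optional(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return optional variables that are unset; a run can proceed without them."""
    return [name for name in OPTIONAL_KEYS if not env.get(name)]
```

Calling `missing_keys("openai")` returns an empty list when the key is present, so a launcher script could refuse to start when the list is non-empty.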

Highlighted Details

  • Generated the first workshop paper written entirely by AI, accepted through peer review.
  • Autonomous hypothesis generation, experiment execution, data analysis, and manuscript writing.
  • Agentic tree search for guided scientific exploration.
  • Generalizes across ML domains without human-authored templates.

Maintenance & Community

  • Built on top of the AIDE project.
  • No explicit community links (Discord/Slack) or roadmap provided in the README.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The system executes LLM-generated code, which can pull in dangerous packages or spawn unintended processes; running it in a sandbox is strongly recommended. Because of its exploratory nature, v2 produces papers at a lower success rate than v1, which benefits from strong starting templates. CUDA out-of-memory errors can occur if GPU memory is insufficient.
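As a rough illustration of the containment concern, the sketch below runs a generated script in a separate Python process, inside a throwaway directory, with a timeout. The function name `run_untrusted` and its return format are hypothetical. This is not a real sandbox: it does not restrict network access, filesystem access, or package installation, so a container or VM is still the appropriate boundary.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, Union


def run_untrusted(code: str, timeout_s: float = 10.0) -> Dict[str, Union[bool, str]]:
    """Run `code` in a child interpreter inside a temporary working directory.

    Only guards against runaway runtime and stray files in the current
    directory; it does NOT isolate the network or the wider filesystem.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                # -I: isolated mode, ignores PYTHON* variables and the user site dir.
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return {"ok": False, "reason": "timeout", "stdout": "", "stderr": ""}
        return {
            "ok": proc.returncode == 0,
            "reason": "exit",
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }
```

A caller can inspect `reason` to distinguish a script that timed out from one that exited with an error.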

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 2
  • Issues (30d): 2
  • Star history: 466 stars in the last 90 days
