codescientist by allenai

Automated system for code-based scientific discovery

created 4 months ago
287 stars

Top 92.3% on sourcepulse

View on GitHub
Project Summary

CodeScientist is an end-to-end system for automating scientific discovery through code-based experiments. It targets researchers and engineers who want to automate the design, execution, and analysis of experiments, leveraging LLMs to generate novel hypotheses and implement them via a robust experiment builder. The system aims to accelerate scientific progress by reducing the manual effort in experimental setup and iteration.

How It Works

CodeScientist employs a "genetic mutation" approach, using LLMs to mutate combinations of scientific articles and code examples to generate novel experiment ideas. These ideas are then realized by an "Experiment Builder" that automatically creates, runs, and debugs the experiment code within containers. The system supports both human-in-the-loop and fully-automatic modes, generating reports and meta-analyses of experimental outcomes.
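
The summary above describes this loop only at a high level, so the following is a minimal, hypothetical Python sketch of the mutate-build-debug cycle. Every name here (call_llm, run_in_container, propose_idea, build_and_run), the prompt wording, and the stubbed behavior are illustrative assumptions and do not reflect CodeScientist's actual code or interfaces.

```python
import random

# Hypothetical stand-ins for an LLM client and a container runner; in the real
# system the LLM calls go to a provider API and execution happens on Modal.com.
def call_llm(prompt: str) -> str:
    """Placeholder: a real implementation would call an LLM provider API."""
    return f"IDEA/CODE derived from a {len(prompt)}-character prompt"

def run_in_container(code: str) -> dict:
    """Placeholder: a real implementation would execute the code in a container."""
    return {"ok": True, "log": "experiment finished"}

def propose_idea(papers: list[str], codeblocks: list[str]) -> str:
    """'Genetic mutation' step: pair a paper with a code example and ask the
    LLM to propose a novel experiment that combines them."""
    paper, block = random.choice(papers), random.choice(codeblocks)
    prompt = (
        "Propose a novel, code-based experiment that combines the idea of the "
        f"paper '{paper}' with the capability of the code example '{block}'."
    )
    return call_llm(prompt)

def build_and_run(idea: str, max_debug_rounds: int = 3) -> dict:
    """Experiment-builder step: generate code for the idea, run it in a
    container, and iteratively ask the LLM to repair failures."""
    code = call_llm(f"Write experiment code for: {idea}")
    for _ in range(max_debug_rounds):
        result = run_in_container(code)
        if result["ok"]:
            return result
        code = call_llm(f"Fix this code given the log {result['log']}:\n{code}")
    return result

if __name__ == "__main__":
    papers = ["paper on chain-of-thought prompting"]
    codeblocks = ["codeblock: simple LLM agent loop"]
    idea = propose_idea(papers, codeblocks)
    print(idea)
    print(build_and_run(idea))
```

In the real system, the container step corresponds to execution on Modal.com, and the repair loop is what the summary calls automatic building and debugging of experiment code.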

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create --name codescientist python=3.12), activate it, and install dependencies (pip install -r requirements.txt).
  • Prerequisites:
    • Python 3.12
    • Modal.com account for containerized experiment execution.
    • API keys for LLM providers (OpenAI, Anthropic, etc.) configured in api_keys.donotcommit.json.
    • LaTeX distribution (e.g., texlive-full on Ubuntu) for report generation.
  • Setup: Requires signing up for Modal.com and configuring API keys. Processing the paper corpus (for ideation) takes approximately 40 minutes.
  • Running: Start the backend server (python src/CodeScientistWebServer.py) and the frontend GUI (python src/CodeScientistWebInterface.py); see the consolidated command sketch after this list.
  • Documentation: Quick Start, Installation and Running, Usage.
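
Assuming conda is installed, the Modal.com account and API keys are already configured, and the repository lives at github.com/allenai/codescientist (URL inferred from the project and organization names, not stated above), the steps in this list amount to roughly the following shell session:

```bash
# Clone and set up the environment (repository URL assumed from project name).
git clone https://github.com/allenai/codescientist.git
cd codescientist
conda create --name codescientist python=3.12
conda activate codescientist
pip install -r requirements.txt

# Add LLM provider keys to api_keys.donotcommit.json before running.

# Start the backend server, then the frontend GUI in a second terminal.
python src/CodeScientistWebServer.py
python src/CodeScientistWebInterface.py
```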

Highlighted Details

  • Generates experiment ideas by mutating scientific papers and code examples using LLMs.
  • Automatically builds, runs, and debugs experiments in Modal.com containers.
  • Supports human-in-the-loop refinement of ideas and experiment plans.
  • Includes a "Hello World" example and a more complex LLM-based addition problem experiment.
  • Allows adding custom codeblocks to extend the system's capabilities.

Maintenance & Community

The project is developed by the Allen Institute for AI (AI2). For questions, contact Peter Jansen (peterj@allenai.org). For issues, bugs, or feature requests, submit a GitHub issue.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Designed for Ubuntu 22.04 containers; may work with modifications on macOS and Windows.

Limitations & Caveats

  • LLM-generated code can occasionally fail or require manual intervention to debug.
  • Cost estimates for LLM calls are approximate and not foolproof; users must monitor API key usage and set hard limits.
  • PDF report generation may not work in all browsers (e.g., Firefox).
  • The system relies on Modal.com for container execution, which has associated costs and free tier limitations.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 48 stars in the last 90 days
