codescientist  by allenai

Automated system for code-based scientific discovery

Created 6 months ago
289 stars

Project Summary

CodeScientist is an end-to-end system for automating scientific discovery through code-based experiments. It targets researchers and engineers who want to automate the design, execution, and analysis of experiments. The system uses LLMs to generate novel hypotheses and implements them through an automated experiment builder, with the aim of reducing the manual effort spent on experimental setup and iteration.

How It Works

CodeScientist employs a "genetic mutation" approach, using LLMs to mutate combinations of scientific articles and code examples to generate novel experiment ideas. These ideas are then realized by an "Experiment Builder" that automatically creates, runs, and debugs the experiment code within containers. The system supports both human-in-the-loop and fully-automatic modes, generating reports and meta-analyses of experimental outcomes.
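The create, run, and debug cycle described above can be sketched roughly as follows. This is an illustrative outline only, not CodeScientist's actual implementation: `build_experiment`, `generate_code`, `run_in_container`, `repair_code`, `RunResult`, and the `max_debug_rounds` limit are all hypothetical names standing in for the real LLM and container calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    """Outcome of one container run (hypothetical structure)."""
    ok: bool
    log: str


def build_experiment(
    idea: str,
    generate_code: Callable[[str], str],
    run_in_container: Callable[[str], RunResult],
    repair_code: Callable[[str, str], str],
    max_debug_rounds: int = 3,
) -> tuple[str, RunResult, int]:
    """Generate code for an idea, run it, and request repairs on failure.

    Returns the final code, the last run result, and the number of
    debug rounds that were used.
    """
    code = generate_code(idea)
    result = run_in_container(code)
    rounds = 0
    # Keep asking for a fix until the run succeeds or the budget runs out.
    while not result.ok and rounds < max_debug_rounds:
        code = repair_code(code, result.log)
        result = run_in_container(code)
        rounds += 1
    return code, result, rounds


# Toy stand-ins so the sketch runs without an LLM or a container.
def fake_generate(idea: str) -> str:
    return f"# experiment for: {idea}\nraise ValueError('bug')"


def fake_run(code: str) -> RunResult:
    failed = "raise" in code
    return RunResult(ok=not failed, log="ValueError: bug" if failed else "done")


def fake_repair(code: str, log: str) -> str:
    return code.replace("raise ValueError('bug')", "print('result')")


final_code, final_result, used_rounds = build_experiment(
    "does prompt length affect accuracy?", fake_generate, fake_run, fake_repair
)
print(final_result.ok, used_rounds)
```

Passing the three steps in as callables keeps the loop independent of any particular LLM provider or container backend. Capping the number of repair rounds is one simple way to bound cost when generated code keeps failing.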

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create --name codescientist python=3.12), activate it, and install dependencies (pip install -r requirements.txt).
  • Prerequisites:
    • Python 3.12
    • Modal.com account for containerized experiment execution.
    • API keys for LLM providers (OpenAI, Anthropic, etc.) configured in api_keys.donotcommit.json.
    • LaTeX distribution (e.g., texlive-full on Ubuntu) for report generation.
  • Setup: Requires signing up for Modal.com and configuring API keys. Processing the paper corpus (for ideation) takes approximately 40 minutes.
  • Running: Start the backend server (python src/CodeScientistWebServer.py) and the frontend GUI (python src/CodeScientistWebInterface.py).
  • Documentation: Quick Start, Installation and Running, Usage.
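Before starting the servers, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of CodeScientist: the `preflight` function is hypothetical, and it assumes that `api_keys.donotcommit.json` sits in the repository root and that `pdflatex` is the LaTeX binary used for report generation. Neither assumption is confirmed by the summary above.

```python
import shutil
import sys
from pathlib import Path


def preflight(repo_root: str = ".") -> list[str]:
    """Return a list of detected problems; an empty list means all checks passed."""
    problems = []

    # The installation step creates the environment with python=3.12.
    if sys.version_info[:2] != (3, 12):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.12 expected, found {found}")

    # Assumption: the key file is looked up in the repository root.
    if not (Path(repo_root) / "api_keys.donotcommit.json").is_file():
        problems.append("api_keys.donotcommit.json not found")

    # Assumption: report generation needs pdflatex on the PATH.
    if shutil.which("pdflatex") is None:
        problems.append("pdflatex not on PATH (install a LaTeX distribution)")

    return problems


for problem in preflight():
    print("WARNING:", problem)
```

The check only reports problems and never modifies anything, so it is safe to run repeatedly while setting up the environment.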

Highlighted Details

  • Generates experiment ideas by mutating scientific papers and code examples using LLMs.
  • Automatically builds, runs, and debugs experiments in Modal.com containers.
  • Supports human-in-the-loop refinement of ideas and experiment plans.
  • Includes a "Hello World" example and a more complex LLM-based addition problem experiment.
  • Allows adding custom codeblocks to extend the system's capabilities.

Maintenance & Community

The project is from the Allen Institute for AI (AI2). For questions, contact Peter Jansen (peterj@allenai.org). To report issues or bugs, or to request features, open a GitHub issue.

Licensing & Compatibility

  • License: Apache 2.0 License.
  • Compatibility: Designed for Ubuntu 22.04 containers; may work with modifications on macOS and Windows.

Limitations & Caveats

  • LLM-generated code can occasionally fail or require manual intervention to debug.
  • Cost estimates for LLM calls are approximate and not foolproof; users must monitor API key usage and set hard limits.
  • PDF report generation may not work in all browsers (e.g., Firefox).
  • The system relies on Modal.com for container execution, which has associated costs and free tier limitations.
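Because cost estimates are approximate, one defensive pattern is to track estimated spend locally and stop before a self-imposed cap is crossed. The sketch below illustrates that idea under stated assumptions: `CostTracker` and `BudgetExceeded` are hypothetical names, the per-token prices are made-up placeholders, and nothing here reflects how CodeScientist itself computes costs. A local guard like this supplements hard limits set with the API provider and does not replace them.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push estimated spend past the cap."""


class CostTracker:
    """Accumulates estimated LLM spend and enforces a hard cap (hypothetical)."""

    def __init__(self, hard_limit_usd: float) -> None:
        self.hard_limit_usd = hard_limit_usd
        self.spent_usd = 0.0

    def charge(
        self,
        input_tokens: int,
        output_tokens: int,
        usd_per_1k_input: float,
        usd_per_1k_output: float,
    ) -> float:
        """Record the estimated cost of one call, or raise if it exceeds the cap."""
        cost = (
            input_tokens / 1000 * usd_per_1k_input
            + output_tokens / 1000 * usd_per_1k_output
        )
        # Check before recording so a rejected call is never counted.
        if self.spent_usd + cost > self.hard_limit_usd:
            raise BudgetExceeded(
                f"estimated spend {self.spent_usd + cost:.4f} USD "
                f"exceeds limit {self.hard_limit_usd:.2f} USD"
            )
        self.spent_usd += cost
        return cost


tracker = CostTracker(hard_limit_usd=1.00)
# Placeholder prices, not real provider rates.
tracker.charge(input_tokens=2000, output_tokens=500,
               usd_per_1k_input=0.01, usd_per_1k_output=0.03)
print(f"estimated spend so far: {tracker.spent_usd:.3f} USD")
```

Calling `charge` before each LLM request, with token counts estimated up front, lets a long-running experiment fail fast instead of silently overspending.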

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
