AI-Researcher  by NoviScl

Research ideation agent for NLP, based on a Stanford NLP paper

Created 1 year ago
350 stars

Top 79.5% on SourcePulse

Project Summary

This repository implements an AI research ideation agent that takes a research topic described in natural language and generates novel, detailed project proposals. It is aimed at researchers and students. In a Stanford NLP human study, ideas generated by the agent were rated as more novel than ideas written by human experts.

How It Works

The agent operates as a sequential pipeline of six modules: related paper search, grounded idea generation, idea deduplication, project proposal generation, project proposal ranking, and optional project proposal filtering. It leverages LLMs for relevance scoring, idea generation, and proposal ranking, with retrieval augmentation grounding generation on relevant papers found via Semantic Scholar. Deduplication uses sentence-transformer embeddings and cosine similarity.
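The deduplication step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes the idea embeddings have already been computed (in the real pipeline they would come from a sentence-transformer model), and the function names, the greedy keep-first strategy, and the 0.8 threshold are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def deduplicate(ideas, embeddings, threshold=0.8):
    """Greedy deduplication: keep an idea only if its similarity to every
    previously kept idea is below `threshold` (a hypothetical value)."""
    kept = []  # indices of ideas retained so far
    for i, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return [ideas[i] for i in kept]


# Toy vectors standing in for sentence embeddings (assumed, not real model output).
ideas = ["idea A", "idea A, reworded", "idea B"]
embeddings = [[1.0, 0.0, 0.1], [0.98, 0.05, 0.12], [0.0, 1.0, 0.0]]
unique_ideas = deduplicate(ideas, embeddings)
```

Here the second idea is dropped because its vector is nearly parallel to the first, while the third is orthogonal to it and is kept. A lower threshold removes more ideas and a higher one keeps more near-duplicates.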

Quick Start & Requirements

  • Install: Clone repo, create and activate a conda environment (conda create -n ai-researcher python=3.10, conda activate ai-researcher), then pip install -r requirements.txt.
  • Prerequisites: OpenAI API Key, optionally Semantic Scholar and Anthropic API keys stored in keys.json.
  • Demo Cost: The demo example costs approximately $5.00 in API calls.
  • Docs: Setup, Modules, End-to-End
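The prerequisites above mention a keys.json file holding one required key and two optional ones. The following is a rough sketch of how such a file might be loaded and validated. The field names (openai_key, s2_key, anthropic_key) and the load_keys helper are hypothetical placeholders, so check the repository's Setup docs for the actual schema.

```python
import json
import os
import tempfile

# Hypothetical field names; the real keys.json schema may differ.
REQUIRED_KEYS = ("openai_key",)
OPTIONAL_KEYS = ("s2_key", "anthropic_key")


def load_keys(path):
    """Load API keys from a JSON file, failing early if a required key is absent."""
    with open(path, encoding="utf-8") as f:
        keys = json.load(f)
    missing = [name for name in REQUIRED_KEYS if not keys.get(name)]
    if missing:
        raise ValueError(f"keys.json is missing required entries: {missing}")
    # Optional keys that are not present come back as None.
    return {name: keys.get(name) for name in REQUIRED_KEYS + OPTIONAL_KEYS}


# Demo with a throwaway file and a placeholder value (not a real key).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "keys.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"openai_key": "sk-placeholder"}, f)
    demo_keys = load_keys(demo_path)
```

Validating at load time surfaces a missing key before any paid API call is made.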

Highlighted Details

  • 79 reviewers rated ideas generated by the agent as more novel than those from human experts.
  • Proposals are detailed enough for direct execution by students.
  • Modules can be run independently as standalone research assistance tools.
  • Includes scripts for statistical tests and released review scores.

Maintenance & Community

  • Developed by Stanford NLP researchers.
  • Contact: clsi@stanford.edu or open an issue.
  • Citation: Si et al., "Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers" (arXiv, 2024).

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Requires an OpenAI API key; Semantic Scholar and Anthropic API keys are optional. API usage may incur costs and is subject to each provider's usage restrictions.

Limitations & Caveats

The full set of AI-generated project proposals is not released to avoid bias in ongoing studies. The novelty check in the filtering module is noted as "rather expensive."

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 10 stars in the last 30 days
