AI-Researcher  by NoviScl

Research ideation agent for NLP, based on a Stanford NLP paper

created 1 year ago
335 stars

Top 83.1% on sourcepulse

Project Summary

This repository implements an AI research ideation agent that generates novel, detailed project proposals from research topics described in natural language. It is aimed at researchers and students. In the accompanying Stanford NLP study, its ideas were rated as more novel than those written by human experts.

How It Works

The agent operates as a sequential pipeline of six modules: related paper search, grounded idea generation, idea deduplication, project proposal generation, project proposal ranking, and optional project proposal filtering. LLMs handle relevance scoring, idea generation, and proposal ranking; retrieval augmentation grounds idea generation in relevant papers found via Semantic Scholar. Deduplication compares sentence-transformer embeddings of the ideas by cosine similarity.
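The deduplication step can be illustrated with a minimal sketch. It is not the repository's code: the function names, the greedy keep-first strategy, and the 0.8 threshold are assumptions chosen for illustration, and the toy vectors stand in for embeddings that a sentence-transformer model would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def deduplicate(ideas, embeddings, threshold=0.8):
    """Keep an idea only if it is not too similar to any idea already kept.

    `threshold` is a hypothetical value, not one taken from the project.
    """
    kept_ideas, kept_vecs = [], []
    for idea, vec in zip(ideas, embeddings):
        if all(cosine_similarity(vec, kept) < threshold for kept in kept_vecs):
            kept_ideas.append(idea)
            kept_vecs.append(vec)
    return kept_ideas


# Toy 3-dimensional vectors standing in for real sentence embeddings.
ideas = ["prompt A", "prompt A reworded", "unrelated idea"]
toy_embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
unique_ideas = deduplicate(ideas, toy_embeddings)
print(unique_ideas)
```

In this sketch the second idea is dropped because its vector is nearly parallel to the first; a greedy pass like this keeps whichever near-duplicate appears first in the list.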

Quick Start & Requirements

  • Install: Clone repo, create and activate a conda environment (conda create -n ai-researcher python=3.10, conda activate ai-researcher), then pip install -r requirements.txt.
  • Prerequisites: OpenAI API Key, optionally Semantic Scholar and Anthropic API keys stored in keys.json.
  • Demo Cost: The demo example costs approximately $5.00 in API calls.
  • Docs: Setup, Modules, End-to-End
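As a rough sketch of the key setup described above, the snippet below reads a keys.json file and checks that the required OpenAI key is present while treating the other two as optional. The field names (openai_api_key, semantic_scholar_api_key, anthropic_api_key) and the load_keys helper are hypothetical; the repository's Setup docs define the actual file layout.

```python
import json
import os
import tempfile

# Hypothetical field names; the real keys.json layout may differ.
REQUIRED_KEYS = ("openai_api_key",)
OPTIONAL_KEYS = ("semantic_scholar_api_key", "anthropic_api_key")


def load_keys(path):
    """Load API keys from a JSON file and report which optional ones are absent."""
    with open(path, encoding="utf-8") as f:
        keys = json.load(f)
    missing = [name for name in REQUIRED_KEYS if not keys.get(name)]
    if missing:
        raise KeyError(f"missing required keys: {missing}")
    absent_optional = [name for name in OPTIONAL_KEYS if not keys.get(name)]
    return keys, absent_optional


# Demo against a temporary file so the sketch runs without real credentials.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "keys.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"openai_api_key": "sk-placeholder"}, f)
    demo_keys, demo_absent = load_keys(demo_path)

print("optional keys not set:", demo_absent)
```

Failing early on a missing required key avoids spending part of the demo's API budget before the pipeline hits a module that cannot authenticate.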

Highlighted Details

  • 79 reviewers rated ideas generated by the agent as more novel than those written by human experts.
  • Proposals are detailed enough for direct execution by students.
  • Modules can be run independently as standalone research assistance tools.
  • Includes scripts for statistical tests and released review scores.

Maintenance & Community

  • Developed by Stanford NLP researchers.
  • Contact: clsi@stanford.edu or open an issue.
  • Citation: Si et al., "Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers" (arXiv, 2024).

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Requires an OpenAI API key; Semantic Scholar and Anthropic API keys are optional. API usage incurs costs and is subject to each provider's usage restrictions.

Limitations & Caveats

The full set of AI-generated project proposals is not released to avoid bias in ongoing studies. The novelty check in the filtering module is noted as "rather expensive."

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 29 stars in the last 90 days
