AutoSurvey by AutoSurveys

Framework for automated literature surveys (NeurIPS 2024 paper)

Created 1 year ago

459 stars

Top 65.9% on SourcePulse

Project Summary

AutoSurvey is an automated framework for generating literature surveys with large language models. It targets researchers and academics who want to speed up the synthesis of existing research on a given topic, and its authors report high citation quality and content quality for the generated surveys.

How It Works

AutoSurvey uses LLMs to automate survey creation through a structured process. It follows a Retrieval-Augmented Generation (RAG) approach, drawing on a large database of arXiv paper abstracts to ground the generated text. Parameters control the survey length, the section structure, and the number of references used for outline generation and for RAG, so the output can be tailored to the topic.
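To make the retrieval step concrete, here is a minimal sketch of RAG-style retrieval over paper abstracts. It is not AutoSurvey's implementation: the toy corpus, the bag-of-words cosine similarity (standing in for a real embedding model), and the names `retrieve`, `build_prompt`, and `top_k` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for the arXiv abstract database.
ABSTRACTS = {
    "paper_a": "Retrieval augmented generation grounds language model output in retrieved documents.",
    "paper_b": "Convolutional networks for image classification on large datasets.",
    "paper_c": "Large language model agents that retrieve documents to write surveys.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, abstracts, top_k):
    """Return up to top_k paper ids ranked by similarity to the query.

    Papers with zero similarity are dropped. A real system would compare
    dense embeddings instead of word counts (assumption for this sketch).
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), pid) for pid, text in abstracts.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for score, pid in scored[:top_k] if score > 0]


def build_prompt(section_title, query, abstracts, top_k):
    """Assemble an LLM prompt that restricts citations to the retrieved abstracts."""
    ids = retrieve(query, abstracts, top_k)
    refs = "\n".join(f"[{pid}] {abstracts[pid]}" for pid in ids)
    return f"Write the section '{section_title}' citing only these references:\n{refs}"
```

In this sketch, `top_k` plays the role of the "number of references" parameter: a larger value gives the model more context per section at the cost of a longer prompt.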

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites: Python 3.10.x, a database of arXiv paper abstracts (provided via a OneDrive link), an OpenAI API key, and a GPU.
Setup: Clone the repository, install the dependencies, then download and unzip the database.
Docs: NeurIPS 2024 Paper
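Before running the pipeline, it can help to verify the prerequisites listed above. The following is a small sketch of such a check, not part of the project: the function name `check_prerequisites`, the `db_dir` argument, and the use of the `OPENAI_API_KEY` environment variable are assumptions (the repository may expect the key and database path to be passed differently), and GPU availability is not checked.

```python
import os
import sys
from pathlib import Path


def check_prerequisites(db_dir, env=None, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed.

    db_dir  : path where the unzipped abstract database is expected (hypothetical).
    env     : mapping of environment variables, defaults to os.environ.
    version : (major, minor) tuple, defaults to the running interpreter.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)

    problems = []
    # The listing states Python 3.10.x as a prerequisite.
    if version != (3, 10):
        problems.append(f"Python 3.10.x expected, found {version[0]}.{version[1]}")
    # Assumption: the API key is supplied through this environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # The database must be downloaded and unzipped before use.
    if not Path(db_dir).is_dir():
        problems.append(f"database directory not found: {db_dir}")
    return problems
```

Calling `check_prerequisites("./database")` with the defaults inspects the current interpreter and environment; the `./database` path is only a placeholder.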

Highlighted Details

Reports high citation and content quality across survey lengths of 8k, 16k, 32k, and 64k tokens.
Supports multiple LLMs, including gpt-4o-2024-05-13.
Uses nomic-ai/nomic-embed-text-v1 for embedding.
Includes an evaluation script to assess generated surveys.

Maintenance & Community

The authors are affiliated with Westlake University, Peking University, Nanjing University, Harbin Institute of Technology (Shenzhen), and Squirrel AI. Contributions are welcome via GitHub issues.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The framework requires an OpenAI API key and relies on a specific database of arXiv abstracts, which may not cover all research domains. The quality of a generated survey depends on the chosen LLM and on the quality of the underlying data.

