annotateai by neuml

CLI tool for automated paper annotation using LLMs

Created 1 year ago

397 stars

Top 72.7% on SourcePulse

Project Summary

This project provides an automated system for annotating research papers using Large Language Models (LLMs). It aims to enhance the reading experience by offering concise topic summaries and highlighting key sections within documents, particularly benefiting researchers and students working with scientific literature.

How It Works

The system reads a PDF and identifies the paper's title and key concepts. It then goes through each page and locates the sections that best exemplify those concepts. For each such section it generates a brief topic summary and attaches it as an annotation, so the reader sees a short LLM-written note next to the passages that matter most.
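The page-by-page loop described above can be sketched in a few lines. This is only an illustration of the control flow, not the project's implementation: the function names (`split_sections`, `score`, `annotate_pages`) are hypothetical, sections are assumed to be separated by blank lines, and simple keyword overlap stands in for the LLM-based matching. The `summarize` argument is a placeholder for whatever LLM call produces the topic text.

```python
import re
from typing import Callable, Dict, List


def split_sections(page: str) -> List[str]:
    """Split a page into candidate sections on blank lines (an assumption)."""
    return [s.strip() for s in re.split(r"\n\s*\n", page) if s.strip()]


def score(section: str, concepts: List[str]) -> int:
    """Count how many concepts occur in the section, case-insensitively.

    Keyword overlap is a stand-in here; the real tool relies on an LLM.
    """
    text = section.lower()
    return sum(1 for concept in concepts if concept.lower() in text)


def annotate_pages(
    pages: List[str],
    concepts: List[str],
    summarize: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Pick the best-matching section on each page and summarize it."""
    annotations = []
    for number, page in enumerate(pages, start=1):
        sections = split_sections(page)
        if not sections:
            continue
        best = max(sections, key=lambda s: score(s, concepts))
        if score(best, concepts) == 0:
            continue  # nothing on this page relates to the concepts
        annotations.append(
            {"page": number, "section": best, "topic": summarize(best)}
        )
    return annotations


# Hypothetical stand-in for an LLM summarizer: keep the first four words.
def stub_summarize(text: str) -> str:
    return " ".join(text.split()[:4])


pages = [
    "Retrieval augmented generation combines a retriever with a generator."
    "\n\nAcknowledgements go here.",
    "Unrelated page about funding.",
]
concepts = ["retrieval", "generator"]
result = annotate_pages(pages, concepts, stub_summarize)
print(result)
```

Passing the summarizer in as a callable keeps the loop independent of any particular model, which mirrors the summary's point that several LLM backends can be used.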

Quick Start & Requirements

Install via pip: pip install annotateai
Python 3.10+ required.
For local LLM execution, autoawq[kernels] or llama-cpp-python may be needed depending on the model and OS.
Supports various LLMs, including API-based models (GPT-4o, Claude 3.5 Sonnet), Ollama endpoints, and Hugging Face GGUF models.
Docker image available: docker run -d --gpus=all -it -p 8501:8501 neuml/annotateai
Official documentation: Introducing AnnotateAI
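Before installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of annotateai: `preflight` is a hypothetical helper, and it assumes the optional packages are importable under their usual module names (`awq` for autoawq, `llama_cpp` for llama-cpp-python).

```python
import sys
from importlib.util import find_spec


def preflight() -> dict:
    """Report whether the interpreter and optional local-LLM packages are present."""
    return {
        # The summary states Python 3.10+ is required.
        "python_ok": sys.version_info >= (3, 10),
        # Optional; only needed for some local models, depending on OS.
        "autoawq": find_spec("awq") is not None,
        "llama_cpp_python": find_spec("llama_cpp") is not None,
    }


report = preflight()
print(report)
```

A missing optional package is not necessarily a problem: API-based models and Ollama endpoints do not need either of them.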

Highlighted Details

Works with any PDF, with optimized performance for medical and scientific papers from sources like arXiv, PubMed, bioRxiv, and medRxiv.
Supports custom keyword input for targeted annotation.
Offers a Dockerized web application for easy deployment.
Integrates with the txtai library, supporting a wide range of LLMs.

Maintenance & Community

The project is maintained by NeuML. Further details on community engagement or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project focuses on PDF documents and may not support other formats. Compatibility and annotation quality can vary with the chosen LLM. The README does not detail any known bugs or deprecation plans.
