PaperForge by QJHWC

Automated research pipeline for academic paper generation

Created 4 months ago

588 stars

Top 54.6% on SourcePulse

Project Summary

Summary

PaperForge is an end-to-end, AI-powered system that automates the academic paper writing process. Aimed at researchers and students, it covers idea generation, literature search, experiment execution, result backfilling, and LaTeX compilation, with the goal of speeding up research output and reducing manual effort.

How It Works

The system runs a single agent loop that connects idea generation, experimental coding, cloud training, and LaTeX paper writing. It supports multiple LLM backends (Anthropic, OpenAI, Gemini, DeepSeek) and offers two primary workflows: 'Scientist' for fully automated end-to-end generation and 'MVP' for a staged, iterative approach. Key components include LLM clients, workflow orchestrators, and MCP (Model Context Protocol) tools for literature search, LaTeX compilation, and diagram generation.
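As an illustration of what choosing among several LLM backends can look like, here is a minimal sketch. It is not PaperForge's actual routing code: the Backend class and pick_backend function are hypothetical, and the GEMINI_API_KEY and DEEPSEEK_API_KEY variable names are assumptions (only ANTHROPIC_API_KEY and OPENAI_API_KEY are named in this summary).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical description of one LLM backend."""
    name: str
    env_var: str  # environment variable assumed to hold the API key


# Order expresses fallback priority. The last two variable names are assumptions.
BACKENDS = [
    Backend("anthropic", "ANTHROPIC_API_KEY"),
    Backend("openai", "OPENAI_API_KEY"),
    Backend("gemini", "GEMINI_API_KEY"),      # assumed name
    Backend("deepseek", "DEEPSEEK_API_KEY"),  # assumed name
]


def pick_backend(preferred=None, env=None):
    """Return the preferred backend if its key is set, else the first usable one."""
    env = os.environ if env is None else env
    available = [b for b in BACKENDS if env.get(b.env_var)]
    if not available:
        raise RuntimeError("no LLM backend has an API key configured")
    for backend in available:
        if backend.name == preferred:
            return backend
    # Preferred backend missing or not requested: fall back by priority order.
    return available[0]
```

Passing the environment as a parameter keeps the selection logic easy to test without touching real credentials.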

Quick Start & Requirements

Installation: Requires Python 3.11. Setup involves creating a virtual environment (python3.11 -m venv .venv311), activating it (source .venv311/bin/activate), and installing dependencies (pip install -r requirements.txt).
Configuration: Credentials for the LLM backends (e.g. Anthropic, OpenAI) and the literature services (OpenAlex, Semantic Scholar) are configured by copying key.example.sh to key.sh, editing it, and sourcing it.
Execution:
- MVP workflow: python launch_mvp_workflow.py --phase all --experiment paper_writer --idea-name "My Research Idea" --engine openalex
- Scientist workflow: python launch_scientist.py --experiment paper_writer --num-ideas 1 --skip-novelty-check
Prerequisites: Python 3.11, API keys for LLMs and literature services. Environment variables like ANTHROPIC_API_KEY, OPENAI_API_KEY, OPENALEX_MAIL_ADDRESS are crucial. MPLBACKEND=Agg is recommended for headless Matplotlib on macOS.
Docs/Demo: The README provides no links to quick-start or demo pages; the console commands are the main way to inspect the system.
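The setup steps above can be wrapped in a small preflight helper. The sketch below checks the environment variables named in the prerequisites and assembles the MVP workflow command exactly as listed. The helper names are hypothetical, and treating all three variables as mandatory is an assumption; the README summary only calls them crucial.

```python
import sys

# Variables named in the prerequisites; treating all as required is an assumption.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENALEX_MAIL_ADDRESS")


def missing_vars(env):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def prepare_env(env):
    """Copy the environment and default MPLBACKEND to Agg for headless plotting."""
    child = dict(env)
    child.setdefault("MPLBACKEND", "Agg")
    return child


def build_mvp_command(idea_name, engine="openalex", python=sys.executable):
    """Assemble the MVP workflow command line using the flags shown above."""
    return [
        python, "launch_mvp_workflow.py",
        "--phase", "all",
        "--experiment", "paper_writer",
        "--idea-name", idea_name,
        "--engine", engine,
    ]
```

A caller could first report missing_vars(os.environ), then hand the command to subprocess.run(cmd, env=prepare_env(os.environ), check=True) from the repository root with the virtual environment active.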

Highlighted Details

Supports multi-LLM routing (Anthropic, OpenAI, Gemini, DeepSeek).
Features SSH remote training for cloud-based experiment execution.
Includes incremental sync capabilities.
Offers a writing style intended to evade AI-text detection, with dedicated skills for removing AI-sounding phrasing and for citation management.
Integrates tools for literature search (OpenAlex, Semantic Scholar), LaTeX compilation, and diagram generation.
Provides a frontend console (frontend.console) that displays workspace status and prompts and refreshes automatically.

Maintenance & Community

No specific details regarding contributors, sponsorships, community channels (like Discord/Slack), or roadmaps are present in the provided README text.

Licensing & Compatibility

License: No standard OSI-approved license is stated; the terms are custom and heavily restrict usage.
Restrictions: Strictly prohibited for commercial use. Permitted only for personal, academic research, and non-profit educational purposes. Use for surveillance, deceptive media, or unauthorized medical/criminal prediction is forbidden.
Compatibility: Requires explicit declaration of AI assistance for any generated papers.

Limitations & Caveats

The primary limitation is the strict non-commercial usage clause, restricting adoption to academic and personal research contexts. Generated papers must be clearly marked as AI-assisted, which may have implications for publication venues or academic integrity policies. The system's reliance on multiple external APIs also introduces potential points of failure or rate-limiting issues.
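One common way to soften transient API failures and rate limits is retrying with exponential backoff. The sketch below is a generic pattern, not something the README describes PaperForge doing; call_with_backoff and its parameters are hypothetical names.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on failure with exponentially growing, jittered delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

In practice retry_on should be narrowed to the client library's timeout and rate-limit exceptions so that genuine errors such as an invalid API key fail immediately.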

Health Check

Last Commit

3 months ago

Responsiveness

Inactive

Star History

22 stars in the last 30 days