LLM agent for academic paper search
PaSa is an LLM-powered agent designed for comprehensive academic paper search, targeting researchers and academics. It automates complex scholarly queries by autonomously invoking search tools, reading papers, and selecting relevant references, aiming to deliver accurate and thorough results.
How It Works
PaSa employs a two-agent architecture: a Crawler and a Selector. The Crawler interacts with search tools, expands citations, and manages a paper queue. The Selector evaluates papers in the queue based on user query criteria, assigning relevance scores. This modular design allows for specialized optimization of search and evaluation tasks.
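The Crawler and Selector split described above can be illustrated with a minimal sketch. This is not PaSa's implementation: the toy corpus, the function names (search_tool, expand_citations, score_relevance, run_agent), the word-overlap scoring, and the threshold are all hypothetical stand-ins for the LLM-driven components, chosen only to show how a paper queue might connect search, citation expansion, and relevance scoring.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical toy corpus: paper id -> (title, ids of papers it cites).
CORPUS: Dict[str, Tuple[str, List[str]]] = {
    "p1": ("Retrieval agents for paper search", ["p2", "p3"]),
    "p2": ("Citation graph expansion methods", ["p4"]),
    "p3": ("Image segmentation with transformers", []),
    "p4": ("Paper search with reinforcement learning", []),
}


def search_tool(query: str) -> List[str]:
    """Stand-in for a search tool: ids whose title shares a word with the query."""
    words = set(query.lower().split())
    return [pid for pid, (title, _) in CORPUS.items()
            if words & set(title.lower().split())]


def expand_citations(paper_id: str) -> List[str]:
    """Stand-in for citation expansion: ids of the papers this paper cites."""
    return list(CORPUS[paper_id][1])


def score_relevance(query: str, paper_id: str) -> float:
    """Stand-in for the Selector: fraction of query words found in the title."""
    words = set(query.lower().split())
    if not words:
        return 0.0
    title_words = set(CORPUS[paper_id][0].lower().split())
    return len(words & title_words) / len(words)


def run_agent(query: str, threshold: float = 0.5, max_papers: int = 10) -> Dict[str, float]:
    """Crawler loop: seed the queue from search, score each paper, expand citations."""
    queue = deque(search_tool(query))
    seen = set(queue)          # avoid queueing the same paper twice
    selected: Dict[str, float] = {}
    processed = 0
    while queue and processed < max_papers:
        paper_id = queue.popleft()
        processed += 1
        score = score_relevance(query, paper_id)   # Selector step
        if score >= threshold:
            selected[paper_id] = score
        for cited in expand_citations(paper_id):   # Crawler step
            if cited not in seen:
                seen.add(cited)
                queue.append(cited)
    return selected


if __name__ == "__main__":
    print(run_agent("paper search"))  # {'p1': 1.0, 'p4': 1.0}
```

Keeping the scoring function separate from the queue logic mirrors the modularity the summary describes: either side could be swapped for a stronger model without touching the other.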
Quick Start & Requirements
Installation: get the transformers and trl repositories, install them in editable mode (pip install -e .), and then install project dependencies (pip install -r requirements.txt).
Models: pasa-7b-crawler and pasa-7b-selector
Run: python run_paper_agent.py
Training: uses the trl and transformers codebases. Detailed SFT and PPO training scripts are provided.
Highlighted Details
Maintenance & Community
The project is associated with ByteDance. Citation details are available in BibTeX format. Links to community channels (Discord/Slack) or roadmaps are not explicitly provided in the README.
Licensing & Compatibility
The project's license is not explicitly stated in the README. The code modifications to trl and transformers suggest potential licensing implications from those libraries. Compatibility for commercial use or closed-source linking is not detailed.
Limitations & Caveats
The system requires a Google Search API key, which may incur costs. Training custom agents involves significant computational resources and requires familiarity with accelerate and deepspeed. The project's reliance on external APIs and specific library versions might impact long-term stability.