FLARE by jzbjyb

Research paper implementation for active retrieval-augmented generation

Created 2 years ago

662 stars

Top 50.9% on SourcePulse

4 Experts Love This Project

Malte Pietsch

Cofounder of deepset

Jeff Hammerbacher

Cofounder of Cloudera

Travis Fischer

Founder of Agentic

Edward Sun

Research Scientist at Meta Superintelligence Lab

Project Summary

FLARE is a retrieval-augmented generation (RAG) framework that decides during generation when to retrieve and what to retrieve, based on a prediction of the content that comes next. It is aimed at researchers and developers who work with large language models and need to ground long-form output in external knowledge. Its core idea is proactive retrieval: documents are fetched for the sentence the model is about to write rather than only for the original prompt.

How It Works

FLARE takes a forward-looking approach: at each step it first generates a temporary version of the next sentence. If that sentence contains low-confidence tokens, FLARE uses it as a query to retrieve relevant documents and then regenerates the sentence with the retrieved documents in context. If all tokens are confident, the sentence is kept as is and no retrieval happens. Compared with retrieving once up front, or at fixed intervals using only the text generated so far, this ties retrieval to the content the model is about to produce and skips it when the model is already confident.
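The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: `flare_generate`, `generate_sentence`, `retrieve`, the `threshold` default, and the `max_sentences` cap are all hypothetical names and values chosen for the example, and the language model and retriever are passed in as plain callables.

```python
from typing import Callable, List, Tuple

# A generated sentence plus one probability per token (hypothetical shape).
Sentence = Tuple[str, List[float]]


def flare_generate(
    question: str,
    generate_sentence: Callable[[str, List[str]], Sentence],
    retrieve: Callable[[str], List[str]],
    threshold: float = 0.4,   # arbitrary example value, not taken from the repo
    max_sentences: int = 8,   # safety cap for the sketch
) -> str:
    """Generate sentence by sentence, retrieving only when confidence is low."""
    answer: List[str] = []
    for _ in range(max_sentences):
        prefix = " ".join([question] + answer)

        # Step 1: draft the next sentence without any new evidence.
        draft, probs = generate_sentence(prefix, [])
        if not draft:
            break  # the model has nothing more to say

        # Step 2: a low-confidence token turns the draft into a search query.
        if probs and min(probs) < threshold:
            docs = retrieve(draft)
            # Step 3: regenerate the same sentence with the documents in context.
            draft, _ = generate_sentence(prefix, docs)
            if not draft:
                break

        answer.append(draft)
    return " ".join(answer)
```

Passing the model and retriever as callables keeps the control flow visible and makes the loop easy to exercise with stubs before any paid API is involved.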

Quick Start & Requirements

Install: Create a conda environment, then run setup.sh.
Prerequisites:
- Wikipedia dump (psgs_w100.tsv.gz) from DPR repository.
- Elasticsearch 7.17.9 for indexing the Wikipedia dump.
- Bing Search API key (for WikiASP dataset experiments).
- OpenAI API keys (placed in keys.sh).
Setup: Download the Wikipedia dump, index it in Elasticsearch, and add the API keys before running experiments. Each example triggers multiple OpenAI API calls, which can be costly.
Links: DPR Repository, Bing Web Search API.
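To give a feel for the indexing step, here is a hedged sketch that turns rows of the passage file into index actions. It assumes the DPR layout of a header row followed by tab-separated `id`, `text`, `title` columns. The function name `passages_to_actions` and the index name `wikipedia_dpr` are made up for this example, and the repository's own indexing script may work differently. The dictionaries follow the `_index` / `_id` / `_source` shape used for bulk indexing with the Elasticsearch Python client; actually sending them is left out.

```python
import csv
from typing import Dict, Iterable, Iterator


def passages_to_actions(
    lines: Iterable[str],
    index: str = "wikipedia_dpr",  # hypothetical index name
) -> Iterator[Dict]:
    """Yield one bulk-index action per well-formed passage row."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the assumed header row: id, text, title
    for row in reader:
        if len(row) != 3:
            continue  # ignore malformed rows instead of failing the whole run
        passage_id, text, title = row
        yield {
            "_index": index,
            "_id": passage_id,
            "_source": {"title": title, "text": text},
        }
```

For the real file, the lines could come from `gzip.open("psgs_w100.tsv.gz", "rt", encoding="utf-8", newline="")`, so the dump is streamed rather than loaded into memory.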

Highlighted Details

Implements "Forward-Looking Active Retrieval Augmented Generation" as described in the paper "Active Retrieval Augmented Generation."
Supports multiple datasets, including 2WikiMultihopQA and WikiASP.
Allows configuration via JSON files (e.g., configs/2wikihop_flare_config.json).
Includes a debug mode for step-by-step walkthroughs of the retrieval and generation process.

Maintenance & Community

The paper's authors are Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, and Graham Neubig. The README gives no further details on community channels or maintenance.

Licensing & Compatibility

The README does not explicitly state a license. Given the association with academic research and the nature of the dependencies (OpenAI API, Elasticsearch), users should verify licensing for commercial use.

Limitations & Caveats

Experiments are described as "relatively expensive" due to repeated OpenAI API calls. The setup involves several external services (Elasticsearch, Bing API) and data downloads, increasing the complexity of initial deployment.
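A rough budget check before launching a run can make the cost concern concrete. The helper below is only back-of-the-envelope arithmetic: `estimate_cost` and all four parameters are hypothetical, and the numbers in the usage line are placeholders rather than measurements from the project or actual API prices.

```python
def estimate_cost(
    n_examples: int,
    calls_per_example: float,
    tokens_per_call: float,
    usd_per_1k_tokens: float,
) -> float:
    """Return an approximate API bill in USD for one experiment run."""
    total_tokens = n_examples * calls_per_example * tokens_per_call
    return total_tokens / 1000 * usd_per_1k_tokens


# Placeholder inputs: 500 examples, 4 calls each, 1,000 tokens per call.
print(f"${estimate_cost(500, 4, 1000, 0.02):.2f}")
```

Because FLARE may draft and then regenerate each sentence, the number of calls per example grows with answer length, so it is worth measuring that figure on a handful of examples before scaling up.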

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

3 stars in the last 30 days