Research paper code for in-context retrieval-augmented language models
This repository provides code for reproducing experiments on In-Context Retrieval-Augmented Language Models (RALM) from a TACL paper. It's designed for researchers and practitioners interested in enhancing language models with external knowledge retrieval for improved performance on tasks like language modeling and question answering.
How It Works
The project leverages a retrieval-augmented approach, integrating Pyserini for efficient sparse retrieval (BM25) and the Hugging Face Transformers library for language model inference. It supports various LLMs (GPT-2, GPT-Neo, OPT) and allows for evaluation with or without retrieval, as well as with reranking strategies like logprob and oracle. The core idea is to condition language model generation on relevant retrieved documents, improving accuracy and knowledge grounding.
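The core loop can be illustrated with a minimal, self-contained sketch (not the repository's code, which uses Pyserini and Transformers): score documents with BM25 and prepend the top hits to the model's input so generation is conditioned on them. The function names below are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with BM25 (sparse retrieval)."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n_docs = len(docs)
    df = Counter()  # document frequency of each term
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def build_ralm_prompt(query, docs, top_k=1):
    """Prepend the top-k retrieved documents to the query (in-context RALM)."""
    ranked = sorted(zip(bm25_scores(query, docs), docs), reverse=True)
    context = "\n".join(doc for _, doc in ranked[:top_k])
    return f"{context}\n\n{query}"
```

The resulting string would be fed to any off-the-shelf LM (GPT-2, GPT-Neo, OPT) without architectural changes; that is the "in-context" part of the approach.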
Quick Start & Requirements
Install the dependencies:

```
pip install -r requirements.txt
```

Pyserini's sparse (BM25) retrieval depends on Java, so the `JAVA_HOME` environment variable must be set. Then prepare the retrieval data and run evaluation:

```
python prepare_retrieval_data.py --retrieval_type sparse --tokenizer_name $MODEL_NAME ...
python eval_lm.py --model_name $MODEL_NAME ...
```
Maintenance & Community
The project is from AI21 Labs, authors of the associated TACL paper. Specific community channels or active maintenance status are not detailed in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.
Limitations & Caveats
Large models (OPT-30B, OPT-66B) require model parallelism, which implies substantial hardware (e.g., multiple 40GB A100 GPUs). The README specifies neither the exact hardware configuration used for all experiments nor a roadmap for future development.