Research paper code for in-context retrieval-augmented language models
This repository provides code for reproducing experiments on In-Context Retrieval-Augmented Language Models (RALM) from a TACL paper. It's designed for researchers and practitioners interested in enhancing language models with external knowledge retrieval for improved performance on tasks like language modeling and question answering.
How It Works
The project leverages a retrieval-augmented approach, integrating Pyserini for efficient sparse retrieval (BM25) and the Hugging Face Transformers library for language model inference. It supports various LLMs (GPT-2, GPT-Neo, OPT) and allows for evaluation with or without retrieval, as well as with reranking strategies like logprob and oracle. The core idea is to condition language model generation on relevant retrieved documents, improving accuracy and knowledge grounding.
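The core loop can be illustrated with a minimal, self-contained sketch (not the repository's code, which uses Pyserini and Transformers): score documents with BM25 and prepend the top hits to the model's input so generation is conditioned on them. The function names below are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with BM25 (sparse retrieval)."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n_docs = len(docs)
    df = Counter()  # document frequency of each term
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def build_ralm_prompt(query, docs, top_k=1):
    """Prepend the top-k retrieved documents to the query (in-context RALM)."""
    ranked = sorted(zip(bm25_scores(query, docs), docs), reverse=True)
    context = "\n".join(doc for _, doc in ranked[:top_k])
    return f"{context}\n\n{query}"
```

The resulting string would be fed to any off-the-shelf LM (GPT-2, GPT-Neo, OPT) without architectural changes; that is the "in-context" part of the approach.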
Quick Start & Requirements
Install the dependencies:

```
pip install -r requirements.txt
```

Pyserini's sparse (BM25) retrieval depends on Java, so the `JAVA_HOME` environment variable must be set. Then prepare the retrieval data and run evaluation:

```
python prepare_retrieval_data.py --retrieval_type sparse --tokenizer_name $MODEL_NAME ...
python eval_lm.py --model_name $MODEL_NAME ...
```
Maintenance & Community
The project is from AI21 Labs, authors of the associated TACL paper. Specific community channels or active maintenance status are not detailed in the README.
Licensing & Compatibility
The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.
Limitations & Caveats
Large models (OPT-30B, OPT-66B) require model parallelism, which implies substantial hardware (e.g., multiple 40GB A100 GPUs). The README specifies neither the exact hardware configuration used for all experiments nor a roadmap for future development.