rank_llm by castorini

Python toolkit for reproducible information retrieval research

Created 2 years ago

568 stars

Top 56.6% on SourcePulse

Project Summary

RankLLM is a Python toolkit for reproducible information retrieval research, with a focus on listwise reranking of documents. It supports open-source LLMs served through vLLM, SGLang, or TensorRT-LLM, as well as rerankers built on proprietary models, such as RankGPT and RankGemini. The toolkit is aimed at information retrieval researchers and practitioners who need to evaluate and compare reranking strategies.

How It Works

RankLLM runs end-to-end retrieval and reranking pipelines. It integrates with several retrieval methods (e.g., BM25, SPLADE) and exposes a flexible reranking interface. The design emphasizes efficient inference in two ways: reranking can be done using only first-token logits, and models can be served through optimized LLM serving frameworks (vLLM, SGLang, TensorRT-LLM) for higher throughput.
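To make the first-token-logits idea concrete, here is a minimal sketch. It assumes a listwise prompt in which each candidate passage carries a single-token label (A, B, C, ...) and that the logits for the first generated token are already available as a plain dictionary. The function name and the input layout are hypothetical and are not part of RankLLM's API; the point is only that one decoding step can yield a full ordering.

```python
def rank_from_first_token_logits(logits_by_token, labels):
    """Order candidate labels by the logit of each label as the first
    generated token, highest first.

    logits_by_token: dict mapping token string -> logit (assumed input;
        a real model would return a tensor over the whole vocabulary).
    labels: candidate labels in their original (retrieval) order.
    """
    missing = [label for label in labels if label not in logits_by_token]
    if missing:
        raise KeyError(f"no logit available for labels: {missing}")
    # Negating the logit sorts descending; sorted() is stable, so ties
    # keep the original retrieval order.
    return sorted(labels, key=lambda label: -logits_by_token[label])


# Toy logits for the first output position. Tokens that are not
# candidate labels (here "the") are simply ignored.
first_token_logits = {"A": 1.2, "B": 3.4, "C": -0.5, "D": 2.0, "the": 5.0}
ranking = rank_from_first_token_logits(first_token_logits, ["A", "B", "C", "D"])
print(ranking)
```

Because the ordering is read off a single output position, no full permutation string has to be generated, which is where the inference savings would come from.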

Quick Start & Requirements

Installation: create and activate a conda environment with Python 3.10, then run pip install -r requirements.txt.
Prerequisites: JDK 21 (required for Anserini), PyTorch with CUDA 12.1. Optional: SGLang (requires flashinfer), TensorRT-LLM.
Platform: Linux or Windows; macOS is not supported.
Demo: See the provided Python code snippet for a quick start example and src/rank_llm/demo for additional samples.
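The prerequisites above can be checked before installation with a small preflight script. This is a sketch using only the standard library plus an optional PyTorch import; the function names are hypothetical and the script is not shipped with RankLLM. It checks the Python version, looks for a java executable reporting major version 21, and asks PyTorch whether CUDA is available (it does not verify the CUDA 12.1 version itself).

```python
import re
import shutil
import subprocess
import sys


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_291" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_environment():
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if sys.version_info[:2] != (3, 10):
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(f"Python 3.10 expected, found {found}")

    java = shutil.which("java")
    if java is None:
        problems.append("java not found on PATH (JDK 21 is required for Anserini)")
    else:
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
        # `java -version` writes its banner to stderr.
        major = parse_java_major(result.stderr + result.stdout)
        if major != 21:
            problems.append(f"JDK 21 required, found major version {major}")

    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not torch.cuda.is_available():
            problems.append("PyTorch reports that CUDA is not available")
    return problems
```

Calling check_environment() and printing each returned string gives a quick list of what still needs to be installed or fixed.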

Highlighted Details

Supports listwise, pairwise, and pointwise reranking models.
Offers optimized inference via first-token logits and integration with vLLM, SGLang, and TensorRT-LLM.
Includes a model zoo with various LiT5, MonoT5, RankZephyr, and RankVicuna variants.
Provides scripts for end-to-end runs and two-click reproduction (2CR) for specific datasets and models.
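The difference between the three reranking paradigms listed above is easiest to see side by side. The sketch below uses a toy term-overlap function in place of a model; every function name here is hypothetical and none of it reflects RankLLM's actual interfaces. Pointwise scores each document on its own, pairwise compares two documents at a time and counts wins, and listwise asks for an ordering of the whole candidate list in one call.

```python
from itertools import combinations


def overlap_score(query, doc):
    """Toy relevance signal: number of distinct query terms found in doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def pointwise_rerank(query, docs, score):
    # One independent score per (query, doc) pair; sort descending.
    return sorted(docs, key=lambda doc: -score(query, doc))


def pairwise_rerank(query, docs, prefer):
    # prefer(query, a, b) returns True if a should rank above b.
    wins = [0] * len(docs)
    for i, j in combinations(range(len(docs)), 2):
        if prefer(query, docs[i], docs[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    order = sorted(range(len(docs)), key=lambda i: -wins[i])
    return [docs[i] for i in order]


def listwise_rerank(query, docs, permute):
    # permute(query, docs) sees all candidates at once and returns indices.
    order = permute(query, docs)
    if sorted(order) != list(range(len(docs))):
        raise ValueError("permute must return each index exactly once")
    return [docs[i] for i in order]


def toy_prefer(query, a, b):
    return overlap_score(query, a) >= overlap_score(query, b)


def toy_permute(query, docs):
    return sorted(range(len(docs)), key=lambda i: -overlap_score(query, docs[i]))


query = "neural reranking models"
docs = [
    "cooking pasta at home",
    "neural models for reranking passages",
    "reranking with bm25",
]
print(pointwise_rerank(query, docs, overlap_score))
print(pairwise_rerank(query, docs, toy_prefer))
print(listwise_rerank(query, docs, toy_permute))
```

With a real model, the three approaches trade off differently: pointwise needs one call per document, pairwise needs one call per pair, and listwise needs one call per candidate list but a longer prompt.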

Maintenance & Community

The project is actively maintained, with contributions from Ronak Pradeep, Sahel Sharifymoghaddam, and Jimmy Lin. Citations for key models and methodologies are provided, indicating active research and development.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Because the project relies on and cites models and code from other projects, the licenses of those upstream components should be checked before commercial use or closed-source integration.

Limitations & Caveats

RankLLM is not compatible with macOS. JDK 21 is required; JDK 11 is not supported. Optional backends such as SGLang and TensorRT-LLM require additional installation and configuration.

Health Check

Last Commit

2 weeks ago

Responsiveness

1+ week

Star History

10 stars in the last 30 days