FlashRank by PrithivirajDamodaran

Reranking library for search & retrieval pipelines

Created 2 years ago

916 stars

Top 39.7% on SourcePulse

1 Expert Loves This Project

Andrey Vasnetsov

Cofounder of Qdrant

Project Summary

FlashRank is a Python library designed to integrate state-of-the-art re-ranking capabilities into search and retrieval pipelines. It offers both lightweight, CPU-compatible pairwise re-rankers and more powerful, listwise LLM-based re-rankers, targeting developers and researchers seeking to enhance search result relevance without significant overhead.

How It Works

FlashRank leverages pre-trained cross-encoder and LLM models for re-ranking. Cross-encoder models process query-passage pairs directly for high accuracy, while LLM-based listwise re-rankers consider the entire list of passages simultaneously for potentially better contextual understanding. The library emphasizes minimal dependencies, with its smallest model requiring no PyTorch or Transformers, enabling CPU execution and small deployment footprints.
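The difference between the two re-ranking styles can be illustrated with a small sketch. This is not FlashRank's implementation: toy_pair_score is a hypothetical token-overlap stand-in for a cross-encoder, and order_fn is a hypothetical stand-in for an LLM that returns an ordering over the whole list.

```python
from typing import Callable, Dict, List

Passage = Dict[str, object]


def toy_pair_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a cross-encoder: share of query tokens found in the passage."""
    q_tokens = set(query.lower().split())
    p_tokens = set(text.lower().split())
    return len(q_tokens & p_tokens) / len(q_tokens) if q_tokens else 0.0


def pairwise_rerank(
    query: str,
    passages: List[Passage],
    score_fn: Callable[[str, str], float] = toy_pair_score,
) -> List[Passage]:
    """Score each (query, passage) pair independently, then sort by score."""
    scored = [{**p, "score": score_fn(query, str(p["text"]))} for p in passages]
    return sorted(scored, key=lambda p: p["score"], reverse=True)


def listwise_rerank(
    query: str,
    passages: List[Passage],
    order_fn: Callable[[str, List[str]], List[int]],
) -> List[Passage]:
    """Let one model call see every passage and return a permutation of their indices."""
    order = order_fn(query, [str(p["text"]) for p in passages])
    if sorted(order) != list(range(len(passages))):
        raise ValueError("order_fn must return a permutation of passage indices")
    return [passages[i] for i in order]
```

In the pairwise case the cost grows with the number of pairs scored, while the listwise case makes a single call whose input grows with the total length of all passages.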

Quick Start & Requirements

Install with pip install flashrank for pairwise re-rankers, or pip install "flashrank[listwise]" for LLM-based re-rankers (the quotes keep shells such as zsh from expanding the brackets).
Supports various models, including a ~4MB default CPU-compatible model and larger LLM-based models requiring more resources.
Official documentation and examples are available.
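A minimal usage sketch for the default pairwise re-ranker follows. The names Ranker, RerankRequest, rerank, and the max_length argument are written as recalled from the project's README and may differ between versions, so check the official documentation before relying on them. to_passages is a hypothetical helper added here for illustration.

```python
from typing import Dict, List


def to_passages(texts: List[str]) -> List[Dict[str, object]]:
    """Hypothetical helper: wrap raw strings as dicts with 'id' and 'text' keys."""
    return [{"id": i, "text": t} for i, t in enumerate(texts, start=1)]


def rerank(query: str, texts: List[str], max_length: int = 128) -> List[Dict[str, object]]:
    """Re-rank texts against the query with FlashRank's default model.

    Assumes `pip install flashrank` has been run; the import is deferred so
    that the helper above stays usable without the package installed.
    """
    from flashrank import Ranker, RerankRequest  # assumed API, see lead-in

    ranker = Ranker(max_length=max_length)  # default model is downloaded on first use
    request = RerankRequest(query=query, passages=to_passages(texts))
    return ranker.rerank(request)  # passages sorted by relevance, each with a "score"
```

Calling rerank("how to speed up python", ["Use profiling first.", "Cats sleep a lot."]) should return the passages ordered by relevance, assuming the package and model download are available.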

Highlighted Details

Ultra-lite design: No Torch or Transformers required for the default model, enabling CPU usage and a ~4MB model size.
Super-fast performance: Optimized for speed; re-ranking time depends on the number of tokens in the query and passages and on model depth (number of layers).
Cost-conscious: Small package size and CPU compatibility reduce costs in serverless environments like AWS Lambda.
Supports a range of models: Includes cross-encoders fine-tuned on MS MARCO and ESCI, a non-cross-encoder model (rank-T5-flan), a multilingual BERT cross-encoder, and a 4-bit quantized GGUF LLM (rank_zephyr_7b_v1_full).

Maintenance & Community

The project is maintained by Prithiviraj Damodaran and is open for contributions. Links to relevant papers and citation information are provided.

Licensing & Compatibility

The library is licensed under the Apache 2.0 license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The rank_zephyr_7b_v1_full model currently supports a maximum of 20 passages per pass; sliding window logic for longer lists is not yet implemented. Users should set max_length large enough to hold the query plus their longest passage, counted in tokens; a larger value than needed slows re-ranking without improving results.
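One way to act on both caveats is sketched below. Everything here is an assumption for illustration: the words-to-tokens ratio is a rough heuristic rather than the model's tokenizer, the three reserved tokens assume a BERT-style pair encoding, and keeping only the first 20 candidates is a simple workaround, not the library's behavior.

```python
from typing import List, Sequence, Tuple

MAX_PASSAGES_PER_PASS = 20  # limit stated for rank_zephyr_7b_v1_full


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 tokens per 3 whitespace-separated words."""
    return len(text.split()) * 4 // 3 + 1


def suggest_max_length(
    query: str, passages: Sequence[str], reserved: int = 3, multiple: int = 64
) -> int:
    """Estimate query + longest passage + reserved tokens, rounded up to a multiple."""
    longest = max((estimate_tokens(p) for p in passages), default=0)
    needed = estimate_tokens(query) + longest + reserved
    return (needed + multiple - 1) // multiple * multiple


def first_pass(
    passages: Sequence[str], limit: int = MAX_PASSAGES_PER_PASS
) -> Tuple[List[str], List[str]]:
    """Split candidates into those sent to the listwise model and those left out."""
    return list(passages[:limit]), list(passages[limit:])
```

Because the candidates usually arrive already ordered by a first-stage retriever, truncating to the top 20 keeps the most promising ones; a real tokenizer count would give a tighter max_length than this estimate.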

Explore Similar Projects

pyversity by Pringled

embedding_rerank_retrieval by percent4

Rankify by DataScienceUIBK

RankGPT by sunnweiwei

rank_llm by castorini

rerankers by AnswerDotAI

raglite by superlinear-ai

Qwen3-Embedding by QwenLM

metarank by metarank

TrustRAG by gomate-community

pyserini by castorini

sentence-transformers by huggingface