FlashRank by PrithivirajDamodaran

Reranking library for search & retrieval pipelines

created 1 year ago
840 stars

Top 43.3% on sourcepulse

Project Summary

FlashRank is a Python library that adds state-of-the-art re-ranking to existing search and retrieval pipelines. It offers lightweight pairwise re-rankers that run on CPU as well as larger, listwise LLM-based re-rankers, and it targets developers and researchers who want more relevant search results without significant overhead.

How It Works

FlashRank uses pre-trained cross-encoder and LLM models for re-ranking. A cross-encoder scores each query-passage pair jointly, one pair at a time, which gives high accuracy. An LLM-based listwise re-ranker instead sees the whole list of passages at once, which can help when relevance depends on comparing passages with each other. The library keeps dependencies minimal: its smallest model needs neither PyTorch nor Transformers, so it runs on CPU with a small deployment footprint.
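The pairwise flow described above can be sketched in a few lines. This is a toy illustration, not FlashRank's code: overlap_score is a hypothetical stand-in for a real cross-encoder, and pairwise_rerank is a hypothetical helper name. The point is only the shape of the computation: one score per query-passage pair, then a sort.

```python
from typing import Callable, Dict, List, Union

ScoredPassage = Dict[str, Union[str, float]]


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query tokens found in the passage."""
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & p_tokens) / len(q_tokens)


def pairwise_rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[ScoredPassage]:
    """Score every (query, passage) pair independently, then sort by score."""
    scored = [{"text": p, "score": score_fn(query, p)} for p in passages]
    return sorted(scored, key=lambda item: item["score"], reverse=True)


passages = [
    "Bananas are rich in potassium.",
    "How to speed up large language models with caching.",
    "Tips to speed up python code.",
]
ranked = pairwise_rerank("how to speed up language models", passages)
for item in ranked:
    print(f'{item["score"]:.2f}  {item["text"]}')
```

A listwise re-ranker differs in that its scoring step receives all passages together rather than one pair at a time, so it cannot be expressed as an independent score_fn per pair.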

Quick Start & Requirements

  • Install with pip install flashrank for the pairwise re-rankers, or pip install "flashrank[listwise]" for the LLM-based listwise re-rankers (the quotes stop shells such as zsh from expanding the brackets).
  • Supports various models, including a ~4MB default CPU-compatible model and larger LLM-based models requiring more resources.
  • Official documentation and examples are available in the project's GitHub repository.
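After installation, a first call might look like the sketch below. The names Ranker, RerankRequest, and rerank, and the "id"/"text"/"meta" passage keys, follow the usage shown in the project's README as best recalled here, so treat them as assumptions and check the current documentation. build_passages and rerank_with_flashrank are hypothetical helper names introduced for this example.

```python
from typing import Any, Dict, List


def build_passages(texts: List[str]) -> List[Dict[str, Any]]:
    """Wrap raw strings in dicts with "id", "text", and "meta" keys."""
    return [{"id": i, "text": t, "meta": {}} for i, t in enumerate(texts, start=1)]


def rerank_with_flashrank(query: str, texts: List[str]) -> List[Dict[str, Any]]:
    """Re-rank texts against query using the library's default model."""
    # Imported lazily so this file still loads when flashrank is not installed.
    # Assumption: these names match the installed flashrank version.
    from flashrank import Ranker, RerankRequest

    ranker = Ranker()  # no arguments: loads the library's default model
    request = RerankRequest(query=query, passages=build_passages(texts))
    return ranker.rerank(request)
```

Calling rerank_with_flashrank("how to speed up LLMs", ["first passage", "second passage"]) should return the passages re-ordered by relevance. The first call is expected to download the default model, so it needs network access.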

Highlighted Details

  • Ultra-lite design: No Torch or Transformers required for the default model, enabling CPU usage and a ~4MB model size.
  • Fast: re-ranking time depends on passage token count, query length, and model depth.
  • Cost-conscious: Small package size and CPU compatibility reduce costs in serverless environments like AWS Lambda.
  • Supports a range of models: cross-encoders fine-tuned on MS MARCO and ESCI, a non-cross-encoder model (rank-T5-flan), a multilingual BERT, and a 4-bit quantized GGUF LLM (rank_zephyr_7b_v1_full).

Maintenance & Community

The project is maintained by Prithiviraj Damodaran and is open to contributions. The repository links to the relevant papers and includes citation information.

Licensing & Compatibility

The library is licensed under the Apache 2.0 license, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The rank_zephyr_7b_v1_full model currently accepts at most 20 passages per pass; sliding-window logic for longer lists is not yet implemented. Users should set max_length according to their passage token counts for optimal performance.
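Until sliding-window support lands, a caller could approximate it on their own side. The sketch below is one possible workaround, not part of FlashRank: it sweeps overlapping windows from the bottom of the list to the top, a common scheme for listwise re-ranking. rerank_window is a hypothetical callable wrapping whatever listwise re-ranker is in use, window=20 mirrors the limit stated above, and step=10 is an assumed overlap.

```python
from typing import Callable, List

# Hypothetical signature: takes a query and up to `window` passages,
# returns the same passages in re-ranked order.
WindowReranker = Callable[[str, List[str]], List[str]]


def sliding_window_rerank(
    query: str,
    passages: List[str],
    rerank_window: WindowReranker,
    window: int = 20,
    step: int = 10,
) -> List[str]:
    """Re-rank a long list by sweeping overlapping windows from bottom to top."""
    if not 0 < step <= window:
        raise ValueError("step must be between 1 and window")
    ranked = list(passages)  # copy so the caller's list is untouched
    end = len(ranked)
    while end > 0:
        start = max(0, end - window)
        # Re-rank only this slice; strong passages drift toward the front.
        ranked[start:end] = rerank_window(query, ranked[start:end])
        if start == 0:
            break
        end -= step
    return ranked


def by_number_desc(query: str, window_passages: List[str]) -> List[str]:
    """Toy stand-in re-ranker: treats each passage as a number, larger is better."""
    return sorted(window_passages, key=int, reverse=True)


passages = [str(i) for i in range(45)]
result = sliding_window_rerank("ignored", passages, by_number_desc)
top_ten = result[:10]
print(top_ten)
```

Because consecutive windows overlap by window minus step passages, a passage can climb at most that many positions per window but can keep climbing across windows. With the defaults, only the top 10 positions are fully reliable after a single sweep.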

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

54 stars in the last 90 days
