byaldi by AnswerDotAI

SDK for late-interaction multi-modal models

created 11 months ago
806 stars

Top 44.7% on sourcepulse

Project Summary

Byaldi is a Python library designed to simplify the use of late-interaction multi-modal retrieval models, specifically those compatible with the ColPali framework. It targets developers and researchers looking to quickly integrate advanced multi-modal search capabilities into their applications, offering a familiar API for rapid prototyping and development.

How It Works

Byaldi is a lightweight wrapper around the ColPali repository that hides the details of loading models and building indexes. It relies on ColPali's underlying engine, which supports multi-billion-parameter models such as the ColQwen2 checkpoints, to retrieve over documents processed as images (PDF pages and image files). The design prioritizes ease of use and mirrors RAGatouille's approach of minimizing the code needed to set up a retrieval pipeline.
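
To make "late interaction" concrete, here is a minimal sketch of the ColBERT-style MaxSim scoring that ColPali-family models are generally understood to use: each query token embedding is matched to its best-scoring document patch embedding, and those maxima are summed. The function names (maxsim_score, rank_pages) and the toy two-dimensional vectors are illustrative assumptions; this is not Byaldi's or ColPali's actual code, which runs on GPU tensors.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for every query vector take the best-matching
    document vector, then sum those maxima."""
    if not doc_vecs:
        return 0.0
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector], pages: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Score every page against the query and sort best-first."""
    scored = [(page_id, maxsim_score(query_vecs, vecs)) for page_id, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two query token embeddings, two pages.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens exactly
    "page_b": [[0.5, 0.5]],              # one patch that partially matches both
}
ranking = rank_pages(query, pages)
```

Because the query and the document are encoded separately and compared only at scoring time, page embeddings can be computed once at indexing time and reused for every query.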

Quick Start & Requirements

  • Install via pip: pip install --upgrade byaldi
  • Recommended: pip install flash-attn
  • Requires Poppler for PDF conversion (the README gives install instructions for macOS and Debian/Ubuntu).
  • GPU recommended for optimal performance; CPU/MPS encoding will be slow.
  • Official documentation and examples are available via the project's README.
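
The prerequisites above can be checked before indexing anything. The sketch below uses only the Python standard library. It assumes that Poppler is detectable through its pdftoppm command-line tool and that flash-attn is importable as flash_attn. The helper name check_environment is made up for this example and is not part of Byaldi.

```python
import importlib.util
import shutil
from typing import Dict


def check_environment() -> Dict[str, bool]:
    """Report which of the prerequisites appear to be installed.

    Only looks the packages up; nothing is imported or executed.
    """
    return {
        # Python packages installed via pip
        "byaldi": importlib.util.find_spec("byaldi") is not None,
        "flash_attn": importlib.util.find_spec("flash_attn") is not None,
        # Assumption: Poppler is on PATH if its pdftoppm utility is found
        "poppler": shutil.which("pdftoppm") is not None,
    }


status = check_environment()
for name, present in status.items():
    print(f"{name}: {'found' if present else 'missing'}")
```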

Highlighted Details

  • Supports ColPali-compatible models, including ColQwen2 checkpoints (e.g., vidore/colqwen2-v1.0).
  • Enables indexing of PDF files, image files, or directories.
  • Allows optional storage of base64 encoded documents within the index for simplified LLM integration.
  • Supports adding documents to existing indexes.
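
The features above map onto a short workflow. The sketch below follows the usage shown in the project's README as best understood here (RAGMultiModalModel.from_pretrained, index, search), so treat the exact argument names as assumptions and check them against the installed version. The wrapper index_and_search, the docs/ path, the index name, and the query are hypothetical. The wrapper takes the model as a parameter so it can be tried with a stand-in object when no GPU is available.

```python
from typing import Any, List


def index_and_search(
    model: Any,
    input_path: str,
    index_name: str,
    query: str,
    k: int = 3,
    store_documents: bool = False,
) -> List[Any]:
    """Index a PDF, image, or directory, then return the top-k results.

    `model` is expected to behave like byaldi's RAGMultiModalModel
    (assumed methods: index(...) and search(query, k=...)).
    """
    model.index(
        input_path=input_path,
        index_name=index_name,
        # When True, base64-encoded documents are stored inside the index
        store_collection_with_index=store_documents,
        overwrite=True,
    )
    return model.search(query, k=k)


def demo() -> None:
    """Not called automatically: requires byaldi and downloads a checkpoint."""
    from byaldi import RAGMultiModalModel

    model = RAGMultiModalModel.from_pretrained("vidore/colqwen2-v1.0")
    hits = index_and_search(model, "docs/", "demo_index", "quarterly revenue")
    for hit in hits:
        print(hit)
```

Storing the base64 documents makes the index larger, but each result can then be handed directly to a vision-capable LLM without reloading the source file. The README also appears to describe an add_to_index method for extending an existing index; that call is not shown here because its exact signature is not confirmed by this page.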

Maintenance & Community

This is a pre-release version of Byaldi, and users are encouraged to report issues. The project aims to evolve with the multi-modal ecosystem, with plans to support additional backends like VisRAG and features such as HNSW indexing and quantization.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

This is a pre-release version, so expect quirks and unrefined features. Encoding on CPU or MPS is expected to be slow. The project is under active development, and future updates may introduce breaking changes.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 31 stars in the last 90 days
