ColiVara by tjmlabs

Document retrieval API using visual embeddings for enhanced RAG

created 10 months ago
1,186 stars

Top 33.6% on sourcepulse

Project Summary

ColiVara offers a Retrieval Augmented Generation (RAG) solution that bypasses traditional text extraction and chunking by using vision models to create document embeddings. This approach aims to improve retrieval accuracy and performance, especially for visually rich documents, by leveraging both textual and visual cues. It is designed for developers and researchers seeking advanced document retrieval capabilities.

How It Works

ColiVara uses the ColPali model, which employs Vision Language Models to generate embeddings that capture both textual and visual information within documents. Unlike methods relying on OCR or text chunking, ColiVara processes documents as images, enabling it to interpret layouts, tables, and figures. ColPali produces multiple embedding vectors per page and scores them with "late interaction": each query token embedding is matched against the page's individual embeddings rather than against a single pooled vector. This strategy is claimed to be more accurate than pooled embeddings, even for text-only datasets.
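As a rough illustration of why late interaction can separate pages that pooled embeddings cannot, the sketch below scores a query against two pages using ColBERT-style MaxSim (for each query vector, take the best-matching page vector, then sum) and compares that with a mean-pooled dot product. This is a minimal plain-Python sketch, not ColiVara's or ColPali's implementation; the function names and the tiny 2-dimensional vectors are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late interaction: each query vector picks its best page vector; sum the maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def mean_pool(vecs: Sequence[Vector]) -> List[float]:
    """Collapse a set of vectors into one vector by averaging each dimension."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]


def pooled_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Single-vector baseline: dot product of the two mean-pooled vectors."""
    return dot(mean_pool(query_vecs), mean_pool(page_vecs))


# Hypothetical toy data: two query "token" vectors and two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]   # contains an exact match for each query vector
page_b = [[0.5, 0.5], [0.5, 0.5]]   # only a blurred average of both directions

late_a = maxsim_score(query, page_a)
late_b = maxsim_score(query, page_b)
pooled_a = pooled_score(query, page_a)
pooled_b = pooled_score(query, page_b)
```

In this toy case both pages receive the same pooled score, because averaging erases which page holds the distinct matching vectors, while MaxSim ranks page_a above page_b. Real systems operate on many more vectors per page, but the scoring rule has the same shape.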

Quick Start & Requirements

  • Install: pip install colivara-py or npm install colivara-ts
  • Prerequisites: A free API key from the ColiVara website. For local setup, the embedding service (ColiVarE) requires a GPU with at least 8GB VRAM, along with Docker and PostgreSQL with pgVector.
  • Documentation: docs.colivara.com
  • Swagger: ColiVara API Swagger

Highlighted Details

  • Claims state-of-the-art retrieval accuracy and latency compared with existing systems.
  • Supports over 100 file formats by converting them to images internally.
  • Leverages PostgreSQL with pgVector, utilizing HalfVecs for efficient search and storage.
  • Offers filtering capabilities on document and collection metadata.
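The storage benefit of half-precision vectors mentioned in the list above can be sketched with Python's standard struct module. pgvector's halfvec type stores each element as a 16-bit float rather than a 32-bit one; the snippet below only mimics that encoding on the client side to show the size and precision trade-off. The 128-dimension vector and its sample values are illustrative assumptions, and nothing here calls pgvector or ColiVara.

```python
import struct
from typing import List, Sequence


def pack_float32(vec: Sequence[float]) -> bytes:
    """Encode a vector as little-endian 32-bit floats (4 bytes per element)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def pack_float16(vec: Sequence[float]) -> bytes:
    """Encode a vector as little-endian IEEE 754 half floats (2 bytes per element)."""
    return struct.pack(f"<{len(vec)}e", *vec)


def unpack_float16(blob: bytes) -> List[float]:
    """Decode a half-float blob back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))


# Hypothetical 128-dimensional embedding with small values in [-0.3, 0.3].
vec = [((i % 7) - 3) / 10 for i in range(128)]

full = pack_float32(vec)   # 128 * 4 = 512 bytes
half = pack_float16(vec)   # 128 * 2 = 256 bytes

# Largest absolute rounding error introduced by the half-precision round trip.
max_err = max(abs(a - b) for a, b in zip(vec, unpack_float16(half)))
```

Halving the bytes per element matters more for multi-vector embeddings than for single-vector ones, since every page contributes many vectors to the index. The rounding error stays small for values in this range, which is the usual reason half precision is considered acceptable for similarity search.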

Maintenance & Community

  • Support is available via a Discord community.
  • The project is actively developed, with Release 1.5.0 introducing hierarchical clustering.
  • Independent evaluations are conducted and available in the ColiVara-eval repository.

Licensing & Compatibility

  • Licensed under the Functional Source License (FSL), Version 1.1, with Apache 2.0 as the future license.
  • Commercial licensing requires contacting tjmlabs.com. The FSL restricts competing commercial use, and each release converts to Apache 2.0 two years after it is published.

Limitations & Caveats

The core embedding service (ColiVarE) requires a GPU with at least 8GB VRAM, which may be a barrier for some users. The licensing model, combining FSL and Apache 2.0, requires careful review for commercial applications.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 297 stars in the last 90 days
