ColiVara by tjmlabs

Document retrieval API using visual embeddings for enhanced RAG

Created 1 year ago

1,451 stars

Top 27.8% on SourcePulse

1 Expert Loves This Project

Rodrigo Nader

Cofounder of Langflow

Project Summary

ColiVara offers a Retrieval Augmented Generation (RAG) solution that bypasses traditional text extraction and chunking by using vision models to create document embeddings. This approach aims to improve retrieval accuracy and performance, especially for visually rich documents, by leveraging both textual and visual cues. It is designed for developers and researchers seeking advanced document retrieval capabilities.

How It Works

ColiVara uses the ColPali model, which employs Vision Language Models to generate embeddings that capture both textual and visual information within documents. Unlike methods relying on OCR or text chunking, ColiVara processes documents as images, enabling it to interpret layouts, tables, and figures. ColPali produces multi-vector "late-interaction" embeddings, a strategy claimed to be more accurate than pooled single-vector embeddings, even for text-only datasets.
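
The difference between late-interaction and pooled scoring can be sketched in a few lines of plain Python. This is a toy illustration of the general MaxSim idea, not ColiVara's or ColPali's actual code: the function names, the 2-D vectors, and the two example pages are all hypothetical, and real embeddings have far more dimensions and many more vectors per page.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = sqrt(dot(v, v))
    return [x / norm for x in v]

def mean_pool(vectors):
    # Collapse a list of vectors into one by averaging each dimension.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def pooled_score(query_vecs, doc_vecs):
    # Single-vector baseline: pool each side, then one cosine similarity.
    return dot(normalize(mean_pool(query_vecs)), normalize(mean_pool(doc_vecs)))

def late_interaction_score(query_vecs, doc_vecs):
    # MaxSim: each query vector keeps its best-matching document vector,
    # and the per-query maxima are summed.
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)

# Hypothetical toy data: two query token vectors and two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]  # one patch matches each query token exactly
page_b = [[1.0, 1.0], [1.0, 1.0]]  # every patch is a blurry mix of both

for name, page in (("page_a", page_a), ("page_b", page_b)):
    print(name, pooled_score(query, page), late_interaction_score(query, page))
```

In this toy case both pages pool to the same direction, so the pooled score cannot separate them, while the late-interaction score ranks page_a higher because it contains an exact match for each query vector.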

Quick Start & Requirements

Install: pip install colivara-py or npm install colivara-ts
Prerequisites: A free API key from the ColiVara website. Local setup involves Docker and PostgreSQL with pgVector, and the embedding service (ColiVarE) requires a GPU with at least 8GB VRAM.
Documentation: docs.colivara.com
Swagger: ColiVara API Swagger

Highlighted Details

Claims state-of-the-art retrieval performance and latency compared with existing systems.
Supports over 100 file formats by converting them to images internally.
Uses PostgreSQL with pgVector, storing embeddings as HalfVecs (half-precision vectors) for efficient search and storage.
Offers filtering capabilities on document and collection metadata.
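
The storage trade-off behind the HalfVecs point above can be illustrated with the Python standard library alone. This is a minimal sketch of half-precision versus single-precision encoding in general, not of pgVector's internals or ColiVara's schema: the helper names and the 128-value test vector are assumptions made for the example.

```python
import math
import struct

def to_single_bytes(vec):
    # 4 bytes per value (IEEE 754 binary32).
    return struct.pack(f"<{len(vec)}f", *vec)

def to_half_bytes(vec):
    # 2 bytes per value (IEEE 754 binary16).
    return struct.pack(f"<{len(vec)}e", *vec)

def from_half_bytes(buf):
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

# Hypothetical embedding: 128 values in [-1, 1].
vec = [math.sin(i) for i in range(128)]

single = to_single_bytes(vec)
half = to_half_bytes(vec)
restored = from_half_bytes(half)

# Largest absolute error introduced by the half-precision round trip.
max_error = max(abs(a - b) for a, b in zip(vec, restored))

print(len(single), len(half), max_error)
```

Half precision halves the bytes per vector at the cost of a small rounding error per value, which is the kind of trade that makes it attractive when many vectors are stored per document.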

Maintenance & Community

Support is available via a Discord community.
Release 1.5.0 introduced hierarchical clustering. The Health Check below lists the last commit as 10 months ago.
Independent evaluations are conducted and available in the ColiVara-eval repository.

Licensing & Compatibility

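Licensed under the Functional Source License, Version 1.1, with an Apache 2.0 Future License (FSL-1.1-ALv2).
Commercial licensing requires contacting tjmlabs.com. The FSL permits use for any purpose other than a competing commercial product or service, and each release converts to Apache 2.0 two years after it is published.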

Limitations & Caveats

The core embedding service (ColiVarE) requires a GPU with at least 8GB VRAM, which may be a barrier for users who want to run it locally. The license is FSL now and Apache 2.0 only later, so commercial applications should review its competing-use terms carefully.

Health Check

Last Commit

10 months ago

Responsiveness

Inactive

Star History

14 stars in the last 30 days