colette by jolibrain

Multimodal RAG for local document interaction

Created 9 months ago
285 stars

Top 91.9% on SourcePulse

Project Summary

Colette provides a self-hosted, open-source solution for searching and interacting with technical documents locally, prioritizing data privacy. It is designed for users who need to analyze sensitive documents containing rich visual information, such as images, figures, and complex layouts, which are often lost in traditional text-based RAG systems. Colette's core innovation lies in its Vision-RAG (V-RAG) capabilities, enabling deeper document understanding by processing visual elements.

How It Works

Colette employs a Vision-RAG system that embeds and analyzes documents as images using Document Screenshot Embedding/ColPali retrievers and Vision Language Models (VLMs). This approach preserves visual context, offering a more comprehensive analysis than text-only methods. It also supports traditional text-based RAG pipelines, providing flexibility. The system is designed to handle diverse document types and visual content effectively.
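The retrieval step can be illustrated with a small sketch. ColPali-style retrievers are generally described as scoring a query against a page image by late interaction: each query token vector is matched to its best page patch vector, and the matches are summed (often called MaxSim). Whether Colette scores pages exactly this way is an assumption; the function names, page identifiers, and two-dimensional toy vectors below are hypothetical and stand in for real model embeddings.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector take the best-matching
    page patch vector, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector],
    index: Dict[str, List[Vector]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every indexed page and return the top_k (page_id, score) pairs."""
    scored = [(page_id, maxsim_score(query_vecs, vecs)) for page_id, vecs in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy index: each page is a list of patch embeddings (hypothetical values).
index = {
    "manual.pdf#p1": [[1.0, 0.0], [0.0, 1.0]],
    "manual.pdf#p2": [[0.5, 0.5], [0.6, 0.4]],
}
query = [[1.0, 0.0], [0.0, 1.0]]  # two query token embeddings

ranked = rank_pages(query, index, top_k=2)
print(ranked)
```

In a Vision-RAG pipeline of this kind, the top-ranked page images would then be handed to the VLM together with the question, which is how the visual context survives into the answer.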

Quick Start & Requirements

  • Primary Install: Docker (recommended) or pip.
    • Docker: docker pull docker.jolibrain.com/colette_gpu
    • Pip: git clone https://github.com/jolibrain/colette.git followed by pip install -e .[dev,trag]
  • Prerequisites: GPU with >= 24GB VRAM, >= 16GB RAM, >= 50GB Disk, Docker >= 24.0.0 & Docker Compose >= v2.26.1.
  • Links: Docker installation guide (Install Docker Engine). Documentation and an FAQ are referenced, but no direct URLs are given. Troubleshooting guide for RAG errors: COLETTEv2_Restitution_2025_03_07_v0.3_JB_light.pdf.
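The version and disk prerequisites above can be checked before installing. The following is a minimal preflight sketch, not part of Colette: the helper names are hypothetical, the minimums are copied from the list above, and "GB" is assumed to mean 10^9 bytes. The version strings would come from running `docker --version` and `docker compose version` yourself.

```python
import re
import shutil
from typing import Tuple

# Minimums taken from the prerequisites list above.
MIN_DOCKER = (24, 0, 0)
MIN_COMPOSE = (2, 26, 1)
MIN_DISK_GB = 50


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first X.Y.Z version found in a string such as
    'Docker version 24.0.7, build afdd53b' or 'v2.26.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(found_text: str, minimum: Tuple[int, ...]) -> bool:
    """Compare versions component by component (tuple ordering)."""
    return parse_version(found_text) >= minimum


def free_disk_gb(path: str = ".") -> float:
    """Free space on the filesystem holding `path`, in units of 10^9 bytes."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    # Paste the output of `docker --version` / `docker compose version` here.
    print("docker ok:", meets_minimum("Docker version 24.0.7, build afdd53b", MIN_DOCKER))
    print("compose ok:", meets_minimum("Docker Compose version v2.26.1", MIN_COMPOSE))
    print("disk ok:", free_disk_gb() >= MIN_DISK_GB)
```

GPU memory and RAM are left out because checking them portably needs vendor tools or third-party packages.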

Highlighted Details

  • Vision Retrieval-Augmented Generation (V-RAG) system for visual document analysis.
  • Text-based RAG system for unstructured text extraction and embedding.
  • Multi-Model Support for various embedders and Vision Language Models (VLMs).
  • Image Generation Integration leveraging the diffusers library.

Maintenance & Community

Colette was co-financed by Jolibrain, CNES, and Airbus, indicating significant industry backing. The README does not list community channels (such as Discord or Slack) or any active contributors beyond the financing entities.

Licensing & Compatibility

The specific open-source license under which Colette is distributed is not mentioned in the provided README content.

Limitations & Caveats

Colette acknowledges that RAG pipelines are inherently error-prone: retrieval can fail, indexing can go wrong, and the inference LLM has its own limitations. Users are advised that the system "will never work for everything" and are pointed to a detailed troubleshooting guide for diagnosing incorrect answers; fixes often involve adjusting the indexing or the inference model. Users are encouraged to report unresolved problems as issues.
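One way to reason about these three error sources is to ask, for a wrong answer, at which stage the relevant page was lost. The sketch below is an illustration only and is not taken from Colette's troubleshooting guide: the function, its arguments, and the stage labels are hypothetical, and it assumes you know which page should have supported the answer.

```python
from typing import Collection


def classify_failure(
    expected_page: str,
    indexed_pages: Collection[str],
    retrieved_pages: Collection[str],
    answer_correct: bool,
) -> str:
    """Return the earliest pipeline stage that could explain a wrong answer."""
    if answer_correct:
        return "ok"
    if expected_page not in indexed_pages:
        # The page never made it into the index: revisit indexing.
        return "indexing"
    if expected_page not in retrieved_pages:
        # Indexed but not returned for this query: revisit retrieval settings.
        return "retrieval"
    # The right page reached the model and the answer is still wrong.
    return "inference"


indexed = {"spec.pdf#p1", "spec.pdf#p2", "spec.pdf#p3"}

print(classify_failure("spec.pdf#p9", indexed, ["spec.pdf#p1"], False))  # indexing
print(classify_failure("spec.pdf#p2", indexed, ["spec.pdf#p1"], False))  # retrieval
print(classify_failure("spec.pdf#p2", indexed, ["spec.pdf#p2"], False))  # inference
```

Checking the stages in this order mirrors the advice above: an indexing or retrieval problem is worth ruling out before changing the inference model.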

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days
