morphik-core  by morphik-org

Open-source multi-modal RAG for building AI apps over private knowledge

created 8 months ago
3,010 stars

Top 16.3% on sourcepulse

Project Summary

Morphik provides a multi-modal Retrieval-Augmented Generation (RAG) framework for building AI applications over private knowledge bases. It targets developers who need to process and query complex, visual documents. It offers a unified way to ingest, search, transform, and manage unstructured and multimodal data, which enables advanced search and knowledge graph construction.

How It Works

Morphik uses techniques such as ColPali for multimodal search, so a single endpoint can query images, PDFs, videos, and other content. Domain-specific knowledge graphs can be created with a single line of code, using either the project's battle-tested system prompts or custom ones. The system also offers fast, scalable metadata extraction, including bounding boxes and classification, and a cache-augmented generation mechanism that speeds up responses by creating persistent KV-caches of documents.
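
To make the ColPali reference more concrete: ColPali-style retrievers are generally described as "late interaction" models, where a query and a page are each represented by several vectors and scored with a MaxSim sum. The sketch below illustrates that scoring rule in plain Python. It is not Morphik's code; the function names, the tiny 2-dimensional vectors, and the file names are all hypothetical, chosen only to show the idea.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query vector, take its best match
    among the document's vectors, then sum those best matches."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(
    query_vecs: List[Vector], corpus: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Score every document and return (doc_id, score) pairs, best first."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings: two query-token vectors and two patch
# vectors per document. Real models use many more, higher-dimensional vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
corpus = {
    "invoice.pdf": [[0.9, 0.1], [0.1, 0.8]],
    "diagram.png": [[0.5, 0.5], [0.4, 0.3]],
}

ranking = rank_documents(query, corpus)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.2f}")
```

Because each query vector is matched against every patch vector, a page can rank highly when different regions of it answer different parts of the query, which is one reason this family of methods suits visually rich documents.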

Quick Start & Requirements

  • Install/Run: Python SDK (pip install morphik) or Docker.
  • Prerequisites: Python, Docker (optional).
  • Resources: Sign up for a free tier at Morphik for hosted access. Self-hosting instructions are available.
  • Links: Morphik, Self-hosting Instructions, Discord Community.

Highlighted Details

  • Multimodal search across images, PDFs, videos, and more.
  • One-line knowledge graph generation.
  • Fast metadata extraction (bounding boxes, classification).
  • Cache-augmented generation for improved speed.
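
The cache-augmented generation item above rests on a simple pattern: do the expensive per-document work once, keep the result, and reuse it for every later question about that document. The sketch below shows only that reuse pattern, under loud assumptions. A real KV-cache stores a transformer's attention keys and values, whereas here a tuple of lowercase tokens stands in for that state, and `DocumentCache`, `toy_prefill`, and `answer` are hypothetical names, not part of Morphik.

```python
from typing import Callable, Dict, Tuple

State = Tuple[str, ...]


class DocumentCache:
    """Toy stand-in for a persistent per-document cache of precomputed state."""

    def __init__(self, prefill: Callable[[str], State]) -> None:
        self._prefill = prefill
        self._states: Dict[str, State] = {}
        self.prefill_calls = 0  # counts how often the expensive step runs

    def state_for(self, doc_id: str, text: str) -> State:
        """Return cached state for doc_id, computing it only on first use."""
        if doc_id not in self._states:
            self.prefill_calls += 1
            self._states[doc_id] = self._prefill(text)
        return self._states[doc_id]


def toy_prefill(text: str) -> State:
    """Placeholder for the expensive pass over the document (assumption:
    a real system would run a model here and keep its KV tensors)."""
    return tuple(text.lower().split())


def answer(cache: DocumentCache, doc_id: str, text: str, term: str) -> bool:
    """Toy 'generation' step: checks whether a term occurs in the cached state."""
    return term.lower() in cache.state_for(doc_id, text)


cache = DocumentCache(toy_prefill)
doc = "Morphik supports multimodal search"

first = answer(cache, "doc-1", doc, "multimodal")   # triggers the prefill
second = answer(cache, "doc-1", doc, "graphs")      # reuses the cached state

print(first, second, cache.prefill_calls)
```

The second question never re-processes the document, which is where the speed-up described above would come from in a real system.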

Maintenance & Community

The project welcomes contributions via GitHub issues and pull requests. Focus areas include speed improvements, tool integrations, and research paper integration. Community support is available via Discord.

Licensing & Compatibility

Features outside the ee namespace are open-source under the MIT Expat license. Features within the ee namespace have a different license and are not available in the open-source version.

Limitations & Caveats

Support for open-source deployments is limited because of resource constraints; community help is available via Discord. Certain features, such as the Morphik Console, are exclusive to the paid/hosted version.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 7
  • Issues (30d): 10
  • Star history: 995 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Alex Cheema (Cofounder of EXO Labs), and 3 more.

Perplexica by ItzCrazyKns

AI-powered search engine alternative

  • Top 0.3% on sourcepulse
  • 23k stars
  • created 1 year ago
  • updated 2 days ago

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Elie Bursztein (Cybersecurity Lead at Google DeepMind).

LightRAG by HKUDS

RAG framework for fast, simple retrieval-augmented generation

  • Top 1.1% on sourcepulse
  • 19k stars
  • created 10 months ago
  • updated 2 days ago