QMedia by QmiAI

Open-source AI content search engine for content creators

created 1 year ago
565 stars

Top 57.8% on sourcepulse

View on GitHub
Project Summary

QMedia is an open-source AI content search engine tailored for content creators, enabling efficient extraction and querying of information from text, images, and short videos. It facilitates local deployment of a web app, RAG server, and LLM server, offering multimodal RAG capabilities for private data Q&A.

How It Works

QMedia employs a modular architecture with three core services: mm_server for multimodal model inference (LLMs, image/video analysis, embeddings), mmrag_server for content extraction, embedding, storage, and RAG-based Q&A, and qmedia_web as a Next.js-based frontend. This separation allows flexible deployment and integration, with Python/LlamaIndex powering the backend services and TypeScript/Next.js for the frontend.
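For a concrete picture of this separation, the sketch below shows how a client might query the RAG service over HTTP. The port and the /query route are assumptions for illustration only; the real endpoints are defined by the mmrag_server application.

```python
import requests

# Hypothetical endpoint and port -- the actual routes are defined by the
# mmrag_server app and are not documented in this summary.
MMRAG_SERVER = "http://localhost:8001"

def rag_query(question: str) -> dict:
    """Send a question to the RAG service and return its JSON answer payload."""
    resp = requests.post(f"{MMRAG_SERVER}/query", json={"query": question}, timeout=60)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(rag_query("What topics do my saved short videos cover?"))
```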

Quick Start & Requirements

  • Install/Run: Requires separate setup for mm_server, mmrag_server, and qmedia_web. Each service is started from its own directory, e.g., by running its Python entry point (main.py) for the backend services or pnpm dev for the web frontend; see the sketch after this list.
  • Prerequisites: Python environment with specific dependencies (e.g., qllm, qmedia conda environments), Node.js/pnpm for the web frontend. Local Ollama models (e.g., llama3:8b-instruct) are supported.
  • Resources: Local deployment of models may require significant GPU resources.
  • Docs: Installation Instructions, Usage
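As referenced above, a minimal launcher sketch for the three services. It assumes each service runs from its own directory with the commands listed (python main.py for the backends, pnpm dev for the frontend) and leaves conda and Ollama environment setup to the invoking shell.

```python
import subprocess

# Minimal local-dev launcher, assuming the repo layout described above:
# mm_server and mmrag_server expose a main.py entry point, and qmedia_web
# is started with pnpm. Environment activation (conda envs, Ollama) is
# left to the shell this script runs in.
SERVICES = [
    ("mm_server",    ["python", "main.py"]),
    ("mmrag_server", ["python", "main.py"]),
    ("qmedia_web",   ["pnpm", "dev"]),
]

procs = [subprocess.Popen(cmd, cwd=cwd) for cwd, cmd in SERVICES]

try:
    for p in procs:
        p.wait()
except KeyboardInterrupt:
    for p in procs:
        p.terminate()
```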

Highlighted Details

  • Supports multimodal RAG for text, image, and short video content.
  • Features "Content Cards" for displaying extracted information, inspired by XHS (Xiaohongshu).
  • Offers fully local deployment of models, including CLIP for image embedding, BGE for text embedding, Faster Whisper for video transcription, and LLaVA for visual understanding.
  • Allows independent use of the model services via their APIs; a hedged example follows this list.
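The sketch below illustrates calling the model service on its own for embeddings. The /embeddings/* routes, port, and response shape are placeholders, not the documented mm_server API.

```python
import requests

# Placeholder routes for the standalone model service; the real mm_server
# API paths and response schema may differ.
MM_SERVER = "http://localhost:8000"

def embed_text(text: str) -> list[float]:
    """Request a text embedding (e.g., BGE) from the model service."""
    resp = requests.post(f"{MM_SERVER}/embeddings/text", json={"text": text}, timeout=30)
    resp.raise_for_status()
    return resp.json()["embedding"]

def embed_image(path: str) -> list[float]:
    """Request an image embedding (e.g., CLIP) for a local file."""
    with open(path, "rb") as f:
        resp = requests.post(f"{MM_SERVER}/embeddings/image", files={"file": f}, timeout=60)
    resp.raise_for_status()
    return resp.json()["embedding"]
```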

Maintenance & Community

  • Active development indicated by changelog and issue tracking.
  • Community channels include Discord and WeChat groups.
  • Twitter handle: @Lafe8088.

Licensing & Compatibility

  • Licensed under the MIT License, permitting commercial use and closed-source linking.

Limitations & Caveats

The project is split into multiple services that must be set up and managed separately. While local deployment is emphasized, specific hardware requirements for running the multimodal models locally are not documented. Planned features such as viral content breakdown and similarity search are not yet available.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
20 stars in the last 90 days

Explore Similar Projects

Starred by Addy Osmani (Engineering Leader on Google Chrome), Victor Taelin (Author of Bend, Kind, HVM), and 1 more.

chatbox by chatboxai — Desktop client app for AI models/LLMs
Top 0.3% · 36k stars · created 2 years ago · updated 6 days ago