llmware by llmware-ai

Framework for enterprise RAG pipelines using small, specialized models

Created 2 years ago
14,456 stars

Top 3.4% on SourcePulse

Project Summary

This library provides a unified framework for building enterprise Retrieval-Augmented Generation (RAG) pipelines around small, specialized LLMs. It targets developers and researchers who want to deploy private, cost-effective, and adaptable LLM applications for tasks such as fact-based question answering, classification, and summarization. The core benefit is rapid development of knowledge-based enterprise LLM applications without depending on large, general-purpose models.

How It Works

The framework comprises two main components: a RAG Pipeline for managing the lifecycle of connecting knowledge sources to LLMs, and a catalog of over 50 small, specialized models (SLIM, BLING, DRAGON series) fine-tuned for enterprise tasks. It supports various model formats (GGUF, HuggingFace, Sentence Transformers) and integrates with multiple database backends (SQLite, MongoDB, PostgreSQL) and vector stores (Milvus, ChromaDB, FAISS, etc.). This approach allows for flexible deployment, from local laptops to scalable clusters, and emphasizes the use of smaller models for efficiency and privacy.
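The pipeline lifecycle described above (parse, chunk, index, retrieve, prompt) can be sketched generically. The following is a toy illustration of the pattern only, not llmware's actual API; llmware wraps these steps behind its Library and Query classes, and every name below is hypothetical:

```python
# Toy RAG pipeline: chunk documents, index the chunks, retrieve by
# keyword overlap, and assemble a fact-grounded prompt for a small model.

def chunk(text, size=50):
    """Split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def build_index(docs):
    """Index every chunk of every document."""
    return [c for doc in docs for c in chunk(doc)]

def retrieve(index, query, top_k=2):
    """Rank chunks by word overlap with the query (stand-in for a vector search)."""
    q = set(query.lower().split())
    return sorted(index,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:top_k]

def build_prompt(query, passages):
    """Assemble a prompt that grounds the model in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these passages:\n{context}\nQuestion: {query}"

docs = ["llmware targets enterprise RAG with small specialized models.",
        "Vector stores such as Milvus or FAISS hold the embeddings."]
index = build_index(docs)
hits = retrieve(index, "which vector stores hold embeddings")
print(build_prompt("which vector stores hold embeddings", hits))
```

In the real framework, the keyword-overlap ranking above would be replaced by an embedding lookup against one of the supported vector stores, and the prompt would be sent to a small model from the catalog.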

Quick Start & Requirements

  • Install: pip3 install llmware or pip3 install 'llmware[full]'
  • Prerequisites: Python 3.12+, optional Tesseract v5.3.3 and Poppler v23.10.0 for OCR. GPU support (CUDA) is beneficial but not strictly required for many models.
  • Setup: Minimal setup for basic use with SQLite and ChromaDB. More complex setups (e.g., MongoDB, Milvus) may require Docker.
  • Resources: Official examples and tutorials are available on the GitHub repository and YouTube channel.

Highlighted Details

  • Supports over 150 models, including RAG-optimized BLING, DRAGON, and industry-specific BERT models.
  • Offers a unified ModelCatalog for easy access to various model types and formats.
  • Provides a Library class for parsing, chunking, and indexing diverse document types (PDF, DOCX, WAV, JPG, etc.).
  • Features an Agents module (LLMfx) for multi-model workflows and function calling.
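The unified-catalog idea can be illustrated with a plain registry that maps model names to format-specific loaders. This is a generic sketch of the pattern under stated assumptions, not llmware's ModelCatalog implementation; all names and entries are hypothetical:

```python
# Generic model-catalog pattern: one lookup surface over heterogeneous
# model formats (GGUF, HuggingFace, Sentence Transformers, ...).

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelCard:
    name: str
    format: str          # e.g. "gguf", "huggingface", "sentence-transformers"
    loader: Callable[[], str]

class Catalog:
    def __init__(self):
        self._models: Dict[str, ModelCard] = {}

    def register(self, card: ModelCard):
        self._models[card.name] = card

    def load_model(self, name: str) -> str:
        """Look up a model by name and invoke its format-specific loader."""
        return self._models[name].loader()

catalog = Catalog()
catalog.register(ModelCard("bling-tiny", "gguf", lambda: "loaded gguf model"))
catalog.register(ModelCard("industry-bert", "huggingface", lambda: "loaded hf model"))
print(catalog.load_model("bling-tiny"))  # -> loaded gguf model
```

The point of the pattern is that callers ask for a model by name and never touch the format-specific loading logic, which is what lets a single catalog span 150+ models in different formats.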

Maintenance & Community

The project is actively maintained with frequent releases and updates. Community engagement is encouraged via GitHub discussions.

Licensing & Compatibility

The project is released under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Some users have reported compatibility issues with PyTorch 2.3 and NumPy 2.0; downgrading to known-compatible versions (PyTorch 2.1, NumPy < 2.0) is the recommended workaround. Support for specific Linux distributions may require raising an issue.
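A possible workaround for the reported version conflicts is to pin the dependencies before installing; the version pins below are taken from the reports above, so verify them against the current release notes before relying on them:

```shell
# Pin to the versions reported as compatible (PyTorch 2.1, NumPy < 2.0),
# then install llmware against them.
pip3 install "torch==2.1.*" "numpy<2.0"
pip3 install llmware
```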

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 263 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Vasek Mlejnsky (cofounder of E2B).

super-rag by superagent-ai (0%, 384 stars)
RAG pipeline for AI apps. Created 1 year ago, updated 1 year ago.
Starred by Shizhe Diao (author of LMFlow; research scientist at NVIDIA), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 2 more.

LightRAG by HKUDS (1.2%, 21k stars)
RAG framework for fast, simple retrieval-augmented generation. Created 11 months ago, updated 2 days ago.