llmware by llmware-ai

Framework for enterprise RAG pipelines using small, specialized models

Created 2 years ago
14,456 stars

Top 3.4% on SourcePulse

Project Summary

This library provides a unified framework for building enterprise Retrieval-Augmented Generation (RAG) pipelines around small, specialized LLMs. It targets developers and researchers who want to deploy private, cost-effective, and adaptable LLM applications for tasks such as fact-based question answering, classification, and summarization. The core benefit is rapid development of knowledge-based enterprise LLM applications without depending on large, general-purpose models.

How It Works

The framework comprises two main components: a RAG Pipeline for managing the lifecycle of connecting knowledge sources to LLMs, and a catalog of over 50 small, specialized models (SLIM, BLING, DRAGON series) fine-tuned for enterprise tasks. It supports various model formats (GGUF, HuggingFace, Sentence Transformers) and integrates with multiple database backends (SQLite, MongoDB, PostgreSQL) and vector stores (Milvus, ChromaDB, FAISS, etc.). This approach allows for flexible deployment, from local laptops to scalable clusters, and emphasizes the use of smaller models for efficiency and privacy.
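The pipeline lifecycle described above (parse, chunk, index, retrieve, prompt) can be sketched generically. The following is a toy illustration of the pattern only, not llmware's actual API; llmware wraps these steps behind its Library and Query classes, and every name below is hypothetical:

```python
# Toy RAG pipeline: chunk documents, index the chunks, retrieve by
# keyword overlap, and assemble a fact-grounded prompt for a small model.

def chunk(text, size=50):
    """Split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def build_index(docs):
    """Index every chunk of every document."""
    return [c for doc in docs for c in chunk(doc)]

def retrieve(index, query, top_k=2):
    """Rank chunks by word overlap with the query (stand-in for a vector search)."""
    q = set(query.lower().split())
    return sorted(index,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:top_k]

def build_prompt(query, passages):
    """Assemble a prompt that grounds the model in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only these passages:\n{context}\nQuestion: {query}"

docs = ["llmware targets enterprise RAG with small specialized models.",
        "Vector stores such as Milvus or FAISS hold the embeddings."]
index = build_index(docs)
hits = retrieve(index, "which vector stores hold embeddings")
print(build_prompt("which vector stores hold embeddings", hits))
```

In the real framework, the keyword-overlap ranking above would be replaced by an embedding lookup against one of the supported vector stores, and the prompt would be sent to a small model from the catalog.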

Quick Start & Requirements

  • Install: pip3 install llmware or pip3 install 'llmware[full]'
  • Prerequisites: Python 3.12+, optional Tesseract v5.3.3 and Poppler v23.10.0 for OCR. GPU support (CUDA) is beneficial but not strictly required for many models.
  • Setup: Minimal setup for basic use with SQLite and ChromaDB. More complex setups (e.g., MongoDB, Milvus) may require Docker.
  • Resources: Official examples and tutorials are available on the GitHub repository and YouTube channel.

Highlighted Details

  • Supports over 150 models, including RAG-optimized BLING, DRAGON, and industry-specific BERT models.
  • Offers a unified ModelCatalog for easy access to various model types and formats.
  • Provides a Library class for parsing, chunking, and indexing diverse document types (PDF, DOCX, WAV, JPG, etc.).
  • Features an Agents module (LLMfx) for multi-model workflows and function calling.
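The unified-catalog idea can be illustrated with a plain registry that maps model names to format-specific loaders. This is a generic sketch of the pattern under stated assumptions, not llmware's ModelCatalog implementation; all names and entries are hypothetical:

```python
# Generic model-catalog pattern: one lookup surface over heterogeneous
# model formats (GGUF, HuggingFace, Sentence Transformers, ...).

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelCard:
    name: str
    format: str          # e.g. "gguf", "huggingface", "sentence-transformers"
    loader: Callable[[], str]

class Catalog:
    def __init__(self):
        self._models: Dict[str, ModelCard] = {}

    def register(self, card: ModelCard):
        self._models[card.name] = card

    def load_model(self, name: str) -> str:
        """Look up a model by name and invoke its format-specific loader."""
        return self._models[name].loader()

catalog = Catalog()
catalog.register(ModelCard("bling-tiny", "gguf", lambda: "loaded gguf model"))
catalog.register(ModelCard("industry-bert", "huggingface", lambda: "loaded hf model"))
print(catalog.load_model("bling-tiny"))  # -> loaded gguf model
```

The point of the pattern is that callers ask for a model by name and never touch the format-specific loading logic, which is what lets a single catalog span 150+ models in different formats.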

Maintenance & Community

The project is actively maintained with frequent releases and updates. Community engagement is encouraged via GitHub discussions.

Licensing & Compatibility

The project is released under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Some users have reported compatibility issues with PyTorch 2.3 and NumPy 2.0; downgrading to known-compatible versions (PyTorch 2.1, NumPy < 2.0) is the recommended workaround. Support for specific Linux distributions may require raising an issue.
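A possible workaround for the reported version conflicts is to pin the dependencies before installing; the version pins below are taken from the reports above, so verify them against the current release notes before relying on them:

```shell
# Pin to the versions reported as compatible (PyTorch 2.1, NumPy < 2.0),
# then install llmware against them.
pip3 install "torch==2.1.*" "numpy<2.0"
pip3 install llmware
```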

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 263 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Vasek Mlejnsky (cofounder of E2B).

super-rag by superagent-ai (0%, 384 stars)
RAG pipeline for AI apps. Created 1 year ago, updated 1 year ago.
Starred by Shizhe Diao (author of LMFlow; research scientist at NVIDIA), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 2 more.

LightRAG by HKUDS (1.2%, 21k stars)
RAG framework for fast, simple retrieval-augmented generation. Created 11 months ago, updated 2 days ago.