simba  by GitHamza0206

KMS for RAG systems

created 7 months ago
1,334 stars

Top 30.7% on sourcepulse

Project Summary

Simba is an open-source, portable Knowledge Management System (KMS) designed to simplify knowledge integration with any Retrieval-Augmented Generation (RAG) system. It targets developers building AI solutions, offering a modular architecture, a user-friendly UI, and a Python SDK to streamline knowledge management tasks.

How It Works

Simba employs a modular architecture allowing flexible integration of various components like vector stores, embedding models, chunkers, and parsers. This design enables users to customize their RAG pipeline by selecting preferred tools. The system processes documents, chunks them, generates embeddings, and stores them in a vector database for efficient retrieval, facilitating seamless integration with downstream RAG applications.
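The flow described above (chunk, embed, store, retrieve) can be illustrated with a small standard-library sketch. This is not Simba's code or API: `chunk_text`, `embed`, `InMemoryVectorStore`, and `ingest` are hypothetical names, and the hashed bag-of-words "embedding" is only a stand-in for a real embedding model. The point is that the chunker and embedder are passed in as plain callables, which is one simple way to get the kind of swappable components the summary describes.

```python
import math
import zlib
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical illustration only; none of these names come from Simba.


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Split text into chunks of at most `size` characters, on word boundaries."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text: str, dims: int = 16) -> list[float]:
    """Toy embedding: hashed bag of words, L2-normalised (stand-in for a model)."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class InMemoryVectorStore:
    """Minimal vector store: keeps (vector, chunk) pairs in a list."""
    items: list[tuple[list[float], str]] = field(default_factory=list)

    def add(self, vector: list[float], chunk: str) -> None:
        self.items.append((vector, chunk))

    def search(self, query_vector: list[float], k: int = 1) -> list[str]:
        # Vectors are normalised, so the dot product equals cosine similarity.
        def score(item: tuple[list[float], str]) -> float:
            return sum(a * b for a, b in zip(query_vector, item[0]))

        ranked = sorted(self.items, key=score, reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def ingest(
    document: str,
    store: InMemoryVectorStore,
    chunker: Callable[[str], list[str]] = chunk_text,
    embedder: Callable[[str], list[float]] = embed,
) -> int:
    """Chunk a document, embed each chunk, and store it. Returns the chunk count."""
    chunks = chunker(document)
    for chunk in chunks:
        store.add(embedder(chunk), chunk)
    return len(chunks)


store = InMemoryVectorStore()
count = ingest(
    "Parsers extract text from files. Chunkers split text into pieces. "
    "Embeddings are stored in a vector database.",
    store,
)
hits = store.search(embed("split text into pieces"), k=1)
```

Swapping in a different chunker or embedder only requires passing another callable to `ingest`; a real deployment would replace the toy embedding and the in-memory list with the embedding model and vector store selected in the system's configuration.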

Quick Start & Requirements

  • Install: pip install simba-core, or pip install simba-client for the SDK.
  • Prerequisites: Python 3.11+, Poetry, Redis 7.0+, Node.js 20+.
  • Configuration: Requires .env for API keys (e.g., OpenAI) and config.yaml for system settings (LLM, embedding models, vector stores, chunking).
  • Running: Start services with simba server, simba front, simba parsers. Docker deployment options are available for CPU, NVIDIA GPU, and Apple Silicon.
  • Docs: Simba SDK documentation is available.
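Before installing, it can help to confirm that the listed prerequisites are present. The sketch below is an assumption-laden helper, not part of Simba: the function name `check_prerequisites` is hypothetical, and the executable names `poetry`, `redis-server`, and `node` are the usual ones but may differ on your system. It checks only the Python version and whether each tool is on PATH; it does not verify the Redis 7.0+ or Node.js 20+ version requirements.

```python
import shutil
import sys
from typing import Callable, Optional, Sequence

REQUIRED_PYTHON = (3, 11)
# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["poetry", "redis-server", "node"]


def check_prerequisites(
    python_version: Optional[Sequence[int]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if python_version is None:
        python_version = sys.version_info[:2]
    problems: list[str] = []
    if tuple(python_version[:2]) < REQUIRED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python 3.11+ required, found {found}")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


# Check the current environment and report anything missing.
for problem in check_prerequisites():
    print(problem)
```

The `python_version` and `which` parameters exist only so the check can be exercised without depending on the machine it runs on.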

Highlighted Details

  • Supports multiple vector stores, embedding models, chunkers, and parsers via its modular design.
  • Offers a modern UI for managing document chunks.
  • Provides a comprehensive Python SDK for programmatic access and integration.
  • Docker deployment options are available for various hardware configurations.

Maintenance & Community

The project is maintained by GitHamza0206. Support and inquiries can be directed via GitHub issues or by contacting Hamza Zerouali.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or closed-source linking.

Limitations & Caveats

The roadmap lists authentication, web scraping, cloud integrations, and an enhanced UI as planned but not yet implemented. The project appears to be under active development, so some features are still in progress.

Health Check

  • Last commit: 4 weeks ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

153 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Elie Bursztein (Cybersecurity Lead at Google DeepMind).

LightRAG by HKUDS

1.0%
19k
RAG framework for fast, simple retrieval-augmented generation
created 10 months ago
updated 20 hours ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Tim J. Baek (Founder of Open WebUI), and 2 more.

llmware by llmware-ai

0.2%
14k
Framework for enterprise RAG pipelines using small, specialized models
created 1 year ago
updated 1 week ago