simba by GitHamza0206

KMS for RAG systems

Created 9 months ago
1,360 stars

Top 29.6% on SourcePulse

Project Summary

Simba is an open-source, portable Knowledge Management System (KMS) designed to simplify knowledge integration with any Retrieval-Augmented Generation (RAG) system. It targets developers building AI solutions, offering a modular architecture, a user-friendly UI, and a Python SDK to streamline knowledge management tasks.

How It Works

Simba employs a modular architecture allowing flexible integration of various components like vector stores, embedding models, chunkers, and parsers. This design enables users to customize their RAG pipeline by selecting preferred tools. The system processes documents, chunks them, generates embeddings, and stores them in a vector database for efficient retrieval, facilitating seamless integration with downstream RAG applications.
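The chunk, embed, store, and retrieve flow described above can be sketched in a few lines of standard-library Python. This is an illustrative toy, not Simba's implementation or SDK: the names chunk_text, embed, and InMemoryVectorStore are hypothetical, and the "embedding" is a hashed bag-of-words vector standing in for a real embedding model.

```python
import hashlib
import math
from dataclasses import dataclass, field


def chunk_text(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedding (assumption): hash each word into one of `dim` buckets, then L2-normalize."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: keeps (chunk, vector) pairs in a list."""
    rows: list[tuple[str, list[float]]] = field(default_factory=list)

    def add(self, chunk: str) -> None:
        self.rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[str]:
        # Vectors are unit length, so the dot product equals cosine similarity.
        q = embed(query)
        ranked = sorted(
            self.rows,
            key=lambda row: sum(a * b for a, b in zip(q, row[1])),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


if __name__ == "__main__":
    document = (
        "Simba processes documents, chunks them, generates embeddings, "
        "and stores them in a vector database for efficient retrieval."
    )
    store = InMemoryVectorStore()
    for chunk in chunk_text(document):
        store.add(chunk)
    print(store.search("vector database", k=1))
```

In a real deployment each of these three pieces would be replaced by the parser, chunker, embedding model, and vector store selected in the system configuration.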

Quick Start & Requirements

  • Install: pip install simba-core, or pip install simba-client for the SDK.
  • Prerequisites: Python 3.11+, Poetry, Redis 7.0+, Node.js 20+.
  • Configuration: Requires .env for API keys (e.g., OpenAI) and config.yaml for system settings (LLM, embedding models, vector stores, chunking).
  • Running: Start the services with simba server, simba front, and simba parsers. Docker deployment options are available for CPU, NVIDIA GPU, and Apple Silicon.
  • Docs: Simba SDK documentation is available.
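Before installing, it can help to confirm that the listed prerequisites are present. The following is a minimal sketch, not part of Simba: the executable names (poetry, redis-server, node) are assumptions about a typical installation, and the script only checks that the tools are on PATH, not that Redis is 7.0+ or Node.js is 20+.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 11)
# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("poetry", "redis-server", "node")


def check_prerequisites() -> dict[str, bool]:
    """Return a mapping of prerequisite name to whether it was found."""
    results = {"python>=3.11": sys.version_info >= REQUIRED_PYTHON}
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{'ok' if found else 'MISSING':8} {name}")
```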

Highlighted Details

  • Supports multiple vector stores, embedding models, chunkers, and parsers via its modular design.
  • Offers a modern UI for managing document chunks.
  • Provides a comprehensive Python SDK for programmatic access and integration.
  • Docker deployment options are available for various hardware configurations.
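One common way to build this kind of swappable-component design is a name-to-implementation registry, so that a configuration value selects which component runs. The sketch below shows that general pattern for chunkers only. It is an assumption about how such modularity can be structured, not Simba's actual code, and every name in it (register_chunker, get_chunker, "paragraph", "sentence") is hypothetical.

```python
from typing import Callable

Chunker = Callable[[str], list[str]]

# Registry mapping a configuration name to a chunker implementation.
_CHUNKERS: dict[str, Chunker] = {}


def register_chunker(name: str) -> Callable[[Chunker], Chunker]:
    """Decorator that registers a chunker function under `name`."""
    def decorator(fn: Chunker) -> Chunker:
        _CHUNKERS[name] = fn
        return fn
    return decorator


@register_chunker("paragraph")
def by_paragraph(text: str) -> list[str]:
    # Split on blank lines and drop empty pieces.
    return [p.strip() for p in text.split("\n\n") if p.strip()]


@register_chunker("sentence")
def by_sentence(text: str) -> list[str]:
    # Naive split on periods; a real chunker would handle abbreviations, etc.
    return [s.strip() for s in text.split(".") if s.strip()]


def get_chunker(name: str) -> Chunker:
    """Look up a chunker by its configured name."""
    try:
        return _CHUNKERS[name]
    except KeyError:
        raise ValueError(
            f"unknown chunker {name!r}; available: {sorted(_CHUNKERS)}"
        ) from None


if __name__ == "__main__":
    chunker = get_chunker("sentence")  # e.g. a value read from a config file
    print(chunker("First sentence. Second sentence."))
```

The same registry shape extends to parsers, embedding models, and vector stores: each gets its own mapping, and the pipeline looks up whichever names the configuration supplies.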

Maintenance & Community

The project is maintained by GitHamza0206. Support and inquiries can be directed via GitHub issues or by contacting Hamza Zerouali.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or closed-source linking.

Limitations & Caveats

The roadmap lists authentication, web scraping, cloud integrations, and an enhanced UI as planned but not yet implemented. The project appears to be under active development.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 30 days
