Verba by weaviate

RAG chatbot for querying data, locally or via cloud deployment

Created 2 years ago

7,504 stars

Top 6.8% on SourcePulse

10 Experts Love This Project

Tobi Lutke (Cofounder of Shopify), Dan Guido (Cofounder of Trail of Bits), and 6 more

Project Summary

Verba is an open-source Retrieval-Augmented Generation (RAG) chatbot that lets users query and gain insights from their own datasets. It offers an end-to-end RAG experience and runs either locally with Ollama or Hugging Face models, or against cloud-based LLM providers such as OpenAI, Anthropic, and Cohere.

How It Works

Verba integrates Weaviate's vector database with various RAG frameworks, data ingestion tools, and LLM providers. It supports flexible data chunking (token, sentence, semantic, recursive) and retrieval methods, allowing users to customize their RAG pipeline for specific use cases. The architecture emphasizes modularity, enabling easy swapping of embedding models, LLMs, and data loaders.
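To make the chunking options concrete, here is a minimal sketch of two of the strategies named above, token-based and sentence-based chunking. It is an illustration only, not Verba's implementation: the function names `chunk_by_tokens` and `chunk_by_sentences`, the whitespace tokenizer, and the regex sentence splitter are all assumptions made for this example.

```python
import re


def chunk_by_tokens(text: str, size: int = 5, overlap: int = 1) -> list[str]:
    """Split text into windows of `size` whitespace tokens.

    Consecutive windows share `overlap` tokens. Whitespace splitting is a
    stand-in for a real tokenizer (assumption of this sketch).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        # Stop once the current window reaches the end of the text.
        if start + size >= len(tokens):
            break
    return chunks


def chunk_by_sentences(text: str, size: int = 2) -> list[str]:
    """Group sentences into chunks of `size` sentences each.

    Sentences are split on whitespace that follows '.', '!' or '?', which is
    a simplification (it does not handle abbreviations).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


if __name__ == "__main__":
    print(chunk_by_tokens("a b c d e f g", size=3, overlap=1))
    print(chunk_by_sentences("One. Two! Three? Four.", size=2))
```

Overlapping token windows keep some shared context between neighbouring chunks, while sentence chunks avoid cutting a sentence in half; which trade-off suits a dataset is exactly the kind of choice a configurable pipeline leaves to the user.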

Quick Start & Requirements

Install: pip install goldenverba
Prerequisites: Python >=3.10.0,<3.13.0. Optional API keys for various LLM and data services (OpenAI, Anthropic, Cohere, Groq, Novita AI, Unstructured, AssemblyAI, etc.). Ollama requires separate installation and model download.
Docker: docker compose up -d --build (run from a clone of the repository)
Docs: https://github.com/weaviate/Verba
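Because the supported Python range is bounded on both sides, it can be worth checking the interpreter before installing. The following is a small sketch of such a check; the helper name `is_supported` and the constants are hypothetical and are not part of Verba.

```python
import sys

# Bounds taken from the stated prerequisite: Python >=3.10.0,<3.13.0
MIN_VERSION = (3, 10, 0)  # inclusive lower bound
MAX_VERSION = (3, 13, 0)  # exclusive upper bound


def is_supported(version: tuple[int, int, int]) -> bool:
    """Return True if `version` falls inside the supported range."""
    return MIN_VERSION <= version < MAX_VERSION


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    status = "supported" if is_supported(current) else "outside the supported range"
    print(f"Python {'.'.join(map(str, current))} is {status}")
```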

Highlighted Details

Supports a wide array of LLM providers (Ollama, HuggingFace, Cohere, Anthropic, OpenAI, Groq, Novita AI, Upstage) and embedding models (Weaviate, Ollama, SentenceTransformers, Cohere, VoyageAI, OpenAI, Upstage).
Integrates with data ingestion tools like UnstructuredIO, Firecrawl, and AssemblyAI for various file types and multi-modal data.
Features hybrid search, autocomplete suggestions, filtering, customizable metadata, and asynchronous ingestion.
Offers Docker deployment and local Weaviate Embedded (experimental, not on Windows).
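Hybrid search combines keyword-based and vector-based relevance into a single ranking. The sketch below shows one generic way to do that: normalize each score set to the range 0 to 1, then take a weighted sum. It is not Weaviate's or Verba's code; the function names, the min-max normalization, and the convention that `alpha=1.0` means vector-only are assumptions of this example.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize scores to [0, 1]; equal scores all map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    keyword_scores: dict[str, float],
    vector_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Fuse two score sets; alpha weights the vector side, 1 - alpha the keyword side."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    kw = normalize(keyword_scores)
    vec = normalize(vector_scores)
    # A document missing from one result set contributes 0 from that side.
    fused = {
        doc: alpha * vec.get(doc, 0.0) + (1.0 - alpha) * kw.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    keyword = {"a": 2.0, "b": 1.0, "c": 0.0}
    vector = {"a": 0.1, "b": 0.9, "d": 0.5}
    for doc, score in hybrid_rank(keyword, vector, alpha=0.5):
        print(doc, round(score, 3))
```

Normalizing before fusing matters because keyword scores and vector similarities live on different scales; without it, one side would dominate regardless of the weight.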

Maintenance & Community

Verba is a community-driven project. Weaviate supports it, but it may not be maintained with the same urgency as a production product. Contributions are welcome via GitHub issues and discussions.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Weaviate Embedded is experimental and not supported on Windows. Verba is designed for single-user use only, and there are currently no plans for multi-user or role-based access. It does not expose external API endpoints for other applications to interact with.

Health Check

Last Commit: 6 months ago
Responsiveness: 1 week

Star History

49 stars in the last 30 days