moss  by usemoss

Real-time semantic search runtime for AI agents

Created 6 months ago
294 stars

Top 90.0% on SourcePulse

Project Summary

Moss is a real-time semantic search runtime built for conversational AI agents. It targets the retrieval latency that traditional vector databases add to each query. With Moss, AI systems such as voice bots and copilots can get query responses in under 10 milliseconds, which is crucial for natural, real-time interaction. The primary benefit is that retrieval latency stops being noticeable to the user, which allows for more fluid and responsive AI applications.
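A sub-10 ms target is a statement about tail latency, so it is usually checked against a high percentile rather than an average. The following is a minimal sketch of such a check, not part of Moss: the helper names (percentile, measure_latencies_ms, within_budget) are hypothetical, and run_query stands in for whatever call issues a search in your application.

```python
import math
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def measure_latencies_ms(run_query: Callable[[], object], runs: int = 200) -> List[float]:
    """Call run_query repeatedly and return wall-clock latencies in milliseconds.

    run_query is a placeholder for the search call being measured.
    """
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def within_budget(latencies_ms: List[float], budget_ms: float = 10.0, pct: float = 99.0) -> bool:
    """True when the chosen percentile of the samples is below the budget."""
    return percentile(latencies_ms, pct) < budget_ms
```

Measuring from the caller's side captures end-to-end latency, including the network round trip, which is the figure a user of a voice bot or copilot actually experiences.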

How It Works

Moss operates as a unified search runtime that handles both embedding and search. Unlike conventional vector databases built for batch analytics, Moss is optimized for low-latency, real-time querying. Its architecture features a thin SDK client (available in Python and TypeScript) that communicates over HTTPS with the Moss runtime. The runtime manages embedding models (including built-in options), indexing, and a high-speed search engine, so users do not have to manage infrastructure or tune parameters such as HNSW configurations or sharding.

Quick Start & Requirements

Install the SDK with a package manager: pip install inferedge-moss for Python or npm install @inferedge/moss for TypeScript. Users need project credentials from moss.dev, where a free tier is available. Official documentation and a Discord community are available for support.
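Before wiring the SDK into an application, it can help to confirm that the package is installed and that credentials are present. The sketch below uses only the Python standard library and does not call the Moss SDK itself. The distribution name comes from the pip command above; the environment variable names MOSS_PROJECT_ID and MOSS_PROJECT_KEY are assumptions made for illustration, so check the official documentation for the names the SDK actually expects.

```python
import os
from importlib import metadata
from typing import List, Mapping, Optional

# Distribution name taken from the pip install command.
PACKAGE_NAME = "inferedge-moss"

# Hypothetical variable names, chosen for this example only.
CREDENTIAL_VARS = ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY")


def installed_version(package: str = PACKAGE_NAME) -> Optional[str]:
    """Return the installed version of the package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the credential variables that are unset or empty in env."""
    return [name for name in CREDENTIAL_VARS if not env.get(name)]


if __name__ == "__main__":
    version = installed_version()
    print("package:", version if version else "not installed")
    missing = missing_credentials()
    print("credentials:", "ok" if not missing else "missing " + ", ".join(missing))
```

Reading credentials from the environment keeps them out of source control, whichever variable names the SDK ends up using.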

Highlighted Details

  • Achieves sub-10 ms semantic search with a P99 latency of 8 ms.
  • Includes built-in embedding models, eliminating the need for external API keys, though custom models can be integrated.
  • Supports metadata filtering using operators like $eq, $and, $in, and $near.
  • Offers extensive framework integrations, including LangChain, DSPy, Pipecat, and LiveKit, with examples provided for various use cases.
  • The project's benchmarks report significantly lower end-to-end query latency than popular vector databases such as Pinecone, Qdrant, and ChromaDB.
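To make the metadata filtering item above concrete, here is a small sketch of how operators such as $eq, $in, and $and are commonly evaluated against a document's metadata. It is an illustration only, not Moss's implementation: the filter shape (field name mapped to an operator dictionary, with $and taking a list of sub-filters) is an assumption modeled on the convention many document stores use, and $near is left out because its semantics are not described here.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel for fields absent from the metadata


def _field_matches(value: Any, condition: Dict[str, Any]) -> bool:
    """Check one field value against a dictionary of operators."""
    for op, operand in condition.items():
        if op == "$eq":
            if value is _MISSING or value != operand:
                return False
        elif op == "$in":
            if value is _MISSING or value not in operand:
                return False
        else:
            # $near and any other operator are outside this sketch.
            raise NotImplementedError(f"operator {op!r} is not covered here")
    return True


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Return True when metadata satisfies every clause in flt."""
    for key, condition in flt.items():
        if key == "$and":
            # Every sub-filter in the list must match.
            if not all(matches(metadata, clause) for clause in condition):
                return False
        elif key.startswith("$"):
            raise NotImplementedError(f"operator {key!r} is not covered here")
        elif not _field_matches(metadata.get(key, _MISSING), condition):
            return False
    return True


if __name__ == "__main__":
    doc = {"lang": "en", "topic": "billing"}
    flt = {"$and": [{"lang": {"$eq": "en"}},
                    {"topic": {"$in": ["billing", "refunds"]}}]}
    print(matches(doc, flt))
```

In a search runtime, a filter like this narrows the candidate set by metadata while the semantic ranking is done on embeddings. How Moss combines the two is not specified here.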

Maintenance & Community

The project maintains an active community on Discord (https://moss.link/discord) and encourages contributions of new SDK bindings and framework integrations. It is backed by Y Combinator.

Licensing & Compatibility

Moss is released under the permissive BSD 2-Clause License. The license allows commercial use and integration into closed-source applications, provided that redistributions retain the copyright notice and disclaimer.

Limitations & Caveats

While many integrations are available, some, such as Vercel AI SDK and CrewAI, are marked as "Coming soon." As a specialized "search runtime," its operational model differs from traditional databases, potentially requiring architectural adjustments for users accustomed to full database management features.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 86
  • Issues (30d): 18
  • Star History: 276 stars in the last 30 days
