nixiesearch by nixiesearch

Search engine for hybrid text+semantic retrieval, designed for ease of ops

Created 2 years ago
498 stars

Top 62.3% on SourcePulse

Project Summary

Nixiesearch is a hybrid search engine designed for consumer-facing applications, offering a modern alternative to complex Elasticsearch/OpenSearch clusters. It targets developers seeking a scalable, maintainable search solution with integrated semantic search capabilities, aiming to simplify operations and reduce infrastructure headaches.

How It Works

Nixiesearch builds on Apache Lucene for its core search features, including multi-language support, faceting, and advanced filtering. Its key design choice is a decoupled architecture: compute and storage are separated, and all index data resides on S3-compatible storage. Because search nodes hold no state of their own, backups, upgrades, schema changes, and auto-scaling do not put index data at risk. Indexing is pull-based and runs as a Spark ETL process, covering both offline and online incremental updates, so index changes do not require manipulating the cluster directly.
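The separation of stateless compute from object storage described above can be illustrated with a small sketch. This is not Nixiesearch code: `ObjectStore`, `publish_index`, `Searcher`, and the key layout are hypothetical names invented for the example, and an in-memory dict stands in for an S3-compatible bucket. It only shows the general pattern of immutable snapshots plus a single manifest pointer.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical stand-in for an S3-compatible bucket: key -> blob."""
    blobs: dict = field(default_factory=dict)

    def put(self, key, value):
        self.blobs[key] = value

    def get(self, key):
        return self.blobs[key]


def publish_index(store, index_name, version, docs):
    """Indexer side: write a new immutable snapshot, then flip the manifest."""
    snapshot_key = f"{index_name}/v{version}/docs"
    store.put(snapshot_key, tuple(docs))               # never modified afterwards
    store.put(f"{index_name}/manifest", snapshot_key)  # the only mutable key


class Searcher:
    """Stateless search node: everything it serves is re-read from the store."""

    def __init__(self, store, index_name):
        self.store = store
        self.index_name = index_name
        self.docs = ()
        self.sync()

    def sync(self):
        # Follow the manifest pointer to whichever snapshot is current.
        snapshot_key = self.store.get(f"{self.index_name}/manifest")
        self.docs = self.store.get(snapshot_key)

    def search(self, term):
        # Toy substring match; a real engine would query a Lucene index here.
        return [d for d in self.docs if term.lower() in d.lower()]


store = ObjectStore()
publish_index(store, "movies", 1, ["The Matrix", "Alien"])
node_a = Searcher(store, "movies")

# A new snapshot is published without touching any running node.
publish_index(store, "movies", 2, ["The Matrix", "Alien", "The Matrix Reloaded"])
node_b = Searcher(store, "movies")  # a freshly scaled-out node starts on v2
```

In this sketch, `node_a` keeps serving version 1 until it calls `sync()`, while `node_b` starts directly on version 2. The version 1 snapshot stays in the store untouched, so rolling back amounts to pointing the manifest at the older key again.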

Quick Start & Requirements

  • Install/Run: docker run -itp 8080:8080 -v .:/data nixiesearch/nixiesearch:latest standalone -c /data/config.yml
  • Prerequisites: Docker; S3-compatible storage; ONNX Runtime for local embedding inference (SaaS embedding providers are also supported).
  • Setup: Minimal; most of the work is writing the configuration file.
  • Links: Quickstart Guide, Live Demo

Highlighted Details

  • Hybrid search (text + semantic) with integrated embedding inference and RAG API support.
  • Decoupled S3-based storage and stateless compute for simplified operations and scalability.
  • Apache Lucene foundation provides robust features like multi-language support, faceting, and suggestions.
  • Immutable index configurations facilitate easy blue-green deployments.
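To make "hybrid search" concrete, here is a minimal sketch of one common way to merge a lexical ranking and a semantic ranking into a single result list: reciprocal rank fusion (RRF). The summary above does not say which fusion method Nixiesearch uses, so treat this as a generic illustration; the function name, the document IDs, and the constant `k=60` (a conventional default for RRF) are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Ties are broken by document ID.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two retrievers.
lexical = ["doc1", "doc2", "doc3"]   # e.g. keyword (BM25-style) matches
semantic = ["doc3", "doc1", "doc4"]  # e.g. embedding nearest neighbours

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)  # ['doc1', 'doc3', 'doc2', 'doc4']
```

Documents that appear in both lists (`doc1`, `doc3`) rise above those found by only one retriever. Because RRF works on ranks rather than raw scores, it does not require the lexical and semantic scores to be on comparable scales.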

Maintenance & Community

The project is open source and inspired by search solutions from Uber, Amazon, and DoorDash. The README does not provide community links or a roadmap.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

Nixiesearch is explicitly not a database or a log search tool, focusing solely on relevance-based search for consumer applications. It is not designed for analytical workloads.

Health Check

  • Last Commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 27
  • Issues (30d): 2

Star History

  • 7 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Simon Willison (Coauthor of Django).

semantra by freedmand

0.1%
3k
CLI tool for semantic document search
Created 2 years ago
Updated 1 year ago
Starred by Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research) and Andre Zayarni (Cofounder of Qdrant).

kernel-memory by microsoft

0.2%
2k
RAG architecture for indexing and querying data using LLMs
Created 2 years ago
Updated 1 day ago
Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Simon Horup Eskildsen (Cofounder of Turbopuffer), and 21 more.

meilisearch by meilisearch

0.2%
53k
Search engine API for integrating AI-powered hybrid search
Created 7 years ago
Updated 1 day ago