havenask by alibaba

Distributed search engine for large-scale information retrieval

Created 2 years ago
1,723 stars

Top 24.8% on SourcePulse

Project Summary

Havenask is a large-scale distributed search system developed by Alibaba Group. It aims to give businesses high-performance, low-cost, easy-to-use search services. It supports real-time search over hundreds of billions of records, handling millions of queries per second (QPS) and millions of write transactions per second (TPS), with millisecond-level query latency and data updates that become visible within seconds.

How It Works

Havenask is built on a C++ core for performance, memory efficiency, and stability, and it exposes a familiar SQL query interface. A rich plugin mechanism supports scalability and customization, and graphical development tools allow rapid algorithm iteration, which makes the system suitable for next-generation intelligent search applications.

Quick Start & Requirements

  • Install/Run: Use the provided Docker images (ha3_runtime for running, ha3_dev for development). The repository includes a create_container.sh script.
  • Prerequisites: Docker installed and running. For single-node mode, passwordless SSH to localhost is required.
  • Resource Requirements:
    • Runtime: CPU > 2 cores, Memory > 4GB, Disk > 20GB.
    • Development: CPU > 2 cores, Memory > 10GB, Disk > 50GB.
  • Documentation: Havenask Wiki
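
The resource thresholds above can be checked before pulling an image. The following is a minimal sketch, not part of Havenask: the function names (unmet_requirements, probe_host) and the profile keys are hypothetical, and the host probe assumes a Linux-like system where os.sysconf exposes page counts. Each threshold is treated as a strict lower bound, matching the ">" in the list.

```python
import os
import shutil

# Thresholds copied from the resource requirements listed above.
# Each value is a strict lower bound ("CPU > 2 cores" means 2 cores is not enough).
REQUIREMENTS = {
    "runtime": {"cpu_cores": 2, "memory_gb": 4, "disk_gb": 20},
    "development": {"cpu_cores": 2, "memory_gb": 10, "disk_gb": 50},
}


def unmet_requirements(profile, cpu_cores, memory_gb, disk_gb):
    """Return the names of resources that do not exceed the profile's thresholds."""
    observed = {"cpu_cores": cpu_cores, "memory_gb": memory_gb, "disk_gb": disk_gb}
    limits = REQUIREMENTS[profile]
    return [name for name, minimum in limits.items() if observed[name] <= minimum]


def probe_host(path="/"):
    """Best-effort measurement of the local host (hypothetical helper, Unix only)."""
    gib = 1024 ** 3
    cpu_cores = os.cpu_count() or 0
    # These sysconf names exist on Linux; other platforms may not provide them.
    memory_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / gib
    # Free space on the filesystem holding `path`, in GiB.
    disk_gb = shutil.disk_usage(path).free / gib
    return cpu_cores, memory_gb, disk_gb


if __name__ == "__main__":
    cpu, mem, disk = probe_host()
    for profile in REQUIREMENTS:
        missing = unmet_requirements(profile, cpu, mem, disk)
        status = "OK" if not missing else "insufficient: " + ", ".join(missing)
        print(f"{profile}: {status}")
```

Keeping the comparison in a pure function separate from the host probe makes the thresholds easy to adjust if the project's requirements change, and lets the check run against numbers reported by a remote machine.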

Highlighted Details

  • Supports real-time search on hundreds of billions of data records with millions of QPS/TPS.
  • C++ core for performance, memory efficiency, and stability.
  • SQL query support for a user-friendly experience.
  • Rich plugin system for scalability and customization.
  • Supports graphical development for rapid algorithm iteration.
  • Supports vector search for multimodal search scenarios.

Maintenance & Community

  • Contact via DingTalk group (details in README).

Licensing & Compatibility

  • License information is not explicitly stated in the README.

Limitations & Caveats

  • The README notes significant differences between the old and main branches and advises users to keep their code consistent with the current main branch.
  • Docker images are currently available only for the amd64 architecture.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 18 stars in the last 30 days
