vespa  by vespa-engine

Platform for AI + data, online serving at scale

Created 9 years ago
6,349 stars

Top 8.1% on SourcePulse

Project Summary

Vespa is an open-source AI + Data platform for organizing, searching, and running inference over diverse data types (vectors, tensors, text, structured data) at scale. It targets developers building applications that need low-latency responses (under 100 ms) for complex operations such as search, recommendation, and personalization, even when the data changes continuously.

How It Works

Vespa addresses the challenge of running complex data operations and model inference over large, continuously changing datasets within strict latency requirements. It does this with a distributed, high-availability architecture that processes and serves data from multiple nodes in parallel. The platform is optimized for real-time data ingestion and querying, which makes it suitable for AI-driven applications that must reflect fresh data.
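The idea of serving a query from multiple nodes in parallel can be illustrated with a scatter-gather sketch. This is a simplified illustration, not Vespa's actual implementation: the in-memory `nodes` list, the `search_node` and `scatter_gather` helpers, and the scoring function are all hypothetical names chosen for the example. Each node returns its local top-k hits, and the partial results are merged into a global top-k.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_node(docs, score, k):
    """Return the k best (score, doc_id) pairs held by one node."""
    return heapq.nlargest(k, ((score(doc), doc_id) for doc_id, doc in docs.items()))


def scatter_gather(nodes, score, k):
    """Query every node in parallel, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda docs: search_node(docs, score, k), nodes)
        # Merging inside the 'with' block consumes the results while the pool is alive.
        return heapq.nlargest(k, (hit for part in partials for hit in part))


# Hypothetical data: each dict stands in for the documents stored on one node,
# and each value stands in for a precomputed relevance signal.
nodes = [
    {"a": 0.2, "b": 0.9},
    {"c": 0.5, "d": 0.7},
    {"e": 0.1},
]
top = scatter_gather(nodes, score=lambda value: value, k=2)
print(top)
```

Because every node only has to return k candidates, the merge step stays cheap regardless of how many documents each node holds.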

Quick Start & Requirements

  • Install: Deploy to Vespa Cloud (cloud.vespa.ai) or run a self-hosted instance (docs.vespa.ai/en/getting-started.html).
  • Development Environment: AlmaLinux 8 for C++/Java builds. Java 17 and Maven 3.8+ required for Java modules. Docker is recommended for a full development setup.
  • Resources: Building Vespa requires a complete development environment setup.
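Once an instance is running, queries are sent over HTTP. The sketch below only builds a request URL for Vespa's query API and does not contact a server. The endpoint `http://localhost:8080`, the YQL string, the parameter values, and the `build_query_url` helper are assumptions made for illustration and are not taken from this page; check the Vespa documentation for the parameters your application needs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(endpoint, user_query, hits=10, timeout="100ms"):
    """Build a GET URL for the /search/ endpoint (hypothetical helper)."""
    params = {
        # Assumed YQL: match the user's query text against all sources.
        "yql": "select * from sources * where userQuery()",
        "query": user_query,
        "hits": hits,
        "timeout": timeout,
    }
    return f"{endpoint.rstrip('/')}/search/?{urlencode(params)}"


# Assumed local endpoint; replace with your own deployment's address.
url = build_query_url("http://localhost:8080", "vector search")
print(url)

# Round-trip check that the parameters were encoded as intended.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"], decoded["hits"])
```

The resulting URL can be passed to any HTTP client; keeping URL construction separate from the network call makes the request easy to inspect and test.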

Highlighted Details

  • Handles vectors, tensors, text, and structured data.
  • Designed for sub-100ms query latency.
  • Supports continuous data updates.
  • Used in large-scale internet services serving high query volumes.

Maintenance & Community

  • Daily releases (Mon-Thu CET) from the master branch.
  • Contribution guidelines available in CONTRIBUTING.md.
  • Documentation repository: github.com/vespa-engine/documentation.

Licensing & Compatibility

  • Licensed under the Apache 2.0 license.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

Building Vespa from source requires a specific Linux environment (AlmaLinux 8) for the C++/Java builds, or a careful setup of Java 17 and Maven 3.8+ on other platforms for the Java modules. Cloud deployment is available as an alternative; self-hosted builds from source must follow the build environment specifications.

Health Check

  • Last commit: 13 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 226
  • Issues (30d): 8

Star History

  • 61 stars in the last 30 days
