chronon by airbnb

Data platform for AI/ML applications

created 4 years ago
827 stars

Top 43.8% on sourcepulse

Project Summary

Chronon is a data platform designed to simplify and standardize data computation and serving for AI/ML applications. It enables users to define features as data transformations, supporting both batch and streaming computations, scalable backfills, low-latency serving, and robust observability. This platform is targeted at ML practitioners and engineers who need to leverage diverse data sources for model training and real-time inference without managing complex data infrastructure.

How It Works

Chronon utilizes a declarative API for defining features through GroupBy (aggregations over data sources) and Join (combining features for specific keys and timestamps) constructs. It translates these definitions into Spark jobs for scalable batch computation and backfills, ensuring point-in-time accuracy. For online serving, Chronon supports uploading computed features to a key-value store (like MongoDB) and provides APIs for low-latency fetching, enabling consistent feature retrieval for real-time model inference.
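
To make the point-in-time idea concrete, here is a minimal pure-Python sketch of a windowed aggregation joined to query rows by key and timestamp. It does not use Chronon's API: the names `Event`, `windowed_sum`, and `point_in_time_join` are hypothetical, and the half-open window `[ts - window, ts)` is an assumption chosen for illustration. The property it demonstrates is that each output row only sees events that occurred strictly before that row's timestamp, so a backfilled training row cannot leak future data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One raw source record (hypothetical schema, not Chronon's)."""
    key: str
    ts: datetime
    value: float


def windowed_sum(events, key, as_of, window):
    """Sum `value` for `key` over the half-open window [as_of - window, as_of).

    Events at or after `as_of` are excluded, which is what keeps the
    result point-in-time correct.
    """
    start = as_of - window
    return sum(e.value for e in events if e.key == key and start <= e.ts < as_of)


def point_in_time_join(queries, events, window):
    """For each (key, ts) query row, attach the aggregate as of that ts."""
    return [
        {"key": key, "ts": ts, "value_sum": windowed_sum(events, key, ts, window)}
        for key, ts in queries
    ]


if __name__ == "__main__":
    events = [
        Event("u1", datetime(2024, 1, 1), 10.0),
        Event("u1", datetime(2024, 1, 5), 20.0),
        Event("u1", datetime(2024, 1, 9), 40.0),
    ]
    queries = [("u1", datetime(2024, 1, 6)), ("u1", datetime(2024, 1, 10))]
    for row in point_in_time_join(queries, events, timedelta(days=7)):
        print(row)
```

This sketch scans every event for every query row, which is fine for illustration but quadratic in practice. A system like the one described above would run the equivalent computation as distributed batch jobs instead.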

Quick Start & Requirements

Highlighted Details

  • Supports complex transformations and windowed aggregations.
  • Guarantees and measures online/offline consistency.
  • Handles scalable, resilient, and point-in-time accurate backfills.
  • Offers managed pipelines for batch and real-time feature computation.

Maintenance & Community

  • Community support via Slack workspace.
  • Contributions welcomed via GitHub issues.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The provided quickstart does not cover running streaming jobs. The Java client example for online fetching is illustrative and not runnable within the Docker environment.

Health Check

  • Last commit: 23 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 14
  • Issues (30d): 0

Star History

  • 38 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems).

bytewax by bytewax

0.3%
2k
Python framework for stateful stream processing
created 3 years ago
updated 4 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Daniel Han (Cofounder of Unsloth), and 1 more.

airweave by airweave-ai

0.6%
3k
Semantic MCP server for AI agents
created 7 months ago
updated 2 days ago