awesome-stream-processing by risingwavelabs

Stream processing solutions for real-world challenges

Created 1 year ago
275 stars

Top 94.1% on SourcePulse

Project Summary

This repository takes a practical, demo-driven approach to demystifying stream processing for engineers and researchers. It counters the common perception that stream processing is overly complex or expensive by showing how modern tools, specifically RisingWave, can solve real-world data problems with accessible SQL. The collection provides executable examples that integrate with popular data sources such as Kafka and PostgreSQL and cover real-time analytics, ETL, and more advanced applications.

How It Works

The demos center on RisingWave, a cloud-native stream processing database that lets users transform and analyze streaming data with standard SQL. This SQL-centric approach lowers the barrier to entry compared with traditional stream processing frameworks. The demos cover fundamental stream processing patterns (continuous data ingestion, transformation, and offloading) as well as more advanced use cases such as real-time ETL pipelines and event-driven applications.
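The ingest, transform, and offload pattern described above can be illustrated with a small, self-contained Python sketch. This is a conceptual toy, not RisingWave code and not taken from the repository: the class `RunningTotals`, the function `run_pipeline`, and the event fields `user_id` and `amount` are all hypothetical names chosen for illustration. It mimics what a continuously maintained `GROUP BY` aggregate does, updating only the affected group as each event arrives instead of recomputing everything.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RunningTotals:
    """Toy stand-in for an incrementally maintained aggregate view.

    Roughly analogous to: SELECT user_id, COUNT(*), SUM(amount) ... GROUP BY user_id
    (hypothetical schema, for illustration only).
    """

    counts: dict = field(default_factory=lambda: defaultdict(int))
    sums: dict = field(default_factory=lambda: defaultdict(float))

    def ingest(self, event: dict) -> None:
        # Update only the group touched by this event; nothing is recomputed.
        key = event["user_id"]
        self.counts[key] += 1
        self.sums[key] += event["amount"]

    def snapshot(self) -> dict:
        # The current result, as a downstream consumer (a "sink") would read it.
        return {key: (self.counts[key], self.sums[key]) for key in sorted(self.counts)}


def run_pipeline(events) -> dict:
    """Ingest events one by one, filter them, and return the final view contents."""
    view = RunningTotals()
    for event in events:
        # Transformation step: drop events with a non-positive amount.
        if event["amount"] <= 0:
            continue
        view.ingest(event)
    return view.snapshot()
```

In a stream processing database the same idea is expressed declaratively in SQL, and the system keeps the result up to date; the sketch only shows why per-event updates make fresh results cheap to serve.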

Quick Start & Requirements

Setup requires installing Kafka, PostgreSQL, and RisingWave; a basic understanding of Kafka and PostgreSQL is helpful. All demos are designed to run on a standard laptop with no cluster infrastructure, and have been validated on Ubuntu and macOS.
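Before running a demo, it can help to confirm that the three services are actually listening. The following is a minimal sketch using only the Python standard library. The host and port values are assumptions based on common defaults (9092 for Kafka, 5432 for PostgreSQL, 4566 for RisingWave) and may differ from your installation; the helper names `is_listening` and `check_services` are hypothetical and not part of the repository.

```python
import socket

# Assumed default endpoints; adjust to match your local installation.
DEFAULT_SERVICES = {
    "kafka": ("localhost", 9092),
    "postgresql": ("localhost", 5432),
    "risingwave": ("localhost", 4566),
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_services(services: dict = DEFAULT_SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, (host, port) in services.items()}
```

Calling `check_services()` returns a dictionary of service names to booleans. A successful TCP connection only shows that something is listening on the port, not that the service is healthy or correctly configured.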

Highlighted Details

  • Features a Retrieval-Augmented Generation (RAG) demo, illustrating how to build LLM-powered applications with RisingWave by managing document embeddings and retrieval.
  • Provides direct performance comparisons between RisingWave and Apache Flink by running identical workloads.
  • Showcases AI agent integrations, enabling natural language querying of streaming data and automating data engineering tasks.
  • Includes extensive end-to-end streaming lakehouse demonstrations, integrating RisingWave with Apache Iceberg to build robust, real-time data platforms queried by engines like Spark and DuckDB.

Maintenance & Community

A dedicated Slack community channel is available for users to engage in discussions, seek support, and connect with other stream processing enthusiasts.

Licensing & Compatibility

The README does not state which open-source license governs this repository.

Limitations & Caveats

The README does not describe any specific limitations, alpha status, or known caveats for the demos or the underlying technology.

Health Check

Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 1
Issues (30d): 0
Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Maxime Beauchemin (Author of Apache Airflow, Superset; Founder of Preset), and 3 more.

bytewax by bytewax

0.2%
2k
Python framework for stateful stream processing
Created 3 years ago
Updated 9 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Elie Bursztein (Cybersecurity Lead at Google DeepMind).

pathway by pathwaycom

0.8%
57k
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG
Created 3 years ago
Updated 19 hours ago