seatunnel by apache

High-performance multimodal data integration

created 8 years ago
8,722 stars

Top 5.8% on SourcePulse

Project Summary

Apache SeaTunnel is a multimodal, high-performance, distributed data integration platform designed for synchronizing vast amounts of data daily. It targets data engineers and developers dealing with diverse data sources and complex synchronization scenarios, offering efficient resource utilization and robust data quality monitoring.

How It Works

SeaTunnel employs a distributed snapshot algorithm for data consistency and supports multiple execution engines including its native Zeta Engine, Apache Spark, and Apache Flink. It features JDBC multiplexing and log parsing for efficient multi-table and database synchronization, enabling high throughput and low latency. The platform supports batch-stream integration and offers over 100 connectors for various data sources, sinks, and transformations.
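
As a rough illustration of the snapshot idea, the Python sketch below shows a toy, single-process form of barrier-based checkpointing: barriers are injected into the record stream, the operator saves its state whenever it sees one, and a restarted job resumes from the last saved state. This is not SeaTunnel's implementation or API. The names `Snapshot`, `stream`, `run`, and `fail_at` are hypothetical, and a real distributed snapshot also has to align barriers across parallel channels, which is omitted here.

```python
from dataclasses import dataclass

BARRIER = object()  # marker injected into the stream between records


@dataclass(frozen=True)
class Snapshot:
    offset: int  # number of source records consumed when the barrier arrived
    total: int   # operator state (a running sum) at that point


def stream(records, start, every):
    """Yield records from index `start`, adding a barrier after every `every` records."""
    for i in range(start, len(records)):
        yield records[i]
        if (i + 1) % every == 0:
            yield BARRIER


def run(records, every, store, fail_at=None):
    """Sum the records, saving a Snapshot to `store` at each barrier.

    Resumes from the latest snapshot in `store` if one exists.
    Raises RuntimeError just before consuming record index `fail_at`
    to simulate a crash (hypothetical test hook).
    """
    last = store[-1] if store else Snapshot(0, 0)
    offset, total = last.offset, last.total
    for item in stream(records, offset, every):
        if item is BARRIER:
            # State and source position are captured together, so a restart
            # neither skips nor double-counts records.
            store.append(Snapshot(offset, total))
            continue
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated crash")
        total += item
        offset += 1
    return total
```

Calling `run` once with `fail_at` set and then again with the same `store` gives the same total as an uninterrupted run, because the second call replays only the records consumed after the last snapshot.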

Quick Start & Requirements

  • Download SeaTunnel from the Official Website.
  • Select an execution engine (Zeta Engine, Spark, or Flink).
  • Refer to Installation Guide for detailed setup.

Highlighted Details

  • Supports integration of video, images, and binary files alongside structured and unstructured text data.
  • Offers over 100 connectors and is actively expanding its ecosystem.
  • Provides two job development methods: coding and visual management via the SeaTunnel Web Project.
  • Used by companies like Weibo, Tencent Cloud, and Sina.

Maintenance & Community

  • Active community, with a Slack channel available: SeaTunnel Slack.
  • Contributions are welcome via the GitHub repository.
  • Contact via mailing list: dev@seatunnel.apache.org.

Licensing & Compatibility

  • Licensed under the Apache 2.0 License, permitting commercial use.

Limitations & Caveats

  • Although the platform supports multimodal data, detailed instructions for video, image, and binary file integration are kept in separate documentation.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 74
  • Issues (30d): 62

Star History

  • 94 stars in the last 30 days