ingestr by bruin-data

CLI for seamless data transfer across databases and platforms

Created 2 years ago
3,709 stars

Top 13.0% on SourcePulse

Project Summary

Ingestr is a CLI tool that simplifies data ingestion between diverse databases and platforms. It targets engineers and other users who need to move data without writing code: a single command abstracts backend complexity and reduces development overhead.

How It Works

Ingestr is driven entirely from the command line: the user passes source and destination URIs and table names as flags, and the tool handles the transfer logic itself. No custom scripts or backend infrastructure are needed to move a table from one system to another.
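
As a rough sketch of that interface, the wrapper below gathers the four inputs and forwards them to `ingestr ingest`. The function name `copy_table` and the `DRY_RUN` switch are hypothetical conveniences, not part of ingestr; only the four flags come from the project's own example. With `DRY_RUN=1` (the default here) the command is printed rather than executed.

```shell
# Hypothetical wrapper: not part of ingestr itself.
# Usage: copy_table SOURCE_URI SOURCE_TABLE DEST_URI DEST_TABLE
copy_table() {
    if [ "$#" -ne 4 ]; then
        echo "usage: copy_table SOURCE_URI SOURCE_TABLE DEST_URI DEST_TABLE" >&2
        return 2
    fi
    # DRY_RUN defaults to 1 so the sketch prints the command instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'ingestr ingest --source-uri %s --source-table %s --dest-uri %s --dest-table %s\n' \
            "$1" "$2" "$3" "$4"
    else
        ingestr ingest \
            --source-uri "$1" \
            --source-table "$2" \
            --dest-uri "$3" \
            --dest-table "$4"
    fi
}
```

Calling `copy_table` with fewer or more than four arguments prints a usage message and returns a non-zero status, which reflects that a transfer is fully described by those four values.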

Quick Start & Requirements

  • Installation:
    • Install script: curl -LsSf https://getbruin.com/install/ingestr | sh
    • Pip: pip install ingestr
  • Prerequisites: Requires valid credentials for source and destination systems.
  • Example:
    ingestr ingest \
        --source-uri 'postgresql://admin:admin@localhost:8837/web?sslmode=disable' \
        --source-table 'public.some_data' \
        --dest-uri 'bigquery://<your-project-name>?credentials_path=/path/to/service/account.json' \
        --dest-table 'ingestr.some_data'
    
  • Links: Full documentation and Slack community links are available.
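
The source URI in the example packs several settings into one string. The sketch below splits it into its parts with POSIX parameter expansion, purely to show which piece is which; the variable names are illustrative, and the reading that the scheme determines which connector ingestr uses is an assumption based on the two URIs shown.

```shell
# Split a connection URI into its parts using POSIX parameter expansion.
# The URI is the source example from the Quick Start; variable names are illustrative.
uri='postgresql://admin:admin@localhost:8837/web?sslmode=disable'

scheme="${uri%%://*}"        # text before "://"
rest="${uri#*://}"           # everything after the scheme
userinfo="${rest%%@*}"       # user:password
hostpart="${rest#*@}"        # host:port/database?options
hostport="${hostpart%%/*}"   # host:port
host="${hostport%%:*}"
port="${hostport#*:}"
path="${hostpart#*/}"        # database?options
database="${path%%[?]*}"     # [?] matches a literal question mark
query="${path#*[?]}"

printf 'scheme=%s user=%s host=%s port=%s database=%s query=%s\n' \
    "$scheme" "${userinfo%%:*}" "$host" "$port" "$database" "$query"
```

Because the password travels inside the URI, it may be worth reading the URI from an environment variable rather than typing it into a shared shell history.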

Highlighted Details

  • Supports a broad spectrum of sources and destinations, including major databases (PostgreSQL, BigQuery, Snowflake, MySQL) and platforms (S3, GCS, Kafka, Elasticsearch, Airtable, Salesforce).
  • Features incremental loading capabilities, supporting append, merge, or delete+insert operations for efficient data updates.
  • Emphasizes a single-command installation and execution model, abstracting away complex backend configurations.
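
The incremental modes listed above might be selected as sketched below. The flag names `--incremental-strategy`, `--incremental-key`, and `--primary-key` are recalled from ingestr's documentation and are not confirmed by this summary, so check `ingestr ingest --help` before relying on them. The `incremental_args` helper and the `updated_at` and `id` column names are hypothetical.

```shell
# Hypothetical helper: prints the extra flags for a given incremental strategy.
# ASSUMPTION: flag names are recalled from ingestr's docs; verify with `ingestr ingest --help`.
incremental_args() {
    strategy="$1"      # append | merge | delete+insert
    key="$2"           # column that orders new or changed rows, e.g. updated_at
    pk="${3:-}"        # primary key; assumed necessary for merge to match existing rows
    case "$strategy" in
        append|delete+insert)
            echo "--incremental-strategy $strategy --incremental-key $key" ;;
        merge)
            if [ -z "$pk" ]; then
                echo "merge needs a primary key" >&2
                return 1
            fi
            echo "--incremental-strategy merge --incremental-key $key --primary-key $pk" ;;
        *)
            echo "unknown strategy: $strategy" >&2
            return 1 ;;
    esac
}
```

For instance, `incremental_args merge updated_at id` prints the three flags that would be appended to the `ingestr ingest` command from the Quick Start example.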

Maintenance & Community

  • Community: A Slack community is available for users.
  • Contributing: Contributions are welcome via pull requests, but users are advised to open an issue first for discussion.

Licensing & Compatibility

  • License: Source-available under the Functional Source License 1.1, with a future transition to Apache 2.0.
  • Compatibility: Permitted for internal production use, development, testing, education, research, and professional services. Commercial use is restricted if offering a competing ingestion, ELT, connector, or managed data pipeline product/service. Each version becomes Apache 2.0 two years post-release.

Limitations & Caveats

The Functional Source License 1.1 imposes restrictions on offering competing commercial ingestion services. While free for internal use, this license may limit adoption for companies building commercial data pipeline products.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 160
  • Issues (30d): 16
  • Star history: 249 stars in the last 30 days
