pipes by joboccara

C++ header-only library for expressive collection processing via pipelines

Created 8 years ago
830 stars

Top 42.8% on SourcePulse

Project Summary

pipes is a header-only C++14 library for building expressive data-processing pipelines over collections. It offers a "push-based" alternative to C++20 ranges and expresses complex data transformations and routing in a fluent, pipe-like syntax.

How It Works

The library uses a "push-based" model where data flows sequentially through a chain of components called "pipes." Each pipe receives data, performs an operation (e.g., filtering, transforming), and passes the result to the next pipe in the pipeline. This design allows for flexible data routing, including branching (fork), combining several inputs (mux), and conditional processing (switch). It contrasts with the "pull-based" model of C++20 ranges, where the consumer requests each element from the range.
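The push-based flow described above can be sketched in plain C++14 without the library. This is only an illustration of the model: the helper names (make_filter, make_transform, make_push_back, push_all, even_times_ten) are hypothetical and are not part of the pipes API, and the sketch is restricted to int values for brevity.

```cpp
#include <vector>

// Hypothetical helpers for illustration only; these are NOT the pipes API.
// Each "pipe" is a callable that receives one value and pushes its result
// to the next stage.

// Forwards a value downstream only when the predicate accepts it.
template <typename Predicate, typename Next>
auto make_filter(Predicate predicate, Next next) {
    return [predicate, next](int value) mutable {
        if (predicate(value)) next(value);
    };
}

// Applies a function and pushes the result downstream.
template <typename Function, typename Next>
auto make_transform(Function function, Next next) {
    return [function, next](int value) mutable { next(function(value)); };
}

// Terminal stage: appends every received value to a vector.
inline auto make_push_back(std::vector<int>& output) {
    return [&output](int value) { output.push_back(value); };
}

// The source drives the flow by pushing each element into the first stage.
template <typename Pipeline>
void push_all(const std::vector<int>& source, Pipeline pipeline) {
    for (int value : source) pipeline(value);
}

// Keeps the even numbers and multiplies them by ten.
inline std::vector<int> even_times_ten(const std::vector<int>& input) {
    std::vector<int> results;
    auto pipeline = make_filter(
        [](int value) { return value % 2 == 0; },
        make_transform([](int value) { return value * 10; },
                       make_push_back(results)));
    push_all(input, pipeline);
    return results;
}
```

Calling even_times_ten({1, 2, 3, 4, 5, 6}) pushes each element through the filter and then the transform. The source decides when data moves, which is the opposite of a pull-based range, where the final consumer asks for the next element.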

Quick Start & Requirements

  • Install: Header-only, no installation required. Include <pipes/pipes.hpp>.
  • Requirements: C++14 compliant compiler.
  • Demo: Fluent C++ article and examples in the repository.

Highlighted Details

  • Advanced Routing: Supports complex data flow patterns like fork (broadcast to multiple pipes), mux (feed several collections into one pipeline together, element by element), cartesian_product, and unzip.
  • STL Integration: Pipes can be used as output iterators for standard algorithms (e.g., std::copy, std::set_difference).
  • Stream Processing: Includes pipes for reading from and writing to streams (read_in_stream, to_out_stream).
  • Custom Aggregation: map_aggregator and set_aggregator allow custom merging logic for elements with duplicate keys or values.
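The STL integration point above relies on a general technique: an output iterator whose assignment forwards the value to a function instead of writing into a container. The sketch below shows that technique using only the standard library. The names int_sink_iterator and doubled_difference are hypothetical, and this is not how pipes itself is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical adapter for illustration; NOT the pipes implementation.
// Assigning an int through the iterator calls the stored callback.
class int_sink_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit int_sink_iterator(std::function<void(int)> sink)
        : sink_(std::move(sink)) {}

    // Dereference and increment return the iterator itself, so the usual
    // `*it = value; ++it;` sequence ends up in operator=(int) below.
    int_sink_iterator& operator*() { return *this; }
    int_sink_iterator& operator++() { return *this; }
    int_sink_iterator operator++(int) { return *this; }

    int_sink_iterator& operator=(int value) {
        sink_(value);
        return *this;
    }

private:
    std::function<void(int)> sink_;  // copyable, so the iterator stays assignable
};

// Sends every element of sorted `left` that is absent from sorted `right`
// to a callback that stores its double.
inline std::vector<int> doubled_difference(const std::vector<int>& left,
                                           const std::vector<int>& right) {
    std::vector<int> results;
    int_sink_iterator sink([&results](int value) { results.push_back(value * 2); });
    std::set_difference(left.begin(), left.end(), right.begin(), right.end(), sink);
    return results;
}
```

Storing the callback in std::function keeps the iterator copy-assignable, which some standard library implementations may require of output iterators. A lambda stored directly as a member would not be assignable in C++14.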

Maintenance & Community

  • Developed by joboccara.
  • Contributions are welcome; issues can be logged for enhancements or bugs.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is described as under active development and subject to change, although the health check below reports that the last commit was a year ago. It offers advanced routing but lacks some features of C++20 ranges, such as infinite ranges.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
