pipes by joboccara

C++ header-only library for expressive collection processing via pipelines

Created 8 years ago
829 stars

Top 42.8% on SourcePulse

Project Summary

pipes is a header-only C++14 library for building expressive data processing pipelines over collections. It offers a "push-based" alternative to C++20 ranges, expressing complex data transformations and routing with a fluent, pipe-like syntax.

How It Works

The library uses a "push-based" model in which data flows sequentially through a chain of components called "pipes." Each pipe receives a value, performs an operation on it (e.g., filtering or transforming), and passes the result to the next pipe in the pipeline. Because each pipe decides where to send its output, the design supports flexible routing: branching (fork), combining several collections (mux), and conditional processing (switch). This contrasts with the "pull-based" model of C++20 ranges, where the consumer requests each value from the stage before it.
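The push model described above can be illustrated without the library. The following is a minimal sketch in plain C++14; every name in it (PushBack, Filter, Transform, Fork, push_all, run_demo) is hypothetical and chosen for illustration, and none of it is the library's actual API or implementation. Each stage exposes a receive function and forwards values to the next stage, and the source drives the whole chain.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sketch of a push-based pipeline. These names are illustrative
// only and are NOT the pipes library's API.

// Terminal stage: appends every value it receives to a vector.
template <typename T>
struct PushBack {
    std::vector<T>& out;
    void receive(const T& value) { out.push_back(value); }
};

// Forwards only the values that satisfy the predicate.
template <typename Pred, typename Next>
struct Filter {
    Pred pred;
    Next next;
    template <typename T>
    void receive(const T& value) {
        if (pred(value)) next.receive(value);
    }
};

// Applies a function to each value and pushes the result downstream.
template <typename Fn, typename Next>
struct Transform {
    Fn fn;
    Next next;
    template <typename T>
    void receive(const T& value) { next.receive(fn(value)); }
};

// Broadcasts each value to two downstream stages.
template <typename A, typename B>
struct Fork {
    A first;
    B second;
    template <typename T>
    void receive(const T& value) {
        first.receive(value);
        second.receive(value);
    }
};

template <typename Pred, typename Next>
Filter<Pred, Next> make_filter(Pred pred, Next next) {
    return {std::move(pred), std::move(next)};
}

template <typename Fn, typename Next>
Transform<Fn, Next> make_transform(Fn fn, Next next) {
    return {std::move(fn), std::move(next)};
}

template <typename A, typename B>
Fork<A, B> make_fork(A first, B second) {
    return {std::move(first), std::move(second)};
}

// The source drives the pipeline: it pushes each element into the first stage.
template <typename Range, typename Stage>
void push_all(const Range& range, Stage stage) {
    for (const auto& value : range) stage.receive(value);
}

struct DemoResult {
    std::vector<int> doubled_evens;
    std::vector<int> all;
};

// Sends every input to two branches: one keeps even numbers and doubles them,
// the other collects the input unchanged.
inline DemoResult run_demo() {
    DemoResult r;
    const std::vector<int> input{1, 2, 3, 4, 5, 6};
    auto evens_doubled = make_filter(
        [](int x) { return x % 2 == 0; },
        make_transform([](int x) { return x * 2; },
                       PushBack<int>{r.doubled_evens}));
    push_all(input, make_fork(evens_doubled, PushBack<int>{r.all}));
    assert(r.all.size() == input.size());  // the unfiltered branch saw every value
    return r;
}
```

In this sketch, run_demo leaves 4, 8, 12 in doubled_evens and the six original values in all. Branching is natural here because a stage can call receive on any number of downstream stages, which is harder to express when the consumer pulls values from a single upstream source.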

Quick Start & Requirements

  • Install: Header-only, no installation required. Include <pipes/pipes.hpp>.
  • Requirements: C++14 compliant compiler.
  • Demo: Fluent C++ article and examples in the repository.

Highlighted Details

  • Advanced Routing: Supports complex data flow patterns like fork (broadcast each value to multiple pipes), mux (process multiple collections together, element by element), cartesian_product, and unzip.
  • STL Integration: Pipes can be used as output iterators for standard algorithms (e.g., std::copy, std::set_difference).
  • Stream Processing: Includes pipes for reading from and writing to streams (read_in_stream, to_out_stream).
  • Custom Aggregation: map_aggregator and set_aggregator allow custom merging logic for elements with duplicate keys or values.
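To make the STL integration point concrete, here is a minimal sketch of how a pipeline stage can be presented to a standard algorithm as an output iterator. The class TransformOutput, the helper make_transform_output, and the functor TimesTen are hypothetical names introduced for this example; they are not taken from the library, whose own pipes may be implemented differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical sketch, NOT the pipes library's implementation: an output
// iterator that applies a function to each assigned value before forwarding
// the result to an underlying output iterator.
template <typename Fn, typename OutIt>
class TransformOutput {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    TransformOutput(Fn fn, OutIt out) : fn_(std::move(fn)), out_(out) {}

    // Dereference and increment are no-ops; the work happens on assignment.
    TransformOutput& operator*() { return *this; }
    TransformOutput& operator++() { return *this; }
    TransformOutput& operator++(int) { return *this; }

    // Called by the algorithm for every element it writes.
    template <typename T>
    TransformOutput& operator=(const T& value) {
        *out_ = fn_(value);
        ++out_;
        return *this;
    }

private:
    Fn fn_;
    OutIt out_;
};

template <typename Fn, typename OutIt>
TransformOutput<Fn, OutIt> make_transform_output(Fn fn, OutIt out) {
    return TransformOutput<Fn, OutIt>(std::move(fn), out);
}

// A named functor is used instead of a lambda because closure types are not
// copy-assignable before C++20, and iterators are expected to be assignable.
struct TimesTen {
    int operator()(int x) const { return x * 10; }
};

// Computes the elements of a that are not in b, multiplied by ten.
inline std::vector<int> difference_times_ten() {
    const std::vector<int> a{1, 2, 3, 4, 5};
    const std::vector<int> b{2, 4};
    assert(std::is_sorted(a.begin(), a.end()));  // set_difference needs sorted input
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> results;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        make_transform_output(TimesTen{}, std::back_inserter(results)));
    return results;
}
```

Here std::set_difference writes 1, 3, and 5 through the iterator, so the function returns 10, 30, 50. The algorithm needs no knowledge of the transformation; it only sees something it can assign to and increment, which is what lets push-style stages plug into existing STL code.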

Maintenance & Community

  • Developed by joboccara.
  • Contributions are welcome; issues can be logged for enhancements or bugs.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The library is described as under development and subject to change, although the last commit was 10 months ago. While it offers advanced routing, it does not support some features of C++20 ranges, such as infinite ranges.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
