atomic_queue  by max0x7ba

C++14 lock-free queue minimizing latency between push/pop operations

Created 7 years ago
1,728 stars

Top 24.7% on SourcePulse

Project Summary

This C++14 library provides lock-free multiple-producer-multiple-consumer (MPMC) queues built on circular buffers. It targets developers of ultra-low-latency systems and aims to minimize the time between pushing an element and popping it, using a minimalist design that keeps atomic operations few and is explicitly cache-friendly.

How It Works

The queues are fixed-size circular buffers driven by std::atomic operations. The key design principles are to minimize atomic instructions, avoid false sharing, store elements in a linear buffer array, and avoid heap allocations after construction. Elements have value semantics: they are copied or moved on push and pop. Together, these choices aim to reduce cache misses and pipeline stalls. Power-of-2 buffer sizes allow cheap index mapping and help reduce cache line contention.
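The index mapping and false-sharing points above can be illustrated with a short sketch. It is not taken from the library: the names kCacheLine, is_power_of_2, slot_index, slot_index_rt and Counters are hypothetical, and the 64-byte cache line is an assumption that holds on most x86-64 CPUs but not everywhere.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; this is an illustration, not the library's code.
constexpr std::size_t kCacheLine = 64; // assumed cache line size in bytes

constexpr bool is_power_of_2(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Map an ever-increasing counter onto a slot of a buffer whose size is
// known at compile time. For a power-of-2 size, masking gives the same
// result as counter % Size without a division.
template <std::size_t Size>
constexpr std::size_t slot_index(std::size_t counter) {
    static_assert(is_power_of_2(Size), "Size must be a power of 2");
    return counter & (Size - 1);
}

// The same mapping for a buffer whose size is chosen at construction time.
inline std::size_t slot_index_rt(std::size_t counter, std::size_t size) {
    assert(is_power_of_2(size));
    return counter & (size - 1);
}

// Keep the producer and consumer counters on separate cache lines so that
// a write to one does not invalidate the line holding the other.
struct Counters {
    alignas(kCacheLine) std::atomic<std::size_t> head{0};
    alignas(kCacheLine) std::atomic<std::size_t> tail{0};
};
```

With Size equal to 8, counter 13 maps to slot 5 and counter 8 wraps back to slot 0. Because of the alignment, the Counters struct occupies at least two cache lines.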

Quick Start & Requirements

  • Install: Clone the repository and add atomic_queue/include to your build system's include paths. Alternatively, use vcpkg install atomic-queue or conan install atomic-queue.
  • Prerequisites: C++14 compliant compiler. Benchmarks require Intel TBB.
  • Links: GitHub, Benchmarks

Highlighted Details

  • Offers AtomicQueue (for atomic element types) and AtomicQueue2 (for non-atomic element types), with Optimist variants that busy-wait when the queue is empty or full.
  • Supports single-producer-single-consumer (SPSC) modes for further optimization.
  • Provides optional totally ordered mode with zero cost on Intel x86.
  • Benchmarked against std::mutex, boost::lockfree, moodycamel::ConcurrentQueue, and others; the project's benchmarks report lower latency and higher throughput.
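To show why a single-producer-single-consumer mode leaves room for optimization, here is a minimal sketch of a classic SPSC ring buffer. It is not the library's implementation: SpscRing, try_push, try_pop and spsc_usage_example are hypothetical names, and the 64-byte alignment is an assumed cache line size. With exactly one writer per counter, plain loads and stores with acquire/release ordering are enough, and no compare-and-swap loop is needed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical SPSC ring buffer; T must be default-constructible.
template <class T, std::size_t Size>
class SpscRing {
    static_assert(Size != 0 && (Size & (Size - 1)) == 0,
                  "Size must be a power of 2");
    T buf_[Size];
    alignas(64) std::atomic<std::size_t> head_{0}; // written only by the producer
    alignas(64) std::atomic<std::size_t> tail_{0}; // written only by the consumer

public:
    // Call from the single producer thread. Returns false when the ring is full.
    bool try_push(T value) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        if (head - tail_.load(std::memory_order_acquire) == Size)
            return false;
        buf_[head & (Size - 1)] = std::move(value);
        head_.store(head + 1, std::memory_order_release); // publish the element
        return true;
    }

    // Call from the single consumer thread. Returns false when the ring is empty.
    bool try_pop(T& out) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false;
        out = std::move(buf_[tail & (Size - 1)]);
        tail_.store(tail + 1, std::memory_order_release); // free the slot
        return true;
    }
};

// Single-threaded usage example: elements come out in FIFO order.
inline int spsc_usage_example() {
    SpscRing<int, 4> ring;
    for (int i = 1; i <= 4; ++i) {
        bool pushed = ring.try_push(i);
        assert(pushed);
        (void)pushed;
    }
    int first = 0;
    bool popped = ring.try_pop(first);
    assert(popped);
    (void)popped;
    return first;
}
```

In real use, try_push and try_pop would be called from two different threads; the single-threaded example only demonstrates the interface and the FIFO order.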

Maintenance & Community

  • The project is maintained by Maxim Egorushkin.
  • Contributions are welcome.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Generally compatible with C++14 platforms that support std::atomic. Reported to work on Windows, but CI runs only on Linux.

Limitations & Caveats

Queue capacity is fixed, either at compile time or at construction time. Optimal performance depends on specific OS behaviors and thread scheduling (e.g., SCHED_FIFO), so reaching the lowest latency requires careful system configuration. The NIL value is reserved and must not be pushed; debug builds assert on this.
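One plausible reason for a reserved value is that it marks a slot as empty, in which case pushing it would make a full slot look empty. The sketch below assumes exactly that and is not the library's code: Slot, try_store and take are hypothetical names, and a real queue would hold an array of such slots.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical single-slot illustration of a reserved "empty" value.
template <class T, T Nil>
class Slot {
    std::atomic<T> cell_{Nil}; // Nil means "this slot is empty"

public:
    // Stores value if the slot is empty; returns false if it is occupied.
    // value must not equal Nil. The assert fires in debug builds only.
    bool try_store(T value) {
        assert(value != Nil && "Nil is reserved to mark empty slots");
        T expected = Nil;
        return cell_.compare_exchange_strong(expected, value,
                                             std::memory_order_release,
                                             std::memory_order_relaxed);
    }

    // Takes the value out and leaves the slot empty.
    // Returns Nil if the slot was already empty.
    T take() {
        return cell_.exchange(Nil, std::memory_order_acquire);
    }
};
```

With Slot<int, 0>, the value 0 can never be stored. If the element type has no spare value to reserve, this scheme does not apply.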

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 30 days
