atomic_queue  by max0x7ba

C++14 lock-free queue minimizing latency between push/pop operations

created 6 years ago
1,704 stars

Top 25.5% on sourcepulse

Project Summary

This C++14 library provides lock-free multiple-producer-multiple-consumer (MPMC) queues based on circular buffers. It targets developers of ultra-low-latency systems and aims to minimize the time between pushing an element and popping it, using a minimalist design with few atomic operations and an explicitly cache-friendly layout.

How It Works

The queues are built on fixed-size circular buffers and std::atomic operations. The design minimizes the number of atomic instructions, avoids false sharing, stores elements in a linear buffer array, and performs no heap allocations after construction. Elements have value semantics: they are copied or moved on push and pop. Together these choices aim to reduce cache misses and pipeline stalls. Power-of-2 buffer sizes let counters be mapped to buffer slots cheaply and help reduce cache-line contention.
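As a rough illustration of two of these ideas, the sketch below maps a monotonically increasing counter to a slot with a bitwise AND (which only works for power-of-2 capacities) and keeps the producer and consumer counters on separate cache lines. All names here (kCacheLineSize, is_power_of_2, slot_index, Counters) are hypothetical and are not taken from the library; the 64-byte cache line is an assumption that holds on typical x86-64 CPUs.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration only; not the library's API.
constexpr std::size_t kCacheLineSize = 64; // assumption: typical x86-64 cache line

// True when n has exactly one bit set.
constexpr bool is_power_of_2(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Maps an ever-increasing counter to a slot in a ring of Capacity elements.
// For a power-of-2 Capacity, `counter & (Capacity - 1)` equals
// `counter % Capacity` but avoids an integer division.
template <std::size_t Capacity>
constexpr std::size_t slot_index(std::uint64_t counter) {
    static_assert(is_power_of_2(Capacity), "Capacity must be a power of 2");
    return static_cast<std::size_t>(counter & (Capacity - 1));
}

// Producer and consumer counters placed on separate cache lines, so that
// updating one does not invalidate the line holding the other (false sharing).
struct Counters {
    alignas(kCacheLineSize) std::atomic<std::uint64_t> head{0};
    alignas(kCacheLineSize) std::atomic<std::uint64_t> tail{0};
};
```

With a capacity of 8, counters 0 through 7 map to slots 0 through 7 and counter 8 wraps back to slot 0.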

Quick Start & Requirements

  • Install: Clone the repository and add atomic_queue/include to your build system's include paths. Alternatively, use vcpkg install atomic-queue or conan install atomic-queue.
  • Prerequisites: C++14 compliant compiler. Benchmarks require Intel TBB.
  • Links: GitHub, Benchmarks

Highlighted Details

  • Offers AtomicQueue (for atomic elements) and AtomicQueue2 (for non-atomic elements), each with an Optimist variant that busy-waits when the queue is empty or full.
  • Supports single-producer-single-consumer (SPSC) modes for further optimization.
  • Provides optional totally ordered mode with zero cost on Intel x86.
  • Benchmarked against std::mutex, boost::lockfree, moodycamel::ConcurrentQueue, and others, claiming superior latency and throughput.

Maintenance & Community

  • The project is maintained by Maxim Egorushkin.
  • Contributions are welcome.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Generally compatible with C++14 platforms that support std::atomic. Reported to work on Windows, but CI runs on Linux only.

Limitations & Caveats

Queue capacity is fixed, either at compile time or at construction time. The lowest latencies depend on OS behavior and thread scheduling (e.g., SCHED_FIFO), so achieving them requires careful system configuration. The NIL value must not be pushed; debug builds assert against it.
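To show why a reserved value cannot be pushed, here is a minimal single-producer-single-consumer sketch in which each slot is a std::atomic and a sentinel value marks the slot as empty. This is not the library's implementation: the class name NilSlotQueue, its Nil template parameter, and its try_push/try_pop members are hypothetical and written only for this example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical SPSC sketch; not the library's code. A slot that holds Nil is
// empty, so Nil itself can never be stored as a real element.
template <class T, std::size_t Capacity, T Nil = T{}>
class NilSlotQueue {
    static_assert(Capacity != 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of 2");

    std::array<std::atomic<T>, Capacity> slots_;
    std::size_t head_ = 0; // touched by the consumer thread only
    std::size_t tail_ = 0; // touched by the producer thread only
    // Note: a tuned queue would also keep head_ and tail_ on separate cache lines.

public:
    NilSlotQueue() {
        for (auto& slot : slots_) slot.store(Nil, std::memory_order_relaxed);
    }

    // Returns false when the next slot is still occupied (queue full).
    bool try_push(T value) {
        assert(value != Nil); // pushing Nil would look like an empty slot
        auto& slot = slots_[tail_ & (Capacity - 1)];
        if (slot.load(std::memory_order_acquire) != Nil) return false;
        slot.store(value, std::memory_order_release);
        ++tail_;
        return true;
    }

    // Returns false when the next slot holds Nil (queue empty).
    bool try_pop(T& out) {
        auto& slot = slots_[head_ & (Capacity - 1)];
        T value = slot.load(std::memory_order_acquire);
        if (value == Nil) return false;
        slot.store(Nil, std::memory_order_release); // hand the slot back
        ++head_;
        out = value;
        return true;
    }
};
```

In this sketch the element value doubles as the "slot occupied" flag, so no separate state variable is needed per slot. The cost is that one value of T is unavailable to users, which is the kind of restriction the assert guards in debug builds.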

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 70 stars in the last 90 days
