mudler/parakeet.cpp
High-performance C++ ASR inference
Top 79.2% on SourcePulse
Summary
parakeet.cpp is a C++17 inference engine for NVIDIA's NeMo Parakeet ASR models, built on ggml. It targets users who need efficient, dependency-light ASR without a Python runtime at inference time. Key benefits include significantly faster CPU and GPU inference than NeMo's PyTorch implementation or whisper.cpp, transcripts that are byte-identical to NeMo's output, and a design that can be embedded in other applications.
How It Works
This project is a from-scratch C++17 port focused purely on inference. It uses ggml for efficient tensor operations across CPU and GPU backends (CUDA, Metal, etc.) and supports several Parakeet architectures (CTC, RNNT, TDT, hybrid) and model sizes, including multilingual and streaming variants. The design prioritizes speed and minimal dependencies, which allows deployment in resource-constrained environments or integration through a flat C API. Transcripts are checked for accuracy against the original NeMo models.
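As an illustration of the simplest of the decoder types named above, here is a minimal sketch of greedy CTC decoding: pick the best-scoring token in each frame, merge consecutive repeats, and drop the blank token. This is the textbook technique, not code taken from parakeet.cpp; the function names (argmax, ctc_greedy_decode) and the choice of blank token id are assumptions made for the example, and RNNT and TDT models decode differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the largest score in one frame of per-token scores.
static std::size_t argmax(const std::vector<float>& frame) {
    assert(!frame.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < frame.size(); ++i) {
        if (frame[i] > frame[best]) best = i;
    }
    return best;
}

// Greedy CTC decoding (hypothetical helper, not the parakeet.cpp API):
// take the best token per frame, merge consecutive repeats, drop blanks.
// `logits` holds one score vector per audio frame; `blank_id` is the
// index of the CTC blank token, which is model-specific.
std::vector<int> ctc_greedy_decode(const std::vector<std::vector<float>>& logits,
                                   int blank_id) {
    std::vector<int> tokens;
    int prev = -1;  // sentinel: no token seen yet
    for (const auto& frame : logits) {
        const int tok = static_cast<int>(argmax(frame));
        // Emit only when the token changes and is not the blank.
        if (tok != prev && tok != blank_id) {
            tokens.push_back(tok);
        }
        prev = tok;  // a blank resets the repeat check, so "a, blank, a" yields two a's
    }
    return tokens;
}
```

Calling ctc_greedy_decode with the model's per-frame scores and its blank id returns token ids that a tokenizer would then map to text. A blank between two identical tokens keeps both, which is how CTC represents doubled letters.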
Quick Start & Requirements
Clone the repository (git clone --recursive https://github.com/mudler/parakeet.cpp), then build with CMake (cmake -B build && cmake --build build -j). For a shared library build, add -DPARAKEET_SHARED=ON.
torch (CPU) and nemo_toolkit[asr] are needed only for model conversion/validation. GGUF models are required for inference.
Repository: https://github.com/mudler/parakeet.cpp
Highlighted Details
Maintenance & Community
Associated with the LocalAI team. The README does not give specific maintenance, community channel, or roadmap details.
Licensing & Compatibility
The codebase is MIT licensed. Model weights are governed by NVIDIA's original Parakeet licenses; review these for commercial use compatibility.
Limitations & Caveats
Python is required solely for model conversion/validation, not inference. K-quantization requires running the CLI tool after conversion. On some models (e.g., pure CTC), GPU speedups over NeMo may be less pronounced because performance depends on ggml's kernels.