sherpa-ncnn by k2-fsa

Offline engine for real-time speech-to-text (STT) and voice activity detection (VAD)

created 2 years ago
1,422 stars

Top 29.2% on sourcepulse

Project Summary

sherpa-ncnn provides efficient, offline, real-time speech recognition and voice activity detection (VAD) across a wide range of devices and architectures. It targets developers building applications that need on-device automatic speech recognition (ASR) and VAD, and it offers broad platform support and multiple language bindings.

How It Works

This project leverages the ncnn inference framework for optimized execution on diverse hardware, including CPUs and mobile platforms. It supports streaming speech-to-text and VAD, enabling real-time processing without internet connectivity. The architecture is designed for static linking, producing executables with minimal system dependencies beyond standard libraries.
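As a rough illustration of what streaming processing means here, the sketch below splits audio into small fixed-size chunks and flags speech chunk by chunk, the way a real-time pipeline consumes microphone input. It does not use sherpa-ncnn or ncnn: the function names (chunk_samples, is_speech, speech_segments), the 100 ms chunk length, and the RMS energy threshold are all assumptions made for this example, and the energy test is a toy stand-in for a real VAD model.

```python
import math
from typing import Iterator, List, Tuple


def chunk_samples(samples: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size chunks, as an audio callback might deliver them."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def is_speech(chunk: List[float], threshold: float = 0.02) -> bool:
    """Toy energy-based VAD (hypothetical): RMS above the threshold counts as speech."""
    if not chunk:
        return False
    rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
    return rms > threshold


def speech_segments(
    samples: List[float],
    sample_rate: int,
    chunk_ms: int = 100,
    threshold: float = 0.02,
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans covering runs of chunks flagged as speech."""
    chunk_size = sample_rate * chunk_ms // 1000
    segments: List[Tuple[float, float]] = []
    seg_start = None
    for i, chunk in enumerate(chunk_samples(samples, chunk_size)):
        t = i * chunk_size / sample_rate  # start time of this chunk in seconds
        if is_speech(chunk, threshold):
            if seg_start is None:
                seg_start = t  # a speech run begins
        elif seg_start is not None:
            segments.append((seg_start, t))  # the run ended at this chunk
            seg_start = None
    if seg_start is not None:
        # Audio ended while still inside a speech run.
        segments.append((seg_start, len(samples) / sample_rate))
    return segments


if __name__ == "__main__":
    sr = 16000
    silence = [0.0] * sr
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
    # One second of silence, one of tone, one of silence.
    print(speech_segments(silence + tone + silence, sr))  # prints [(1.0, 2.0)]
```

In a real application, each chunk flagged as speech would be handed to the streaming recognizer instead of only being timed, so partial transcripts can appear while the user is still talking.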

Quick Start & Requirements

Highlighted Details

  • Supports x86, x86_64, ARM (32/64-bit), and RISC-V (64-bit) architectures.
  • Available for Linux, macOS, Windows, Android, iOS, and WebAssembly.
  • Provides APIs for C++, C, Python, JavaScript, Go, C#, Kotlin, and Swift.
  • Does not depend on PyTorch or other heavy inference frameworks, relying solely on ncnn.

Maintenance & Community

Licensing & Compatibility

  • The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not clearly state the project's license, which may hinder commercial adoption. It also provides no performance benchmarks or detailed per-platform resource requirements.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 121 stars in the last 90 days
