openWakeWord by dscripka

Open-source wakeword detection library for voice-enabled apps

Created 3 years ago

1,667 stars

Top 25.2% on SourcePulse

Project Summary

This project provides an open-source framework for wake word detection, enabling developers to build voice-enabled applications. It offers pre-trained models for common English phrases, focusing on performance and ease of use for real-world applications.

How It Works

The framework uses a three-component architecture: an ONNX-based melspectrogram pre-processing function, a shared feature extraction backbone (re-implemented from a Google TFHub module) that generates speech embeddings, and a small classification model (e.g., fully-connected or RNN) that scores each wake word. Because the first two stages are shared, only the classifier differs between wake words, which keeps processing efficient and makes models easier to modify. Audio is processed in 80 ms frames.
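To make the 80 ms frame size concrete, the following is a minimal sketch of the frame arithmetic. It assumes 16 kHz audio, which the summary above does not state, and the helper names samples_per_frame and split_frames are hypothetical illustrations rather than part of openWakeWord.

```python
from typing import Iterator, List, Sequence


def samples_per_frame(sample_rate_hz: int, frame_ms: int = 80) -> int:
    """Number of audio samples covered by one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000


def split_frames(samples: Sequence[int], frame_len: int) -> Iterator[List[int]]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        yield list(samples[start:start + frame_len])


# Assumption: 16 kHz sample rate (not stated in the summary above).
FRAME_LEN = samples_per_frame(16_000)

# One second of silence stands in for real microphone input.
one_second = [0] * 16_000
frames = list(split_frames(one_second, FRAME_LEN))
print(FRAME_LEN, len(frames))
```

Under that assumption each frame holds 1280 samples, so one second of audio yields 12 full frames, with the remaining half frame held back until more audio arrives.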

Quick Start & Requirements

Install via pip: pip install openwakeword
Optional Speex noise suppression (Linux) requires the system library: sudo apt-get install libspeexdsp-dev
Supports Python 3.8+.
Pre-trained models can be downloaded via openwakeword.utils.download_models().
Online demo available on HuggingFace Spaces.
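The steps above can be sketched end to end as follows. This is a hedged illustration, not the project's reference code: the Model class and its predict method are used as I understand them from the project README, the frame format (NumPy int16 arrays at 16 kHz) and the 0.5 threshold are assumptions, and the helper names triggered and run_detection are hypothetical. Depending on the installed version and platform, Model may need extra arguments (for example, to select an inference framework).

```python
from typing import Dict, Iterable, Iterator, List

THRESHOLD = 0.5  # illustrative value; tune for your deployment


def triggered(scores: Dict[str, float], threshold: float = THRESHOLD) -> List[str]:
    """Return the names of models whose score meets the threshold, sorted."""
    return sorted(name for name, score in scores.items() if score >= threshold)


def run_detection(frames: Iterable, threshold: float = THRESHOLD) -> Iterator[List[str]]:
    """Yield the wake words detected in each frame that triggers at least one model."""
    # Imported lazily so the helper above works without the package installed.
    from openwakeword.model import Model
    from openwakeword.utils import download_models

    download_models()  # one-time download of the pre-trained models
    model = Model()    # assumption: no arguments loads the pre-trained models

    for frame in frames:  # assumption: each frame is a NumPy int16 array of 16 kHz audio
        scores = model.predict(frame)  # mapping of model name to score
        hits = triggered(scores, threshold)
        if hits:
            yield hits
```

Keeping the thresholding in a separate pure function makes it easy to unit-test the decision logic without audio hardware or model files.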

Highlighted Details

Claims competitive performance against existing wake word engines such as Picovoice Porcupine (commercial) and Mycroft Precise (open source).
Models are trained on synthetic data, demonstrating robustness to whispered speech, varied speaking speeds, and phrasing variations.
Includes optional Speex noise suppression and Silero VAD integration for improved performance in noisy environments.
Offers a simplified training process with Google Colab notebooks, allowing custom model creation in under an hour.
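As a rough sketch of how the optional Speex noise suppression and Silero VAD features might be switched on, the snippet below builds keyword arguments for the model constructor. The parameter names enable_speex_noise_suppression and vad_threshold are taken from the project README as I recall it and should be checked against the installed version; model_options and build_model are hypothetical helpers, and the 0.5 VAD threshold is only an illustrative value.

```python
from typing import Any, Dict


def model_options(noisy: bool, vad_threshold: float = 0.5) -> Dict[str, Any]:
    """Build constructor options; the extra features are enabled only for noisy input."""
    options: Dict[str, Any] = {}
    if noisy:
        # Assumption: both parameter names match the installed openWakeWord version.
        options["enable_speex_noise_suppression"] = True  # needs libspeexdsp (Linux)
        options["vad_threshold"] = vad_threshold          # gate predictions on Silero VAD
    return options


def build_model(noisy: bool):
    """Create a model with options suited to the environment."""
    # Imported lazily so model_options can be used without the package installed.
    from openwakeword.model import Model

    return Model(**model_options(noisy))
```

Both features add some per-frame work, so leaving them off in quiet environments is a reasonable default.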

Maintenance & Community

Active development with releases noted up to February 2024.
Community contributions acknowledged, including a Docker implementation by @dalehumby and a C++ version by @synesthesiam.
Links to examples, training notebooks, and community discussions are provided.

Licensing & Compatibility

Code is licensed under Apache 2.0.
Pre-trained models are licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International, restricting commercial use.

Limitations & Caveats

English-only, because training data is generated with English TTS models.
Not recommended for highly constrained edge devices or microcontrollers; microWakeWord is suggested for those use cases.
Commercial use of the pre-trained models is restricted by the CC-BY-NC-SA 4.0 license.
