mic_array by respeaker

Mic array utils for audio processing

Created 8 years ago

324 stars

Top 84.1% on SourcePulse

Project Summary

This repository provides utilities for the ReSpeaker Microphone Array, enabling Direction of Arrival (DOA) estimation, Voice Activity Detection (VAD), and Keyword Spotting (KWS). It targets developers and researchers working with multi-microphone arrays for audio processing, voice control, and spatial awareness applications. Its main value is that it ties these three features to one specific piece of hardware, so they can be used together without building the audio pipeline from scratch.

How It Works

The project builds on the 8-channel raw audio output of the ReSpeaker hardware. For DOA, it likely uses beamforming or a similar spatial audio technique that compares the signals arriving at different microphones to locate the sound source. VAD uses the WebRTC VAD library for efficient speech detection. KWS uses the Snowboy engine for wake-word recognition. The scripts show how to control the device's LED ring and how to process audio streams for each of these functions.
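To make the DOA idea concrete, here is a minimal sketch of one common approach: estimate the arrival-time difference between two microphones by cross-correlation, then convert that delay into an angle. This is an illustration only and is not confirmed to be the method the repository uses. The function names, the 0.1 m microphone spacing, and the 16 kHz sample rate are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    Brute-force time-domain cross-correlation over lags in
    [-max_lag, max_lag]. Hypothetical helper, not taken from the repository.
    """
    n = len(sig_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag_samples, sample_rate, mic_distance):
    """Convert a sample delay into an angle in degrees from broadside.

    Assumes a far-field source and two microphones mic_distance metres apart.
    """
    tau = lag_samples / sample_rate
    ratio = tau * SPEED_OF_SOUND / mic_distance
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Synthetic test signal: a short pulse, delayed by 3 samples on mic B.
    mic_a = [0.0] * 64
    mic_a[20], mic_a[21] = 1.0, 0.5
    mic_b = [0.0] * 64
    mic_b[23], mic_b[24] = 1.0, 0.5

    lag = estimate_delay(mic_a, mic_b, max_lag=8)
    print(lag, delay_to_angle(lag, sample_rate=16000, mic_distance=0.1))
```

A real array would typically combine several microphone pairs and use a frequency-domain correlation for robustness; the pairwise time-domain version above only shows the geometry.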

Quick Start & Requirements

Install: sudo pip install pyusb for pixel ring control; pip install webrtcvad for VAD. Snowboy requires sudo apt-get install python-dev libatlas-base-dev swig and manual compilation.
Prerequisites: ReSpeaker USB Mic Array with firmware updated for 8-channel raw audio output. For 4-mic arrays, the scripts must be modified. Python 3 is assumed.
Setup: Basic setup involves installing Python packages and potentially configuring udev rules for USB access. Snowboy compilation can take several minutes.
Links: mic_array_dfu, Google Assistant Library, ODAS, ODAS Studio
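Because the prerequisites hinge on 8-channel raw audio, it can help to see what handling such a stream involves. The sketch below splits an interleaved PCM buffer into per-channel sample lists. It assumes signed 16-bit little-endian samples interleaved frame by frame, which is a common layout but is not confirmed by the summary; `split_channels` is a hypothetical helper, not a function from the repository.

```python
import struct


def split_channels(raw, channels=8):
    """Split interleaved 16-bit little-endian PCM bytes into per-channel lists.

    Assumed layout: ch0, ch1, ..., ch(N-1), ch0, ch1, ... with 2 bytes per sample.
    """
    frame_bytes = 2 * channels
    if len(raw) % frame_bytes:
        raise ValueError("buffer length is not a whole number of frames")
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Every `channels`-th sample, starting at offset ch, belongs to channel ch.
    return [list(samples[ch::channels]) for ch in range(channels)]


if __name__ == "__main__":
    # Two frames of 8 channels, filled with the values 0..15 for illustration.
    raw = struct.pack("<16h", *range(16))
    for index, channel in enumerate(split_channels(raw)):
        print(index, channel)
```

Passing `channels=4` covers the 4-mic case under the same layout assumption, which is the kind of change the note about modifying the scripts for 4-mic arrays implies.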

Highlighted Details

Integrates DOA, VAD, and KWS on ReSpeaker hardware.
Includes a script (pixel_ring.py) for controlling the device's LED ring via USB HID.
Provides examples for integrating with Google Assistant and the ODAS (Open embeddeD Audition System) framework for advanced sound source localization.
Requires specific firmware flashing for full 8-channel audio support.
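As an illustration of the VAD part of that integration, the sketch below cuts a mono 16-bit PCM buffer into fixed-length frames and asks a detector about each one. WebRTC VAD accepts only 10, 20, or 30 ms frames, so the framing step matters. The helper names are hypothetical, and the detector is passed in as a parameter: any object with an `is_speech(frame, sample_rate)` method works, including `webrtcvad.Vad(2)` if the `webrtcvad` package is installed.

```python
def frames(pcm, sample_rate=16000, frame_ms=30):
    """Yield consecutive frames of 16-bit mono PCM, dropping a trailing partial frame."""
    if frame_ms not in (10, 20, 30):
        raise ValueError("WebRTC VAD accepts only 10, 20, or 30 ms frames")
    size = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]


def speech_ratio(pcm, vad, sample_rate=16000, frame_ms=30):
    """Return the fraction of frames that the detector flags as speech."""
    total = speech = 0
    for frame in frames(pcm, sample_rate, frame_ms):
        total += 1
        if vad.is_speech(frame, sample_rate):
            speech += 1
    return speech / total if total else 0.0


class NonSilentStub:
    """Stand-in detector for demonstration: flags any frame containing non-zero bytes."""

    def is_speech(self, frame, sample_rate):
        return any(frame)


if __name__ == "__main__":
    # One frame of non-zero bytes followed by one silent frame.
    pcm = b"\x01" * 960 + b"\x00" * 960
    print(speech_ratio(pcm, NonSilentStub()))
    # With the real library (assumed installed):
    #   import webrtcvad
    #   print(speech_ratio(pcm, webrtcvad.Vad(2)))
```

Keeping the detector as a parameter means the framing logic can be checked without the hardware or the native library present.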

Maintenance & Community

The repository is maintained by respeaker. Links to community resources like Discord or Slack are not explicitly provided in the README.

Licensing & Compatibility

The repository itself appears to be under a permissive license, but the integrated Snowboy KWS engine has its own licensing terms which may impact commercial use. Compatibility with closed-source applications would depend on the licensing of Snowboy and any other third-party components.

Limitations & Caveats

The README notes potential issues with SWIG versions during Snowboy compilation, requiring manual Makefile edits. Full functionality, especially 8-channel audio, depends on flashing specific device firmware. The project's reliance on Snowboy, which is no longer actively maintained by its original developers, may pose a long-term risk.
