pocketsphinx.js by syl22-00

Speech recognition in JavaScript and WebAssembly

Created 12 years ago

1,507 stars

Top 27.2% on SourcePulse

Project Summary

This project provides a speech recognition engine that runs entirely within a web browser using JavaScript and WebAssembly, based on the PocketSphinx C library. It's designed for web developers and researchers looking to integrate speech input into web applications without relying on server-side processing, offering offline capabilities and direct microphone access.

How It Works

The core of the project is PocketSphinx, a speech recognition engine written in C and compiled to JavaScript and WebAssembly with Emscripten so that it can run in the browser. An accompanying audioRecorder.js library, built on the Web Audio API, handles microphone input, sample rate conversion, and data buffering, and feeds the resulting audio to the PocketSphinx engine. To avoid blocking the UI thread, the recognizer.js wrapper runs the recognition process in a Web Worker.
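To make the sample rate conversion and buffering step more concrete, here is a minimal sketch of how Web Audio float samples could be reduced to 16-bit PCM at a lower rate before being handed to a recognizer. It is not the code used by audioRecorder.js: the function names downsample and floatToInt16 are hypothetical, the averaging approach is one simple technique among several, and the 16 kHz target is an assumption based on what PocketSphinx acoustic models commonly expect.

```javascript
// Illustrative sketch only; these helpers are hypothetical and are not
// part of audioRecorder.js.

// Downsample by averaging the input samples that fall into each output slot.
function downsample(input, inputRate, outputRate) {
  if (outputRate > inputRate) {
    throw new RangeError("outputRate must not exceed inputRate");
  }
  // Multiply before dividing so common rates give an exact integer length.
  const outLength = Math.floor((input.length * outputRate) / inputRate);
  const ratio = inputRate / outputRate;
  const output = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), input.length);
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    output[i] = end > start ? sum / (end - start) : 0;
  }
  return output;
}

// Convert floats in [-1, 1] to signed 16-bit integers, clamping out-of-range values.
function floatToInt16(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Example: six samples captured at 48 kHz become two samples at 16 kHz (assumed target).
const captured = new Float32Array([0, 0, 0, 1, 1, 1]);
const pcmChunk = floatToInt16(downsample(captured, 48000, 16000));
```

In a page, a chunk like pcmChunk would be produced for each audio callback and forwarded to the worker that runs the recognizer. Averaging is a crude low-pass filter; a production recorder may use a better resampling filter.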

Quick Start & Requirements

Installation: Clone the repository with git clone --recursive https://github.com/syl22-00/pocketsphinx.js.git.
Prerequisites: Emscripten SDK (including Node.js, CMake), Python 2 (for the example server).
Running Demo: Serve the webapp/live.html file using a local web server (e.g., python server.py) and open it in a browser. Chrome may require the --disable-web-security flag.
Documentation: Detailed API and usage documentation is available in the README and in the doc/ directory.

Highlighted Details

Supports both JavaScript (asm.js) and WebAssembly compilation targets.
Enables runtime addition of words, grammars (FSG), and keyword spotting phrases.
Includes an audioRecorder.js module for microphone input and processing.
Offers a recognizer.js wrapper for efficient use within Web Workers.
Provides examples for English and Chinese speech recognition.
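As an illustration of adding words and a grammar at runtime, the sketch below builds the kind of messages a page might post to the recognizer worker. The command names (addWords, addGrammar), the callbackId field, and the payload shapes are assumptions recalled from the project's README and should be checked against it; buildVocabularyMessages is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: prepares worker messages for a word list and a
// finite-state grammar (FSG). Command names and payload shapes are
// assumptions; verify them against the project's README before use.
function buildVocabularyMessages(words, grammar, firstCallbackId) {
  // Every word used by a grammar transition must appear in the word list.
  const known = new Set(words.map(([word]) => word));
  for (const transition of grammar.transitions) {
    if (!known.has(transition.word)) {
      throw new Error(`Grammar uses unknown word: ${transition.word}`);
    }
  }
  return [
    { command: "addWords", data: words, callbackId: firstCallbackId },
    { command: "addGrammar", data: grammar, callbackId: firstCallbackId + 1 },
  ];
}

// Each entry pairs a word with its pronunciation in CMU dictionary phonemes.
const words = [
  ["ONE", "W AH N"],
  ["TWO", "T UW"],
  ["THREE", "TH R IY"],
];

// A single-state grammar that accepts any sequence of the words above.
const grammar = {
  numStates: 1,
  start: 0,
  end: 0,
  transitions: words.map(([word]) => ({ from: 0, to: 0, word })),
};

const messages = buildVocabularyMessages(words, grammar, 1);
// In a browser, each message would then be sent to the worker, for example:
//   messages.forEach((message) => recognizer.postMessage(message));
// where `recognizer` is assumed to be a Worker created from recognizer.js.
```

Validating the grammar against the word list before posting keeps a misspelled word from surfacing later as a recognizer error that is harder to trace.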

Maintenance & Community

The project appears to have had its last significant update around 2017. The README does not link to any active community channels such as Discord or Slack.

Licensing & Compatibility

The core PocketSphinx.js library and associated files are licensed under the MIT License. The audioRecorder.js and audioRecorderWorker.js files are based on Recorder.js, also under the MIT License. This permissive licensing allows for commercial use and integration into closed-source applications.

Limitations & Caveats

The project's last commit was in 2017, suggesting potential maintenance gaps and compatibility issues with modern browser APIs or Emscripten versions. Performance and accuracy are highly dependent on acoustic and language models, and initial results may be poor without proper tuning.
