whisper-node by ariym

Node.js bindings for OpenAI's Whisper, enabling local transcription

created 2 years ago

284 stars

Top 93.1% on sourcepulse

Project Summary

This project provides Node.js bindings for OpenAI's Whisper, enabling local audio transcription through whisper.cpp, ggerganov's CPU-optimized C++ port. It targets Node.js developers who want on-device speech-to-text: audio never leaves the machine, which helps with data privacy and avoids the network round trip of a hosted API.

How It Works

The library leverages the whisper.cpp project, a C++ implementation of OpenAI's Whisper model optimized for CPU execution, including Apple Silicon. It exposes this functionality through a Node.js interface, allowing developers to transcribe .wav files directly within their Node.js applications. The core advantage is the ability to run Whisper locally without relying on external APIs, making it suitable for privacy-sensitive applications or environments with limited internet connectivity.
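To make the wrapping idea concrete, here is a minimal sketch of how a Node.js layer could turn whisper.cpp-style console output into structured data. It is not taken from whisper-node's source. The helper name parseTranscriptLines, the start/end/speech field names, and the assumption that each transcript line looks like "[00:00:14.310 --> 00:00:16.480]   howdy" are all illustrative.

```javascript
// Hypothetical helper (not part of whisper-node): parse console lines of the
// assumed form "[HH:MM:SS.mmm --> HH:MM:SS.mmm]   text" into segment objects.
const LINE_PATTERN =
  /^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$/;

function parseTranscriptLines(stdout) {
  const segments = [];
  for (const rawLine of stdout.split(/\r?\n/)) {
    const match = LINE_PATTERN.exec(rawLine.trim());
    if (!match) continue; // skip log lines and blank lines
    const [, start, end, speech] = match;
    segments.push({ start, end, speech: speech.trim() });
  }
  return segments;
}

// Made-up sample output: one log line, one transcript line, one blank line.
const sample = [
  "loading model ...",
  "[00:00:14.310 --> 00:00:16.480]   howdy",
  "",
].join("\n");

console.log(parseTranscriptLines(sample));
```

Keeping the parsing in a pure function like this means it can be checked without a model or audio file on disk.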

Quick Start & Requirements

Install via npm: npm install whisper-node
For Windows: Install the make command.
Input files must be 16kHz .wav format. FFmpeg can be used for conversion (e.g., ffmpeg -i input.mp3 -ar 16000 output.wav).
Official documentation and examples are available in the README.
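Since input must already be 16kHz .wav, the conversion step can be scripted from Node.js. The sketch below wraps the ffmpeg command quoted above. The helper names buildFfmpegArgs and convertTo16kWav are hypothetical, and the code assumes an ffmpeg binary is available on PATH. The extra -y flag (overwrite the output without prompting) is an addition not found in the original command.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");

const execFileAsync = promisify(execFile);

// Hypothetical helper: argument list matching
// "ffmpeg -i input.mp3 -ar 16000 output.wav".
function buildFfmpegArgs(inputPath, outputPath) {
  return ["-i", inputPath, "-ar", "16000", outputPath];
}

// Hypothetical helper: run ffmpeg and resolve with the output path.
// Assumes `ffmpeg` is on PATH; "-y" (an addition) overwrites existing output.
async function convertTo16kWav(inputPath, outputPath) {
  await execFileAsync("ffmpeg", ["-y", ...buildFfmpegArgs(inputPath, outputPath)]);
  return outputPath;
}

console.log(["ffmpeg", ...buildFfmpegArgs("input.mp3", "output.wav")].join(" "));
```

Passing arguments as an array to execFile avoids shell quoting problems with file names that contain spaces.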

Highlighted Details

Supports various output formats: JSON (default), .txt, .srt, .vtt.
Offers word-level timestamp precision.
Allows specifying Whisper model names (e.g., base.en) or custom model paths.
Includes options for language detection and file generation flags.
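The options listed above can be gathered into one object. The sketch below assumes the option names used in the project's README at the time of writing (modelName, modelPath, and a whisperOptions object with language, gen_file_txt, gen_file_subtitle, gen_file_vtt, word_timestamps). These names, and the rule that modelName and modelPath are mutually exclusive, should be checked against the installed version. The helper buildWhisperOptions itself is hypothetical.

```javascript
// Hypothetical helper: assemble an options object in the assumed README shape.
function buildWhisperOptions({
  modelName,
  modelPath,
  language = "auto", // assumed value for automatic language detection
  wordTimestamps = false,
  outputs = [], // any of "txt", "srt", "vtt"
} = {}) {
  if (modelName && modelPath) {
    throw new Error("Specify either modelName or modelPath, not both");
  }
  const options = {
    whisperOptions: {
      language,
      gen_file_txt: outputs.includes("txt"),
      gen_file_subtitle: outputs.includes("srt"),
      gen_file_vtt: outputs.includes("vtt"),
      word_timestamps: wordTimestamps,
    },
  };
  if (modelPath) {
    options.modelPath = modelPath; // custom model file
  } else {
    options.modelName = modelName ?? "base.en"; // named model
  }
  return options;
}

const exampleOptions = buildWhisperOptions({
  modelName: "base.en",
  outputs: ["srt"],
  wordTimestamps: true,
});
console.log(exampleOptions);
```

With the package installed, an object like this would be passed alongside the file path when calling the library's transcription function. The exact call shape is documented in the README.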

Maintenance & Community

The project is maintained by ariym. The roadmap lists planned support for config files, browser/WASM compatibility, automatic audio conversion, speaker diarization (pyannote, WhisperX), and audio stream transcription.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Only 16kHz .wav input is currently supported; automatic conversion from other formats is planned. Several other roadmap features, including speaker diarization and audio stream transcription, are not yet implemented.

Explore Similar Projects

echogarden by echogarden-project

pywhispercpp by absadiki

stream-translator by fortypercnt

whispercpp by aarnphm

whisper-website by Kabanosk

transcriber_app by davabase

Whisperboard by Saik0s

transcribe-anything by zackees

writeout.ai by beyondcode

stable-ts by jianfch

whisper_streaming by ufal

ecoute by SevaSk