Offline voice input tool for PC, transcribing speech to text
CapsWriter-Offline is a PC-based speech-to-text and transcription tool designed for offline use, offering unlimited duration, low latency, and high accuracy for Chinese and English input. It caters to users needing efficient voice typing, real-time transcription, and audio/video file subtitling, with features like customizable hotwords and automatic diary logging.
How It Works
The tool uses Sherpa-onnx to run Alibaba's Paraformer model for speech recognition, plus a separate punctuation model. It follows a client-server architecture: the server loads the models and performs recognition, while the client captures audio, manages hotkeys, and sends data to the server. For audio/video file transcription, it relies on FFmpeg and generates SRT subtitles with word-level timestamps.
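To illustrate the last step, here is a minimal sketch of how word-level timestamps could be grouped into SRT cues. It is not taken from the CapsWriter-Offline source: the `Word` class, the `group_words` and `to_srt` helpers, and the `max_words` and `max_gap` thresholds are all hypothetical names and values chosen for the example. It also joins words with spaces, which suits English; Chinese output would more likely be joined without separators.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def group_words(words, max_words=8, max_gap=0.6):
    """Start a new cue after a long pause or once a cue holds max_words words."""
    cues, current = [], []
    for word in words:
        too_long = len(current) >= max_words
        paused = bool(current) and word.start - current[-1].end > max_gap
        if current and (too_long or paused):
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)
    return cues


def to_srt(words, max_words=8, max_gap=0.6):
    """Render words as SRT text: index, time range, cue text, blank line."""
    blocks = []
    for index, cue in enumerate(group_words(words, max_words, max_gap), start=1):
        start = format_timestamp(cue[0].start)
        end = format_timestamp(cue[-1].end)
        text = " ".join(w.text for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Word("hello", 0.0, 0.4), Word("world", 0.5, 0.9), Word("again", 2.0, 2.5)]
    print(to_srt(demo))
```

Splitting on pauses keeps each subtitle aligned with natural phrase boundaries, and the word-count cap prevents a long unbroken utterance from producing a single oversized cue.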
Quick Start & Requirements
Dependencies: requirements-server.txt and requirements-client.txt. macOS requires sudo for core_client.py and may need brew install protobuf.
models folder.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
kaldi-native-fbank on Linux to avoid symbol errors.