Speech processing tools (punctuation, diarization)
🤗 Speechbox provides tools for speech processing tasks, primarily punctuation restoration and ASR with speaker diarization. It targets developers and researchers working with audio data, offering a streamlined way to enhance transcriptions.
How It Works
Punctuation restoration leverages Whisper models by forcing them to predict specific words while allowing modifications to capitalization, spacing, and punctuation. This approach capitalizes on Whisper's strong language understanding to infer correct punctuation. The ASR with Speaker Diarization pipeline combines a speech recognition system (like Whisper) with a speaker diarization model to attribute speech segments to specific speakers.
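The two ideas above can be illustrated without loading any model. The sketch below is not Speechbox's implementation. It only shows (1) the constraint that a restored transcript must keep the same word sequence and differ only in capitalization, spacing, and punctuation, and (2) one simple way to attribute timestamped ASR chunks to speakers, by picking the diarization turn with the largest time overlap. The helper names (`words_only`, `preserves_words`, `assign_speakers`) and the dictionary layouts are assumptions made for illustration, and the library may align segments differently.

```python
import re


def words_only(text):
    """Lowercase the text and keep only runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def preserves_words(raw, restored):
    """True if `restored` differs from `raw` only in case, spacing or punctuation."""
    return words_only(raw) == words_only(restored)


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(asr_chunks, speaker_turns):
    """Label each ASR chunk with the speaker whose turn overlaps it the most.

    asr_chunks:    [{"text": str, "timestamp": (start, end)}, ...]   (hypothetical layout)
    speaker_turns: [{"speaker": str, "start": float, "end": float}, ...]
    """
    labelled = []
    for chunk in asr_chunks:
        start, end = chunk["timestamp"]
        best = max(
            speaker_turns,
            key=lambda turn: overlap(start, end, turn["start"], turn["end"]),
        )
        labelled.append(
            {"speaker": best["speaker"], "text": chunk["text"].strip(), "timestamp": (start, end)}
        )
    return labelled


# Made-up example data: two ASR chunks and two diarization turns.
asr_chunks = [
    {"text": " Hello there.", "timestamp": (0.0, 2.0)},
    {"text": " Hi, how are you?", "timestamp": (2.0, 4.5)},
]
speaker_turns = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.2},
    {"speaker": "SPEAKER_01", "start": 2.2, "end": 5.0},
]

for segment in assign_speakers(asr_chunks, speaker_turns):
    print(segment["speaker"], segment["timestamp"], segment["text"])
```

Maximum overlap is chosen here because it is easy to reason about. A real pipeline also has to handle chunks that straddle a speaker change, which this sketch simply assigns to the dominant speaker.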
Quick Start & Requirements
pip install speechbox
pip install transformers datasets pyannote.audio
Highlighted Details
(... tiny.en to medium.en).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats