stable-ts  by jianfch

SDK for enhanced audio transcription using OpenAI's Whisper

created 2 years ago
1,953 stars

Top 22.9% on sourcepulse

Project Summary

This library enhances OpenAI's Whisper to produce more accurate transcription timestamps and adds advanced audio processing. It's designed for researchers and developers who need precise control over automatic speech recognition (ASR) output, offering features like silence suppression, word-level alignment, and flexible output formatting.

How It Works

Stable-ts modifies Whisper's decoding process to improve timestamp reliability. It incorporates advanced post-processing techniques, including Voice Activity Detection (VAD) and custom regrouping algorithms, to refine segment boundaries and word timings. The library also supports various audio preprocessing steps like noise removal and frequency filtering.
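To make the idea of regrouping concrete, here is a minimal sketch in plain Python that splits word timings into segments at long pauses or sentence-ending punctuation. It is a simplified illustration, not stable-ts's own algorithm; the `Word` class, the `regroup` function, and the 0.5-second default are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical word record: text plus start/end times in seconds."""
    text: str
    start: float
    end: float


def regroup(words: List[Word], max_gap: float = 0.5,
            punctuation: str = ".?!") -> List[List[Word]]:
    """Split words into segments at long pauses or sentence-ending punctuation."""
    segments: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        # A pause longer than max_gap before this word closes the current segment.
        if current and word.start - current[-1].end > max_gap:
            segments.append(current)
            current = []
        current.append(word)
        # Sentence-ending punctuation closes the segment after this word.
        if word.text.rstrip().endswith(tuple(punctuation)):
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


words = [
    Word("Hello", 0.0, 0.4), Word("there.", 0.45, 0.9),
    Word("How", 1.0, 1.2), Word("are", 1.25, 1.4),
    Word("you", 2.5, 2.8), Word("today?", 2.85, 3.3),
]
segments = regroup(words)
texts = [" ".join(w.text for w in seg) for seg in segments]
```

In this sample, the 1.1-second pause between "are" and "you" forces a split even though no punctuation occurs there, which is how gap-based and punctuation-based rules can combine.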

Quick Start & Requirements

  • Install: pip install -U stable-ts
  • Prerequisites: FFmpeg (available in PATH) and PyTorch (install a GPU-enabled build separately if you need GPU support).
  • Usage: stable-ts audio.mp3 -o audio.srt
  • Documentation: https://github.com/jianfch/stable-ts
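The usage command above writes an SRT subtitle file. For readers unfamiliar with that format, the sketch below shows how timed segments map to SRT blocks. It illustrates the file format only and is not stable-ts's own writer; `srt_timestamp` and `to_srt` are hypothetical helper names.

```python
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as numbered SRT blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Each block already ends with a newline; joining with another one
    # leaves the blank line that separates SRT entries.
    return "\n".join(blocks)


srt_text = to_srt([(0.0, 1.25, "Hello"), (1.5, 2.0, "world")])
```

Each entry consists of a running index, a `start --> end` line, and the text, with a blank line between entries.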

Highlighted Details

  • Timestamp Refinement: Offers methods like refine() and adjust_gaps() for precise timestamp tuning.
  • Regrouping: Advanced algorithms to restructure segments based on punctuation, gaps, length, or duration.
  • Alignment: Align existing text to audio with align() and align_words().
  • Multi-Model Support: Integrates with Whisper, Faster-Whisper, Hugging Face Transformers, and MLX.
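As a rough picture of what timestamp tuning against silence can look like, the sketch below trims a word's boundaries so they do not extend into silent stretches. It is a simplified illustration under stated assumptions, not the logic behind refine() or adjust_gaps(): the silent intervals are assumed to come from some voice activity detector, and `suppress_silence` is a hypothetical helper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def suppress_silence(word: Interval, silences: List[Interval]) -> Interval:
    """Trim a word's boundaries where they overlap the edge of a silent interval."""
    start, end = word
    for s_start, s_end in silences:
        # Word starts inside a silent stretch: push the start to where the silence ends.
        if s_start <= start < s_end < end:
            start = s_end
        # Word ends inside a silent stretch: pull the end back to where the silence begins.
        if start < s_start < end <= s_end:
            end = s_start
    return start, end


silences = [(0.5, 1.2), (1.8, 2.5)]
trimmed = suppress_silence((1.0, 2.0), silences)
untouched = suppress_silence((3.0, 3.5), silences)
```

A word that lies entirely inside a silent interval is left unchanged here, because there is no non-silent portion to snap to; a real implementation would need its own policy for that case.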

Maintenance & Community

The project is actively maintained by jianfch. Community support channels are not explicitly mentioned in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License permits use in commercial and closed-source applications.

Limitations & Caveats

Refinement with refine() is significantly slower with Faster-Whisper models than with standard Whisper models.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 2
  • Issues (30d): 1
  • Star history: 105 stars in the last 90 days
