CLI tool for bulk YouTube video transcription
This Python tool automates the transcription of YouTube playlists and individual videos using OpenAI's Whisper, with faster-whisper for local inference and optional spaCy integration for sentence splitting. It is designed for researchers, content creators, and educators who need to convert video audio into structured text with metadata.
How It Works
The tool downloads audio from YouTube videos using pytube, then transcribes it locally with faster-whisper (leveraging CUDA for GPU acceleration) or optionally via the OpenAI API. Transcripts are processed for sentence splitting using either spaCy or regex, and metadata like timestamps and confidence scores are generated. Output includes plain text, CSV, and JSON formats, plus an HTML reader for improved usability.
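As a rough illustration of the local path described above, the following sketch transcribes one audio file with faster-whisper, splits the text with a regex fallback, and writes plain text, CSV, and JSON. It is not the project's actual code: the function names, the output field names, and the use of avg_logprob as a confidence proxy are assumptions made for this example.

```python
import csv
import json
import re
from pathlib import Path

# Regex fallback: split after ., !, or ? when followed by whitespace.
# Deliberately naive; abbreviations such as "Dr. Smith" will also be split.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences with a simple regex."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def transcribe(audio_path, model_size="base", device="cpu"):
    """Transcribe one audio file locally and return a list of segment dicts."""
    # Imported lazily so the helpers in this sketch work without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device)
    segments, _info = model.transcribe(audio_path)
    return [
        {
            "start": round(seg.start, 2),
            "end": round(seg.end, 2),
            "text": seg.text.strip(),
            # Assumption: avg_logprob serves as a rough confidence score here.
            "avg_logprob": seg.avg_logprob,
        }
        for seg in segments
    ]


def write_outputs(records, out_dir, stem):
    """Write the same records as plain text, CSV, and JSON."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Plain text: one sentence per line.
    full_text = " ".join(r["text"] for r in records)
    sentences = split_sentences(full_text)
    (out / f"{stem}.txt").write_text("\n".join(sentences), encoding="utf-8")

    # CSV: one row per segment, with timestamps and the confidence proxy.
    fields = ["start", "end", "text", "avg_logprob"]
    with open(out / f"{stem}.csv", "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        writer.writerows(records)

    # JSON: the full list of segment records.
    (out / f"{stem}.json").write_text(json.dumps(records, indent=2), encoding="utf-8")
```

A call such as write_outputs(transcribe("audio.mp3"), "out", "video_title") would then produce the three files for one video; the file name and output directory here are placeholders.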
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Configure: set the video/playlist URL and transcription method in the script (bulk_transcribe_youtube_videos_from_playlist.py).
Run: python bulk_transcribe_youtube_videos_from_playlist.py
Dependencies: pytube, faster-whisper, spacy (optional), and optionally CUDA for GPU acceleration.
Highlighted Details
Supports faster-whisper (recommended for accuracy) and OpenAI API transcription.
Uses tqdm for progress tracking and asyncio for concurrent downloads.
Maintenance & Community
The project is maintained by Dicklesworthstone. Contributions are welcomed via pull requests on GitHub.
Licensing & Compatibility
Licensed under the MIT License. This permissive license allows for commercial use and integration into closed-source projects.
Limitations & Caveats
The OpenAI API transcription uses an older, less accurate Whisper model than the local faster-whisper implementation. GPU acceleration requires CUDA; without it, transcription falls back to the CPU.
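The CPU fallback can be made explicit when constructing the model. The sketch below is one possible way to choose a device, not necessarily how the tool does it: pick_device is a hypothetical helper, the compute types are illustrative choices, and it relies on ctranslate2 (the inference backend installed with faster-whisper) to report visible CUDA devices.

```python
def pick_device():
    """Return a (device, compute_type) pair suitable for WhisperModel."""
    try:
        # ctranslate2 is installed as a dependency of faster-whisper.
        import ctranslate2

        has_cuda = ctranslate2.get_cuda_device_count() > 0
    except Exception:
        # Missing package or an unusable CUDA setup: treat as CPU only.
        has_cuda = False
    # Illustrative defaults: half precision on GPU, 8-bit quantization on CPU.
    return ("cuda", "float16") if has_cuda else ("cpu", "int8")


device, compute_type = pick_device()
print(f"Using device={device}, compute_type={compute_type}")
```

The resulting pair could then be passed to faster-whisper as WhisperModel("base", device=device, compute_type=compute_type), where "base" is just an example model size.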