Gradio app for transcribing YouTube videos
This project provides a simple Gradio web application for transcribing YouTube videos. It targets users who need to quickly convert spoken content from YouTube videos into text, leveraging OpenAI's Whisper model for accurate speech-to-text conversion.
How It Works
The application extracts the audio stream from a given YouTube URL using the pytube library. The audio is then processed by OpenAI's Whisper model, an open-source automatic speech recognition system known for its robustness across varied audio conditions and languages. The resulting transcript is presented to the user through the Gradio interface.
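The pipeline described above can be sketched as follows. This is a minimal illustration, not the project's actual app.py: the function names (download_audio, transcribe_file, transcribe_url, build_app) and the choice of the "base" Whisper model are assumptions made for the example. The third-party imports are kept inside the functions so each step can be read, and swapped out, on its own.

```python
import tempfile


def download_audio(url: str, out_dir: str) -> str:
    """Download the audio-only stream of a YouTube video and return its file path."""
    from pytube import YouTube  # third-party package: pytube

    stream = YouTube(url).streams.filter(only_audio=True).first()
    if stream is None:
        raise RuntimeError(f"No audio stream found for {url}")
    return stream.download(output_path=out_dir)


def transcribe_file(audio_path: str, model_name: str = "base") -> str:
    """Run Whisper on a local audio file and return the transcript text."""
    import whisper  # third-party package: openai-whisper (requires FFmpeg)

    # The model size is an assumption; larger models are slower but more accurate.
    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"].strip()


def transcribe_url(url: str, downloader=download_audio, transcriber=transcribe_file) -> str:
    """Chain the two steps; the audio lives in a temporary directory that is cleaned up."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        audio_path = downloader(url, tmp_dir)
        return transcriber(audio_path)


def build_app():
    """Wrap the pipeline in a one-input, one-output Gradio interface."""
    import gradio as gr  # third-party package: gradio

    return gr.Interface(
        fn=lambda url: transcribe_url(url),
        inputs=gr.Textbox(label="YouTube URL"),
        outputs=gr.Textbox(label="Transcript"),
    )
```

Calling build_app().launch() would start the local web server. Passing the downloader and transcriber as parameters is only a design convenience for this sketch: it allows either step to be replaced, for example if pytube stops working against YouTube.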
Quick Start & Requirements
Create and activate the conda environment (conda env create -f environment.yml, conda activate yt-whisper), then run python app.py.
Highlighted Details
Maintenance & Community
No specific information on maintainers, community channels, or roadmap is provided in the README.
Licensing & Compatibility
Limitations & Caveats
The application relies on pytube for YouTube access, which can break when YouTube changes its site. FFmpeg must be installed and available on the system's PATH for the application to function.
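Since a missing FFmpeg binary tends to surface only as an error at transcription time, a quick check before launching can help. The sketch below uses Python's standard shutil.which; the helper name ffmpeg_on_path is hypothetical and not part of the project.

```python
import shutil


def ffmpeg_on_path(which=shutil.which) -> bool:
    """Return True if an `ffmpeg` executable can be found on the PATH."""
    return which("ffmpeg") is not None
```

A caller might test `ffmpeg_on_path()` at startup and exit with a clear message when it returns False. The `which` parameter exists only so the lookup can be replaced in tests.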