whisper-youtube by ArthurFDLR

Colab notebook for YouTube video transcription

Created 3 years ago

414 stars

Top 70.9% on SourcePulse

Project Summary

This repository provides a Google Colab notebook that transcribes YouTube videos with OpenAI's Whisper speech recognition model. It is aimed at users who need a transcript of a video quickly and with minimal setup.

How It Works

The notebook combines the whisper library with pytube, which handles video downloading. Users pick a Whisper model size (tiny, base, small, medium, or large) and an output format (e.g., .vtt). The notebook then downloads the YouTube video, processes its audio, and passes the audio to the chosen Whisper model for transcription.
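The flow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (vtt_timestamp, segments_to_vtt, transcribe_youtube), the default file names, and the hand-written .vtt formatting are all assumptions made for this example. It assumes the openai-whisper and pytube packages are installed and that ffmpeg is available, as it is on Colab.

```python
from pathlib import Path


def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments) -> str:
    """Render Whisper-style segments (dicts with start, end, text) as WebVTT."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        lines.append(f"{vtt_timestamp(seg['start'])} --> {vtt_timestamp(seg['end'])}")
        lines.append(seg["text"].strip())
        lines.append("")  # blank line ends each cue
    return "\n".join(lines)


def transcribe_youtube(url: str, model_size: str = "base",
                       out_path: str = "transcript.vtt") -> Path:
    """Hypothetical helper: download audio, transcribe it, write a .vtt file."""
    # Imported lazily so the formatting helpers work without these packages.
    from pytube import YouTube
    import whisper

    # Step 1: download the audio-only stream of the video.
    stream = YouTube(url).streams.filter(only_audio=True).first()
    audio_file = stream.download(filename="audio.mp4")

    # Step 2: load the chosen model size and transcribe the audio.
    model = whisper.load_model(model_size)
    result = model.transcribe(audio_file)

    # Step 3: write the timed segments out as subtitles.
    out = Path(out_path)
    out.write_text(segments_to_vtt(result["segments"]), encoding="utf-8")
    return out
```

Calling transcribe_youtube with a video URL and a model size such as "small" would write transcript.vtt to the working directory. The formatting helpers are kept separate from the download step so they can be checked without network access or a GPU.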

Quick Start & Requirements

Install/Run: Execute the provided Google Colab notebook.
Prerequisites: A Google account for Colab, and a GPU runtime enabled in Colab (T4, P100, or V100 recommended for speed).
Setup: Minimal setup within the Colab environment.
Links: Open in Colab

Highlighted Details

Supports multiple Whisper model sizes, from tiny to large.
Allows saving transcripts and audio to Google Drive.
Offers output in .vtt format, suitable for subtitles.
Demonstrates GPU utilization and VRAM requirements for different models.
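
As an illustration of how VRAM limits the choice of model size, the sketch below picks the largest model that fits the available GPU memory. The VRAM figures are the approximate values listed in the Whisper README and should be treated as rough guidance. The names APPROX_VRAM_GB, pick_model, and detect_vram_gb are hypothetical and do not come from the notebook.

```python
# Approximate VRAM needs in GB, per the Whisper README (rough guidance only).
# Ordered from smallest to largest model.
APPROX_VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def pick_model(free_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits the budget."""
    fitting = [name for name, need in APPROX_VRAM_GB.items() if need <= free_vram_gb]
    # Fall back to the smallest model when nothing fits (e.g. CPU-only runtime).
    return fitting[-1] if fitting else "tiny"


def detect_vram_gb() -> float:
    """Total memory of GPU 0 in GB, or 0.0 without PyTorch or a CUDA device."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 1024**3
```

For example, pick_model(detect_vram_gb()) on a 16 GB T4 runtime would select "large", while a runtime without a GPU falls back to "tiny".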

Maintenance & Community

The repository is maintained by ArthurFDLR. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The repository itself does not specify a license, so the terms for reusing the notebook's own code are unclear. It relies on the OpenAI Whisper library, which is released under the permissive MIT license. Commercial use of the Whisper model is therefore governed by that license.

Limitations & Caveats

The notebook is designed for Google Colab and may require adjustments for local execution. Transcription speed depends heavily on the selected GPU and the length of the video.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

3 stars in the last 30 days