CLI tool for transcribing online videos/podcasts to Markdown
yt2doc is a command-line tool that transcribes YouTube videos and Apple Podcasts episodes into readable Markdown documents. It focuses on post-processing transcriptions to improve readability, offering features like topic segmentation and chaptering for unchaptered content, making it ideal for researchers, content creators, and anyone needing structured summaries of audio-visual material.
How It Works
yt2doc leverages the Whisper ASR model for initial transcription and optionally integrates with local LLM servers (like Ollama) for advanced post-processing. For unchaptered content, it uses LLMs to segment transcripts into logical topics and generate headings, enhancing readability beyond raw Whisper output. It also supports sentence and paragraph segmentation via the Segment Any Text (SaT) model.
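The chaptering step described above can be pictured with a small sketch. This is not yt2doc's actual code: the names `Chapter`, `segment_into_chapters`, `to_markdown`, `starts_new_topic`, and `make_title` are hypothetical, and the two callbacks are simple stand-ins for the decisions that an LLM would make in the real tool. It only illustrates the general idea of grouping transcript paragraphs under generated headings and emitting Markdown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Chapter:
    """A heading plus the transcript paragraphs grouped under it."""
    title: str
    paragraphs: List[str] = field(default_factory=list)


def segment_into_chapters(
    paragraphs: List[str],
    starts_new_topic: Callable[[List[str], str], bool],
    make_title: Callable[[str], str],
) -> List[Chapter]:
    """Group paragraphs into chapters.

    `starts_new_topic` and `make_title` are hypothetical hooks; in a real
    pipeline they would be backed by an LLM rather than simple rules.
    """
    chapters: List[Chapter] = []
    for paragraph in paragraphs:
        if not chapters or starts_new_topic(chapters[-1].paragraphs, paragraph):
            # Open a new chapter whose heading is derived from this paragraph.
            chapters.append(Chapter(title=make_title(paragraph), paragraphs=[paragraph]))
        else:
            chapters[-1].paragraphs.append(paragraph)
    return chapters


def to_markdown(title: str, chapters: List[Chapter]) -> str:
    """Render a document title and its chapters as Markdown text."""
    lines = [f"# {title}", ""]
    for chapter in chapters:
        lines.append(f"## {chapter.title}")
        lines.append("")
        for paragraph in chapter.paragraphs:
            lines.append(paragraph)
            lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Rule-based stand-ins for the LLM decisions (assumptions for illustration only).
def starts_new_topic(previous: List[str], paragraph: str) -> bool:
    return paragraph.startswith("Next,")


def make_title(paragraph: str) -> str:
    return paragraph.split(".")[0]


paragraphs = [
    "Welcome to the show. Today we talk about sourdough.",
    "Sourdough starts with a starter of flour and water.",
    "Next, let us turn to ovens. A hot oven matters.",
    "Steam in the oven helps the crust.",
]

chapters = segment_into_chapters(paragraphs, starts_new_topic, make_title)
markdown = to_markdown("Baking episode", chapters)
print(markdown)
```

Passing the topic and title decisions in as callbacks keeps the grouping logic independent of whichever model answers them, so the rule-based stand-ins here could be swapped for calls to a local LLM server without changing the rest of the sketch.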
Quick Start & Requirements
pipx install yt2doc
or uv tool install yt2doc
Requires ffmpeg to be installed and available in the system's PATH.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats