CLI tool for YouTube full-text search and semantic analysis
yt-fts provides command-line tools for searching YouTube channel transcripts, enabling users to find specific keywords or phrases within videos. It supports both traditional full-text search and advanced semantic search using OpenAI embeddings, making it valuable for researchers, content creators, and anyone needing to quickly locate information within extensive video archives.
How It Works
The tool leverages yt-dlp to download subtitles for specified YouTube channels, storing them in a SQLite database for efficient querying. For semantic search, it integrates with the OpenAI API to generate embeddings for transcripts, which are then managed by ChromaDB. This dual approach allows for precise keyword matching and contextually relevant semantic retrieval, with an LLM chat interface for interactive Q&A powered by the retrieved information.
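The two retrieval paths described above can be sketched in a few lines. This is a minimal illustration, not yt-fts's actual code: the table layout, the helper names (build_index, keyword_search, semantic_search) and the three-dimensional toy vectors are all assumptions made for the example. It also assumes the local SQLite build includes the FTS5 extension, and it uses hand-written vectors in place of real OpenAI embeddings and a plain list in place of ChromaDB.

```python
import math
import sqlite3

# Hypothetical transcript rows: (video_id, start_time, text).
# This is NOT yt-fts's real schema, only an illustration.
SUBTITLES = [
    ("vid001", "00:00:05", "today we talk about rocket engines and fuel"),
    ("vid001", "00:01:10", "the launch window depends on orbital mechanics"),
    ("vid002", "00:00:42", "baking bread requires patience and good flour"),
]

# Toy stand-ins for embeddings: (text, vector). Real embeddings would come
# from an embedding model and have far more dimensions.
TOY_EMBEDDINGS = [
    (SUBTITLES[0][2], [0.9, 0.1, 0.0]),
    (SUBTITLES[1][2], [0.7, 0.3, 0.1]),
    (SUBTITLES[2][2], [0.0, 0.1, 0.9]),
]


def build_index(rows):
    """Load subtitle rows into an in-memory SQLite FTS5 table."""
    conn = sqlite3.connect(":memory:")
    # video_id and start_time are stored but not tokenized (UNINDEXED).
    conn.execute(
        "CREATE VIRTUAL TABLE subtitles USING fts5("
        "video_id UNINDEXED, start_time UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO subtitles VALUES (?, ?, ?)", rows)
    return conn


def keyword_search(conn, query):
    """Return rows matching an FTS5 query, ordered by FTS5's built-in rank."""
    cur = conn.execute(
        "SELECT video_id, start_time, text FROM subtitles "
        "WHERE subtitles MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, embedded_rows, top_k=2):
    """Rank (text, vector) pairs by cosine similarity to query_vec, highest first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in embedded_rows]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]


if __name__ == "__main__":
    conn = build_index(SUBTITLES)
    print(keyword_search(conn, '"launch window"'))
    print(semantic_search([1.0, 0.0, 0.0], TOY_EMBEDDINGS, top_k=1))
```

Keyword search only returns rows that contain the query terms, while the similarity ranking can surface a passage that shares no words with the query, which is the practical difference between the two modes.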
Quick Start & Requirements
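Install with pip: pip install yt-fts
Semantic search and LLM features need an OpenAI API key (set via the OPENAI_API_KEY environment variable or the --openai-api-key flag).
Browser cookies can be supplied when downloading (--cookies-from-browser).
Highlighted Details
Parallel download jobs (--jobs) for faster ingestion.
Maintenance & Community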
The project is maintained by NotJoeMartinez. Community interaction channels are not explicitly listed in the README.
Licensing & Compatibility
The project appears to be licensed under the MIT License, allowing for commercial use and integration with closed-source projects.
Limitations & Caveats
Semantic search and LLM features are dependent on the OpenAI API, incurring potential costs. The update command currently only refreshes full-text search data, not semantic embeddings. Search strings for full-text search are limited to 40 characters.