youtube-shorts-pipeline  by rushindrasinha

News-to-YouTube Shorts automation pipeline

Created 1 month ago
1,581 stars

Top 26.0% on SourcePulse

Summary

This project automates the creation and uploading of YouTube Shorts from a single topic. It targets content creators and marketers seeking to scale video production by handling the entire workflow: research, scripting, AI-generated visuals, voiceovers, captions, music, and final upload. The primary benefit is transforming a topic into a published Short in minutes, significantly reducing manual effort.

How It Works

The pipeline orchestrates a series of AI-driven stages. It begins with Draft, using DuckDuckGo for research and Claude for script generation, b-roll prompts, and YouTube metadata. The Produce stage leverages Gemini Imagen for AI visuals (with Ken Burns effect), ElevenLabs for voiceovers, and Whisper for word-timed captions (ASS/SRT). It then integrates royalty-free background music with automatic voice ducking before assembling the final video using ffmpeg. Finally, the Upload stage publishes the Short to YouTube with metadata and captions. This approach integrates multiple specialized AI services into a cohesive, automated content factory.
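
To make the word-timed caption step more concrete, here is a minimal sketch of turning per-word timings into SRT cues. It assumes the transcription step yields (text, start, end) tuples in seconds; the names srt_timestamp and words_to_srt and the four-words-per-cue grouping are hypothetical and are not taken from the project.

```python
from typing import List, Tuple

# Assumed input shape: (text, start_seconds, end_seconds) for each spoken word.
Word = Tuple[str, float, float]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 4) -> str:
    """Group word timings into short cues and render them as SRT text."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        # A cue spans from the first word's start to the last word's end.
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(word[0] for word in chunk)
        index = len(cues) + 1  # SRT cue numbers start at 1
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Short cues of a few words each suit the vertical Shorts format; the per-word highlighting described for the ASS subtitles would need additional styling that this sketch does not attempt.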

Quick Start & Requirements

Install the dependencies with pip install -r requirements.txt, then start the pipeline with python -m pipeline run --news "your topic here" --dry-run; the first run launches a setup wizard. Required credentials are API keys for Anthropic (Claude) and Google Gemini (Imagen), plus YouTube OAuth credentials. An ElevenLabs API key is optional for TTS. The estimated cost is about $0.11 per video. Official setup and troubleshooting guides are available at references/setup.md and references/troubleshooting.md.
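
Before a first run it can help to confirm that the required credentials are present. The sketch below is only an illustration: the environment variable names are assumptions, and the project's setup wizard may store keys elsewhere (for example in a config file), so adjust the names to match your setup.

```python
import os
from typing import List, Mapping, Sequence

# Assumed variable names; not confirmed by the project's documentation.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY")
OPTIONAL_KEYS = ("ELEVENLABS_API_KEY",)  # only used for TTS voiceovers


def missing_keys(env: Mapping[str, str],
                 names: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Missing required keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
    if missing_keys(os.environ, OPTIONAL_KEYS):
        print("Optional ElevenLabs key is not set.")
```

YouTube OAuth credentials are not covered here, since they are normally obtained through an interactive consent flow rather than a single key.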

Highlighted Details

  • Topic Discovery Engine: Aggregates trending topics from Reddit, RSS feeds, Google Trends, Twitter/X, and TikTok.
  • AI Content Generation: Employs Claude for scripting and prompts, Gemini Imagen for b-roll visuals and thumbnails, and ElevenLabs for voiceovers.
  • Advanced Video Assembly: Features burned-in, word-highlighted ASS subtitles, automatic background music with voice ducking, and SRT caption generation.
  • Robustness & Resumability: Includes exponential backoff for API calls, structured logging, a comprehensive test suite (78 tests), and state tracking to resume pipeline execution from the point of failure.
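
The robustness features in the last bullet can be illustrated with a small sketch that combines exponential backoff with a state file for resuming. This is not the project's implementation: with_backoff, run_stages, the JSON state layout, and the retry defaults are all hypothetical choices made for illustration.

```python
import json
import random
import time
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def with_backoff(call: Callable[[], object], retries: int = 4,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> object:
    """Run call(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:  # real code should catch only transient API errors
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


def run_stages(stages: Iterable[Tuple[str, Callable[[], object]]],
               state_path: Path,
               sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Run (name, fn) stages in order, skipping stages already recorded as done."""
    if state_path.exists():
        state = json.loads(state_path.read_text())
    else:
        state = {"done": []}
    for name, fn in stages:
        if name in state["done"]:
            continue  # completed in an earlier run
        with_backoff(fn, sleep=sleep)
        state["done"].append(name)
        # Persist after every stage so a crash loses at most the current stage.
        state_path.write_text(json.dumps(state))
    return state["done"]
```

With stages such as ("draft", ...), ("produce", ...), and ("upload", ...), a failure during upload would leave the first two recorded in the state file, so a second call to run_stages would go straight to the upload step.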

Maintenance & Community

The provided README does not detail specific contributors, sponsorships, or community channels like Discord or Slack.

Licensing & Compatibility

The project is released under the MIT license, which permits commercial use and integration into closed-source applications, provided the copyright and license notice are retained.

Limitations & Caveats

Operation depends on obtaining and configuring multiple paid API keys (Anthropic, Gemini), so every video carries an operational cost. Setup also involves managing Python dependencies and API credentials. Although the Claude scripting step includes an "anti-hallucination gate," AI-generated content is inherently variable and can produce unexpected output, so scripts and visuals may still require human review before publishing.

Health Check

  • Last Commit: 6 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 5
  • Issues (30d): 3
  • Star History: 1,433 stars in the last 30 days
