jipraks
AI-powered tool for YouTube short-form video generation
Top 72.1% on SourcePulse
This project addresses the challenge of efficiently repurposing long-form YouTube content into engaging short-form videos for platforms like TikTok, Instagram Reels, and YouTube Shorts. It targets content creators and media professionals seeking to automate the time-consuming process of identifying highlights, editing, and formatting clips, thereby increasing content reach and engagement with minimal manual effort.
How It Works
The pipeline uses AI to automate the transformation. It first downloads the YouTube video and its subtitles with yt-dlp. GPT-4 then analyzes the transcript to identify engaging segments of 60-120 seconds and generates hook text for each intro. Each segment is clipped, converted to a 9:16 aspect ratio with speaker tracking (using OpenCV or MediaPipe), and given an AI-generated TTS voiceover for the hook. Finally, the Whisper API generates word-by-word highlighted captions and FFmpeg burns them into the video, producing ready-to-publish clips with SEO-optimized metadata.
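The clipping and reframing step can be illustrated with a short Python sketch. It is not the project's actual code, which this summary does not show. The function names (`is_valid_segment`, `crop_window_9x16`, `ffmpeg_clip_command`) are hypothetical, and the sketch assumes a full-height crop centered on a speaker position that some detector (OpenCV or MediaPipe) has already produced. Only standard FFmpeg options (`-ss`, `-to`, `-i`, `-vf crop=w:h:x:y`) are used.

```python
from typing import List, Tuple

# Segment length bounds taken from the description above (60-120 seconds).
MIN_SEGMENT_S = 60.0
MAX_SEGMENT_S = 120.0


def is_valid_segment(start: float, end: float) -> bool:
    """Return True if the segment length falls within 60-120 seconds."""
    return MIN_SEGMENT_S <= (end - start) <= MAX_SEGMENT_S


def crop_window_9x16(frame_w: int, frame_h: int, center_x: float) -> Tuple[int, int, int, int]:
    """Return (w, h, x, y) for a full-height 9:16 crop centered on center_x.

    The window is clamped so it never extends past the frame edges.
    """
    crop_w = int(frame_h * 9 / 16)
    crop_w -= crop_w % 2            # keep the width even for common encoders
    crop_w = min(crop_w, frame_w)   # guard against already-narrow sources
    x = int(round(center_x - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))
    return crop_w, frame_h, x, 0


def ffmpeg_clip_command(src: str, dst: str, start: float, end: float,
                        crop: Tuple[int, int, int, int]) -> List[str]:
    """Build (but do not run) an FFmpeg command that cuts and crops one clip."""
    w, h, x, y = crop
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}", "-to", f"{end:.3f}",  # input-side trim
        "-i", src,
        "-vf", f"crop={w}:{h}:{x}:{y}",              # static 9:16 window
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical values: a 1080p source with the speaker near the frame center.
    window = crop_window_9x16(1920, 1080, center_x=960)
    if is_valid_segment(10.0, 100.0):
        print(" ".join(ffmpeg_clip_command("in.mp4", "out.mp4", 10.0, 100.0, window)))
```

A static crop window is the simplest case. Tracking a moving speaker would require recomputing the window over time, which this sketch leaves out.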
Quick Start & Requirements
Install the prerequisites (Python 3.10+, FFmpeg 4.4+, yt-dlp), install Python dependencies (pip install -r requirements.txt), and run python app.py.
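Before running the app, it can help to confirm that the listed prerequisites are present. The following is a minimal sketch, not part of the project. The helper names (`parse_ffmpeg_version`, `missing_prerequisites`) are hypothetical, and it assumes that `ffmpeg` and `yt-dlp` are installed as executables on the PATH.

```python
import re
import shutil
import sys
from typing import Callable, List, Optional, Tuple


def parse_ffmpeg_version(first_line: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the first line of `ffmpeg -version`.

    Returns None for builds that do not report a numeric release version.
    """
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", first_line)
    return (int(match.group(1)), int(match.group(2))) if match else None


def missing_prerequisites(
    python_version: Tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of prerequisites that appear to be missing."""
    problems: List[str] = []
    if python_version < (3, 10):
        problems.append("Python 3.10+")
    # Assumption: both tools are expected as executables on the PATH.
    for tool in ("ffmpeg", "yt-dlp"):
        if which(tool) is None:
            problems.append(tool)
    return problems


if __name__ == "__main__":
    missing = missing_prerequisites()
    print("All prerequisites found." if not missing else f"Missing: {missing}")
```

The FFmpeg 4.4+ requirement can then be checked by passing the first line of `ffmpeg -version` to `parse_ffmpeg_version` and comparing the result against `(4, 4)`.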
Maintenance & Community
Contributions are welcomed via GitHub issues for bug reports and feature requests, and pull requests for code improvements. A detailed CONTRIBUTING.md guide is available, including a Git tutorial for beginners.
Licensing & Compatibility
This project is licensed under the MIT License. The disclaimer specifies the tool is for personal/educational use only, and users must ensure they have the rights to the content they process and respect YouTube's Terms of Service.
Limitations & Caveats
The project states that it is intended for personal and educational use only; users remain responsible for content rights and for complying with YouTube's Terms of Service. The MediaPipe face detection mode is noted as being 2-3 times slower than the OpenCV alternative.