short-video-maker by gyoridavid

CLI tool for automated short-form video creation

Created 5 months ago
696 stars

Top 49.0% on SourcePulse

Project Summary

This project provides an open-source, automated tool for creating short-form videos for platforms like TikTok and Instagram Reels, serving as a free alternative to expensive API services and GPU-intensive generation tools. It targets content creators and developers looking to automate video production from text inputs.

How It Works

The tool orchestrates several components to generate videos: Kokoro TTS converts input text to speech, Whisper.cpp generates accurate captions from the audio, Pexels API provides background video clips based on search terms, and Remotion composes these elements into a final video with styled captions. This approach allows for programmatic video creation without requiring users to manually source assets or perform complex editing.
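
The data flow described above can be sketched as a small orchestration function. This is only an illustration of the stage order, not the project's actual code: the names Scene, Caption, and build_short are hypothetical, and each stage is passed in as a plain callable standing in for Kokoro TTS, Whisper.cpp, the Pexels lookup, and Remotion.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Caption:
    text: str
    start_ms: int
    end_ms: int


@dataclass
class Scene:
    text: str                # narration for this scene
    search_terms: List[str]  # keywords for the background clip lookup


def build_short(
    scenes: List[Scene],
    synthesize: Callable[[str], bytes],            # stands in for Kokoro TTS
    transcribe: Callable[[bytes], List[Caption]],  # stands in for Whisper.cpp
    find_clip: Callable[[List[str]], str],         # stands in for the Pexels lookup
    compose: Callable[[List[dict]], str],          # stands in for Remotion
) -> str:
    """Run every scene through the first three stages, then compose once."""
    prepared = []
    for scene in scenes:
        audio = synthesize(scene.text)            # text -> speech
        captions = transcribe(audio)              # speech -> timed captions
        clip = find_clip(scene.search_terms)      # keywords -> background clip
        prepared.append({"audio": audio, "captions": captions, "clip": clip})
    return compose(prepared)                      # all scenes -> final video
```

Passing the stages as callables keeps the sketch runnable without any of the real components installed; in practice each callable would wrap the corresponding tool.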

Quick Start & Requirements

  • Install/Run: Docker is recommended.
    • docker run -it --rm --name short-video-maker -p 3123:3123 -e LOG_LEVEL=debug -e PEXELS_API_KEY=<your_pexels_api_key> gyoridavid/short-video-maker:latest-tiny
  • Prerequisites:
    • NPM (if not using Docker)
    • Docker (recommended)
    • Pexels API Key (free)
    • Linux (Ubuntu ≥ 22.04) or macOS. Windows is not supported.
    • Minimum 3GB RAM (4GB recommended), 2 vCPUs, 5GB disk space.
    • For GPU acceleration (Whisper.cpp), an NVIDIA GPU and CUDA are required (use the latest-cuda Docker image).
  • Resources: Official Docker images and documentation are available.
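
Before pulling the image, it can help to confirm that the host meets the minimums listed above (3GB RAM, 2 vCPUs, 5GB disk). The following is a minimal sketch using only the Python standard library; the function names check_requirements and probe_host are hypothetical, and the probe assumes Linux or macOS, matching the supported platforms.

```python
import os
import shutil

MIN_RAM_GB = 3    # minimum from the prerequisites (4GB is recommended)
MIN_CPUS = 2
MIN_DISK_GB = 5


def check_requirements(ram_gb: float, cpus: int, free_disk_gb: float) -> list:
    """Return a list of problems; an empty list means the host qualifies."""
    problems = []
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB < {MIN_RAM_GB} GB")
    if cpus < MIN_CPUS:
        problems.append(f"CPUs: {cpus} < {MIN_CPUS}")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"disk: {free_disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    return problems


def probe_host(path: str = "/") -> tuple:
    """Measure total RAM, CPU count, and free disk space at `path`."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    cpus = os.cpu_count() or 1
    free_bytes = shutil.disk_usage(path).free
    gib = 1024 ** 3
    return ram_bytes / gib, cpus, free_bytes / gib
```

Calling check_requirements(*probe_host()) returns the list of shortfalls for the current machine. Note that the probe reports the host's physical memory, so a memory limit applied to a container would not be reflected.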

Highlighted Details

  • Supports both a REST API and the Model Context Protocol (MCP) for integration with AI agents and automation platforms such as n8n.
  • Offers customization for music genre/mood, voice, caption styling, and video orientation.
  • Includes a basic health check endpoint and status endpoints for video generation jobs.
  • Provides endpoints to list available voices and music tags.
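
As a rough illustration of how a client might talk to the REST API, here is a sketch built on the standard library. Port 3123 comes from the docker run command; everything else is an assumption that this summary does not confirm, including the request field names (scenes, text, searchTerms, config, orientation, music, voice) and any endpoint paths. Check the project's documentation before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3123"  # port taken from the docker run command


def build_payload(scenes, orientation="portrait", music=None, voice=None):
    """Assemble a request body. Field names are assumed, not confirmed."""
    config = {"orientation": orientation}
    if music is not None:
        config["music"] = music
    if voice is not None:
        config["voice"] = voice
    return {
        "scenes": [
            {"text": text, "searchTerms": list(terms)} for text, terms in scenes
        ],
        "config": config,
    }


def endpoint(path, base_url=BASE_URL):
    """Join the base URL and a path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def post_json(url, body, timeout=30):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def get_json(url, timeout=30):
    """GET a URL and return the decoded JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the container running, a job could be submitted with something like post_json(endpoint("/api/short-video"), build_payload([("Hello", ["river"])])), and the health check, voice list, and music tag list read with get_json. The path /api/short-video is again an assumed example.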

Maintenance & Community

The project was open-sourced by the AI Agents A-Z YouTube Channel. Contributions are welcome via Pull Requests.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • Currently supports English voiceovers only, due to limitations in kokoro-js.
  • Background videos are sourced exclusively from Pexels.
  • Does not support stitching user-provided images or videos.
  • Remotion rendering is CPU-intensive; GPU acceleration is limited to Whisper.cpp.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 3
  • Star history: 108 stars in the last 30 days
