short-video-maker  by gyoridavid

CLI tool for automated short-form video creation

created 3 months ago
554 stars

Top 58.7% on sourcepulse

Project Summary

This project provides an open-source, automated tool for creating short-form videos for platforms like TikTok and Instagram Reels, serving as a free alternative to expensive API services and GPU-intensive generation tools. It targets content creators and developers looking to automate video production from text inputs.

How It Works

The tool orchestrates several components to generate videos: Kokoro TTS converts input text to speech, Whisper.cpp generates accurate captions from the audio, Pexels API provides background video clips based on search terms, and Remotion composes these elements into a final video with styled captions. This approach allows for programmatic video creation without requiring users to manually source assets or perform complex editing.
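The order of the four stages described above can be sketched as a small pipeline. This is only an illustration of the data flow, not the project's actual code: the names `Scene`, `RenderedScene`, `build_scene`, and `make_video` are hypothetical, and the stage functions are passed in as plain callables so that stubs can stand in for Kokoro, Whisper.cpp, Pexels, and Remotion.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    text: str                # narration text for this scene
    search_terms: List[str]  # terms used to look up a background clip


@dataclass
class RenderedScene:
    audio: str           # stand-in for the synthesized speech
    captions: List[str]  # stand-in for the captions derived from the audio
    clip: str            # stand-in for the background clip


def build_scene(scene: Scene,
                tts: Callable[[str], str],
                transcribe: Callable[[str], List[str]],
                find_clip: Callable[[List[str]], str]) -> RenderedScene:
    """Run the per-scene stages in the order the text describes."""
    audio = tts(scene.text)               # 1. text to speech
    captions = transcribe(audio)          # 2. captions from the audio
    clip = find_clip(scene.search_terms)  # 3. background clip from search terms
    return RenderedScene(audio, captions, clip)


def make_video(scenes: List[Scene], tts, transcribe, find_clip, compose) -> str:
    """Build every scene, then hand the results to the composition stage."""
    rendered = [build_scene(s, tts, transcribe, find_clip) for s in scenes]
    return compose(rendered)              # 4. compose the final video


def demo() -> str:
    """Wire the pipeline with trivial stubs (no real TTS, search, or rendering)."""
    tts = lambda text: f"speech<{text}>"
    transcribe = lambda audio: audio[len("speech<"):-1].split()
    find_clip = lambda terms: f"clip<{terms[0]}>"
    compose = lambda rendered: " | ".join(
        f"{r.clip}+{r.audio}+{len(r.captions)} captions" for r in rendered
    )
    return make_video([Scene("hello world", ["ocean"])],
                      tts, transcribe, find_clip, compose)


print(demo())
```

Passing the stages in as callables keeps the sketch independent of any particular library, and makes it clear that captions depend on the generated audio while the background clip depends only on the search terms.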

Quick Start & Requirements

  • Install/Run: Docker is recommended.
    • docker run -it --rm --name short-video-maker -p 3123:3123 -e LOG_LEVEL=debug -e PEXELS_API_KEY=<your_pexels_api_key> gyoridavid/short-video-maker:latest-tiny
  • Prerequisites:
    • NPM (if not using Docker)
    • Docker (recommended)
    • Pexels API Key (free)
    • Linux (Ubuntu ≥ 22.04) or macOS. Windows is not supported.
    • Minimum 3GB RAM (4GB recommended), 2 vCPUs, 5GB disk space.
    • For GPU acceleration (Whisper.cpp), an NVIDIA GPU and CUDA are required (use latest-cuda Docker image).
  • Resources: Official Docker images and documentation are available.
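As a convenience, the Docker invocation above can be assembled programmatically when switching between the CPU and CUDA images. The helper below is a hypothetical sketch: `docker_run_args` is not part of the project, and passing `--gpus all` for the `latest-cuda` image is an assumption based on Docker's standard GPU flag rather than something stated here.

```python
import shlex
from typing import List

IMAGE = "gyoridavid/short-video-maker"
CONTAINER_PORT = 3123  # port exposed in the docker command above


def docker_run_args(pexels_api_key: str,
                    use_gpu: bool = False,
                    host_port: int = 3123,
                    log_level: str = "debug") -> List[str]:
    """Return the argument list for `docker run` (hypothetical helper)."""
    tag = "latest-cuda" if use_gpu else "latest-tiny"
    args = [
        "docker", "run", "-it", "--rm",
        "--name", "short-video-maker",
        "-p", f"{host_port}:{CONTAINER_PORT}",
        "-e", f"LOG_LEVEL={log_level}",
        "-e", f"PEXELS_API_KEY={pexels_api_key}",
    ]
    if use_gpu:
        # Assumption: the CUDA image needs the container to see the GPU.
        args += ["--gpus", "all"]
    args.append(f"{IMAGE}:{tag}")
    return args


# Print a copy-pasteable command line; nothing is executed here.
print(shlex.join(docker_run_args("YOUR_PEXELS_API_KEY")))
```

The list form can be handed directly to `subprocess.run` without shell quoting concerns, while `shlex.join` produces a string suitable for pasting into a terminal.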

Highlighted Details

  • Supports both a REST API and the Model Context Protocol (MCP), so it can be driven by AI agents and by automation platforms such as n8n.
  • Offers customization for music genre/mood, voice, caption styling, and video orientation.
  • Includes a basic health check endpoint and status endpoints for video generation jobs.
  • Provides endpoints to list available voices and music tags.
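To make the REST integration more concrete, here is a minimal client sketch. The endpoint paths (`/api/short-video`, `/api/short-video/<id>/status`) and the JSON field names (`scenes`, `searchTerms`, `config`, `orientation`, `music`, `voice`) are assumptions made for illustration and are not confirmed by this summary; check the project's documentation before relying on them. Only the port, 3123, comes from the Docker command in Quick Start.

```python
import json
import urllib.request
from typing import Dict, List, Optional, Tuple

BASE_URL = "http://localhost:3123"  # port from the Quick Start docker command


def build_payload(scenes: List[Tuple[str, List[str]]],
                  orientation: str = "portrait",
                  music: Optional[str] = None,
                  voice: Optional[str] = None) -> Dict:
    """Build a request body; all field names here are assumed, not documented."""
    config: Dict[str, str] = {"orientation": orientation}
    if music is not None:
        config["music"] = music
    if voice is not None:
        config["voice"] = voice
    return {
        "scenes": [
            {"text": text, "searchTerms": list(terms)} for text, terms in scenes
        ],
        "config": config,
    }


def status_url(video_id: str, base_url: str = BASE_URL) -> str:
    """URL for polling a generation job (assumed path)."""
    return f"{base_url}/api/short-video/{video_id}/status"


def submit(payload: Dict, base_url: str = BASE_URL) -> Dict:
    """POST the payload to the assumed creation endpoint and return parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/api/short-video",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the payload needs no running server; submit() does.
payload = build_payload([("Hello from the sketch", ["ocean", "waves"])], music="chill")
print(json.dumps(payload, indent=2))
```

A caller would typically submit the payload once and then poll the status URL until the job reports completion, since rendering is CPU-intensive and not instantaneous.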

Maintenance & Community

The project was open-sourced by the AI Agents A-Z YouTube Channel. Contributions are welcome via Pull Requests.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • Currently supports English voiceovers only due to Kokoro.js limitations.
  • Background videos are sourced exclusively from Pexels.
  • Does not support stitching user-provided images or videos.
  • Remotion rendering is CPU-intensive; GPU acceleration is limited to Whisper.cpp.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 4
  • Issues (30d): 3

Star History

  • 375 stars in the last 90 days
