auto-video-generateor by kuangdd2024

Automatic video generator from a topic

Created 1 year ago
776 stars

Top 45.1% on SourcePulse

Project Summary

This project provides an automated video generation system that takes a topic as input and produces a narrated explainer video. It's designed for users who want to quickly create educational or informational content without manual editing, leveraging LLMs for scriptwriting, TTS for narration, and text-to-image models for visuals.

How It Works

The system orchestrates a multi-stage pipeline: first, a large language model (LLM) generates a story or narration based on the user-provided topic. This text is then segmented, and a text-to-speech (TTS) engine converts it into audio. Concurrently, a text-to-image model generates visuals that match the narrative content. Finally, these audio and image assets are combined to produce the final explainer video. The project emphasizes a "free" approach, utilizing open-source or free-tier services where possible.
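The stages above can be sketched as a small orchestration loop. This is a minimal illustration rather than the project's actual code: the `Scene` type, `split_into_segments`, `build_scenes`, and the three callables are hypothetical names, the sentence splitter is deliberately naive, and the audio and image steps run one after the other here for simplicity.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    text: str        # one narration segment
    audio_path: str  # file written by the TTS step for this segment
    image_path: str  # file written by the text-to-image step for this segment


def split_into_segments(script: str) -> List[str]:
    """Naively split narration after ., !, ? and their full-width CJK forms."""
    parts = re.split(r"(?<=[.!?。！？])\s*", script)
    return [p.strip() for p in parts if p.strip()]


def build_scenes(
    topic: str,
    write_script: Callable[[str], str],     # hypothetical LLM wrapper
    synthesize: Callable[[str, int], str],  # hypothetical TTS wrapper, returns a path
    illustrate: Callable[[str, int], str],  # hypothetical image wrapper, returns a path
) -> List[Scene]:
    """Run topic -> script -> segments -> (audio, image) for each segment."""
    script = write_script(topic)
    scenes = []
    for index, text in enumerate(split_into_segments(script)):
        scenes.append(Scene(text, synthesize(text, index), illustrate(text, index)))
    return scenes


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without any external service.
    demo = build_scenes(
        "cats",
        write_script=lambda topic: f"This video is about {topic}. Enjoy!",
        synthesize=lambda text, i: f"audio_{i}.mp3",
        illustrate=lambda text, i: f"image_{i}.png",
    )
    for scene in demo:
        print(scene)
```

Passing the LLM, TTS, and image steps in as callables keeps the loop independent of any particular provider, which suits a pipeline that can swap between paid and free services.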

Quick Start & Requirements

  • Run via python main.py <parameter> (e.g., python main.py 4).
  • Access the Gradio interface at http://127.0.0.1:8000/.
  • Dependencies include Python and the third-party packages pyttsx3, pillow, and moviepy (re ships with the Python standard library). The LLM, TTS, and text-to-image services may each require an API key or provider-specific configuration (e.g., Baidu Qianfan).
  • A demo is available at http://avg.kddbot.com/.
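Since pyttsx3 is among the listed dependencies, offline narration could look roughly like the sketch below. It is an assumption about usage, not the project's implementation: `narrate_segments`, the `segment_000.wav` naming, and the `.wav` extension are hypothetical, and the audio format pyttsx3 actually writes depends on the platform's speech driver.

```python
from pathlib import Path
from typing import List


def narrate_segments(segments: List[str], out_dir: str, engine=None) -> List[Path]:
    """Queue one audio file per text segment and render them in a single pass."""
    if engine is None:
        # Imported lazily so a stand-in engine can be supplied when pyttsx3
        # or a system speech driver is not available.
        import pyttsx3
        engine = pyttsx3.init()

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    paths = []
    for index, text in enumerate(segments):
        path = target / f"segment_{index:03d}.wav"  # hypothetical naming scheme
        engine.save_to_file(text, str(path))        # queues the synthesis job
        paths.append(path)

    engine.runAndWait()  # processes everything queued above
    return paths


if __name__ == "__main__":
    for p in narrate_segments(["Hello.", "This is a test."], "narration_demo"):
        print(p)
```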

Highlighted Details

  • Supports a "free" generation mode using free resources.
  • Offers a "review and generate" mode for editing text, audio, and image assets before final video creation.
  • Integrates with Baidu Qianfan for LLM (ERNIE series) and image generation.
  • Utilizes edge-tts as a free alternative for TTS.
  • Saves generated multimedia materials in a structured directory.
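The summary does not describe the directory structure itself, so the following is only one plausible layout, sketched to show the idea of keeping each run's assets apart. The folder names in `ASSET_KINDS` and the helpers `slugify` and `prepare_run_dirs` are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict

# Hypothetical folder names; the real project may organize assets differently.
ASSET_KINDS = ("text", "audio", "image", "video")


def slugify(topic: str) -> str:
    """Turn a topic into a filesystem-friendly folder name."""
    slug = re.sub(r"[^\w]+", "-", topic.strip().lower()).strip("-")
    return slug or "untitled"


def prepare_run_dirs(root: str, topic: str) -> Dict[str, Path]:
    """Create <root>/<topic-slug>/<kind>/ for every asset kind and return the paths."""
    base = Path(root) / slugify(topic)
    dirs = {}
    for kind in ASSET_KINDS:
        path = base / kind
        path.mkdir(parents=True, exist_ok=True)
        dirs[kind] = path
    return dirs


if __name__ == "__main__":
    for kind, path in prepare_run_dirs("materials_demo", "How Volcanoes Erupt!").items():
        print(kind, "->", path)
```

Keeping text, audio, and images in separate folders per topic would make it straightforward to reload or replace a single asset before re-rendering the video.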

Maintenance & Community

The project has 776 stars on GitHub, which indicates community interest. The README links to Baidu Cloud documentation for the models it uses. The demo can also be tried through the WeChat public account "趣聊机器人".

Licensing & Compatibility

The README does not explicitly state a license. The project utilizes various components, some of which may have their own licenses (e.g., Baidu Qianfan services, edge-tts). Compatibility for commercial use would depend on the licenses of the underlying AI services and libraries used.

Limitations & Caveats

Video generation can be slow; users may need to wait, or use "load parameters" and "load resources" to access pre-generated content. Issues with text-based images might cause backend generation errors. The "review and generate" workflow requires completing a full generation pass first, including "create record," before individual assets can be edited. Some features, such as batch downloading and subtitle integration, are still on the to-do list.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 18 stars in the last 30 days
