auto-video-generateor by kuangdd2024

Automatic video generator from a topic

created 11 months ago
751 stars

Top 47.2% on sourcepulse

Project Summary

This project provides an automated video generation system that takes a topic as input and produces a narrated explainer video. It's designed for users who want to quickly create educational or informational content without manual editing, leveraging LLMs for scriptwriting, TTS for narration, and text-to-image models for visuals.

How It Works

The system orchestrates a multi-stage pipeline: first, a large language model (LLM) generates a story or narration based on the user-provided topic. This text is then segmented, and a text-to-speech (TTS) engine converts it into audio. Concurrently, a text-to-image model generates visuals that match the narrative content. Finally, these audio and image assets are combined to produce the final explainer video. The project emphasizes a "free" approach, utilizing open-source or free-tier services where possible.
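The stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the names `Scene`, `split_narration`, and `build_scenes` are hypothetical, the LLM, TTS, and image steps are passed in as plain callables, and the sketch runs them one after another rather than concurrently.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One narrated segment and its asset paths (hypothetical structure)."""
    text: str
    audio_path: str
    image_path: str


def split_narration(script: str) -> List[str]:
    """Split a script into sentences after Western or CJK end punctuation.

    Deliberately naive: it would also split inside decimals such as "3.14".
    """
    parts = re.split(r"(?<=[.!?。！？])\s*", script)
    return [p.strip() for p in parts if p.strip()]


def build_scenes(
    topic: str,
    write_script: Callable[[str], str],       # stands in for the LLM call
    synthesize: Callable[[str, int], str],    # stands in for TTS, returns an audio path
    illustrate: Callable[[str, int], str],    # stands in for text-to-image, returns an image path
) -> List[Scene]:
    """Run the script -> segments -> audio/image stages for one topic."""
    script = write_script(topic)
    scenes = []
    for index, sentence in enumerate(split_narration(script)):
        audio_path = synthesize(sentence, index)
        image_path = illustrate(sentence, index)
        scenes.append(Scene(sentence, audio_path, image_path))
    return scenes


if __name__ == "__main__":
    # Stub callables so the sketch runs without any external service.
    demo = build_scenes(
        "black holes",
        write_script=lambda t: f"This is a story about {t}. It has two sentences!",
        synthesize=lambda s, i: f"audio_{i}.mp3",
        illustrate=lambda s, i: f"image_{i}.png",
    )
    for scene in demo:
        print(scene)
```

In a real run the three callables would wrap whichever LLM, TTS, and text-to-image services are configured, and the resulting scenes would then be handed to a video library for assembly.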

Quick Start & Requirements

  • Run via python main.py <parameter> (e.g., python main.py 4).
  • Access the Gradio interface at http://127.0.0.1:8000/.
  • Dependencies include Python and the pyttsx3, pillow, and moviepy packages, plus re from the Python standard library. The LLM, TTS, and text-to-image services may require API keys or additional configuration (e.g., Baidu Qianfan).
  • A demo is available at http://avg.kddbot.com/.
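Before launching main.py it can help to confirm that the listed third-party packages are importable. The check below is a small sketch, not part of the project: `REQUIRED` and `missing_packages` are hypothetical names, and the package list is taken only from the dependencies named above (the source does not give an install command or a requirements file).

```python
import importlib.util

# pip package name -> import name. Pillow is installed as "pillow" but imported as "PIL".
REQUIRED = {"pyttsx3": "pyttsx3", "pillow": "PIL", "moviepy": "moviepy"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```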

Highlighted Details

  • Supports a "free" generation mode using free resources.
  • Offers a "review and generate" mode for editing text, audio, and image assets before final video creation.
  • Integrates with Baidu Qianfan for LLM (ERNIE series) and image generation.
  • Utilizes edge-tts as a free alternative for TTS.
  • Saves generated multimedia materials in a structured directory.
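The source does not describe the directory structure itself, so the following is only one plausible layout, written as a sketch: one folder per run, with a subfolder per asset kind and zero-padded segment indices. The folder names and the helpers `prepare_run_dir` and `asset_path` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Union

# Hypothetical asset kinds; the project's real folder names may differ.
ASSET_KINDS = ("text", "audio", "image", "video")


def prepare_run_dir(root: Union[str, Path], run_id: str) -> Path:
    """Create <root>/<run_id>/<kind>/ for every asset kind and return the run folder."""
    run_dir = Path(root) / run_id
    for kind in ASSET_KINDS:
        (run_dir / kind).mkdir(parents=True, exist_ok=True)
    return run_dir


def asset_path(run_dir: Union[str, Path], kind: str, index: int, suffix: str) -> Path:
    """Build a path such as <run_dir>/audio/003.mp3 for one segment's asset."""
    if kind not in ASSET_KINDS:
        raise ValueError(f"unknown asset kind: {kind}")
    return Path(run_dir) / kind / f"{index:03d}{suffix}"


if __name__ == "__main__":
    # Use a temporary folder so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = prepare_run_dir(tmp, "black-holes")
        print(asset_path(run_dir, "audio", 0, ".mp3"))
        print(asset_path(run_dir, "image", 0, ".png"))
```

Keeping text, audio, and image files under matching indices makes it straightforward to replace a single segment's asset during a review step without touching the others.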

Maintenance & Community

The project has 751 stars on GitHub, 70 of them gained in the last 90 days, which indicates ongoing community interest. It links to the Baidu Cloud documentation for the Baidu models it uses. The demo can also be tried through the WeChat public account "趣聊机器人".

Licensing & Compatibility

The README does not explicitly state a license. The project utilizes various components, some of which may have their own licenses (e.g., Baidu Qianfan services, edge-tts). Compatibility for commercial use would depend on the licenses of the underlying AI services and libraries used.

Limitations & Caveats

Video generation can be slow, so users may need to wait or use "load parameters" and "load resources" to access pre-generated content. Issues with text-based images might cause backend generation errors. The "review and generate" workflow requires completing a full generation pass first, including "create record," before individual assets can be edited. Some features, such as batch downloading and subtitle integration, are still on the to-do list.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 70 stars in the last 90 days
