Open Chat Video Editor is an open-source tool for automatically generating and editing short videos from text or web content. It targets users who want to create short video content quickly, and it uses large language models for text generation and diffusion models for visual generation. The primary benefit is the automation of video creation, including voiceovers, background music, and subtitles, from simple inputs.
How It Works
The tool integrates multiple AI models to achieve its functionality. For text generation, it supports models like ChatGPT to create video scripts from short text prompts or web page content. For visual generation, it offers several modes: image retrieval, image generation using Stable Diffusion, a combination of retrieval and generation, and video retrieval. This multi-modal approach allows for diverse video outputs, from simple slideshows to AI-generated scenes.
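The four visual modes can be pictured as a per-sentence dispatch over the generated script. The following is a minimal sketch, not the project's actual code: the `VisualMode` enum, the `pick_visuals` function, and the stub backends are all hypothetical names, and the way the combined mode feeds a retrieved image into the generator is an assumption about how retrieval and generation might be chained.

```python
from enum import Enum
from typing import Callable, List, Optional


class VisualMode(Enum):
    """The four visual modes described above (identifiers are hypothetical)."""
    IMAGE_RETRIEVAL = "image_retrieval"
    IMAGE_GENERATION = "image_generation"
    RETRIEVAL_THEN_GENERATION = "retrieval_then_generation"
    VIDEO_RETRIEVAL = "video_retrieval"


def pick_visuals(
    sentences: List[str],
    mode: VisualMode,
    retrieve_image: Callable[[str], str],
    generate_image: Callable[[str, Optional[str]], str],
    retrieve_video: Callable[[str], str],
) -> List[str]:
    """Return one visual asset (here just a label) per script sentence."""
    visuals = []
    for sentence in sentences:
        if mode is VisualMode.IMAGE_RETRIEVAL:
            visuals.append(retrieve_image(sentence))
        elif mode is VisualMode.IMAGE_GENERATION:
            visuals.append(generate_image(sentence, None))
        elif mode is VisualMode.RETRIEVAL_THEN_GENERATION:
            # Assumption: the retrieved image is passed to the generator as a reference.
            reference = retrieve_image(sentence)
            visuals.append(generate_image(sentence, reference))
        else:
            visuals.append(retrieve_video(sentence))
    return visuals


# Stub backends standing in for an image index, Stable Diffusion, and a video index.
def fake_retrieve_image(text: str) -> str:
    return f"retrieved-image({text})"


def fake_generate_image(text: str, reference: Optional[str]) -> str:
    return f"generated-image({text}, ref={reference})"


def fake_retrieve_video(text: str) -> str:
    return f"retrieved-video({text})"
```

Passing the backends in as callables keeps the dispatch independent of any particular model or index, so a real retrieval or diffusion backend could replace the stubs without touching the mode logic.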
Quick Start & Requirements
- Installation: Supports Docker, Linux (CentOS tested), and Windows.
- Prerequisites:
  - Docker: Requires ~24GB storage; GPU support may vary with CUDA versions.
  - Linux: Python 3.8+, GCC 8.5.0+, ImageMagick, specific development libraries.
  - Windows: Python 3.8.16, PyTorch (GPU/CPU), CLIP, FAISS (CPU).
  - ChatGPT API key and Organization ID are required for text generation.
- Data: Requires downloading data indexes and meta information to data/index.
- Configuration: Select and modify YAML configuration files based on the desired input and generation modes.
- Execution: Run via app/app.py with specified functions (Text2VideoEditor, URL2VideoEditor) and configuration files.
- Links: GitHub Repository
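Before a first run, the prerequisites listed above can be checked with a short preflight script. This is a minimal sketch under stated assumptions: the `preflight` function is hypothetical, and the environment variable names `OPENAI_API_KEY` and `OPENAI_ORGANIZATION` are an assumed way of supplying the API key and Organization ID (the project may instead expect them in its YAML configuration files). Only the paths data/index and app/app.py and the Python 3.8 minimum come from the requirements above.

```python
import sys
from pathlib import Path
from typing import List, Mapping, Tuple


def preflight(
    repo_root: Path,
    env: Mapping[str, str],
    python_version: Tuple[int, int] = sys.version_info[:2],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if python_version < (3, 8):
        problems.append("Python 3.8 or newer is required")
    if not (repo_root / "data" / "index").is_dir():
        problems.append("data/index is missing; download the data indexes first")
    if not (repo_root / "app" / "app.py").is_file():
        problems.append("app/app.py not found; is repo_root the project checkout?")
    # Assumption: credentials are supplied through these environment variables.
    for name in ("OPENAI_API_KEY", "OPENAI_ORGANIZATION"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems
```

A typical call would be `preflight(Path.cwd(), os.environ)` from the root of the checkout (with `import os` added), printing each returned problem before attempting to launch the app.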
Highlighted Details
- Supports Text-to-Video and URL-to-Video generation.
- Integrates ChatGPT for text generation and Stable Diffusion for image generation.
- Offers multiple visual generation modes including image/video retrieval and generation.
- Long Video to Short Video conversion is planned but not yet supported.
Maintenance & Community
- Multiple active Discord/WeChat groups are available for community interaction and support.
Licensing & Compatibility
- The project is licensed for non-commercial, educational, and research purposes only. Its license explicitly prohibits commercial use and any activity that may harm society.
Limitations & Caveats
- Chinese subtitles are not displayed out of the box; showing them requires manually configuring the font settings.
- Docker GPU support is not guaranteed due to potential CUDA version mismatches.
- The project relies on external datasets (LAION-5B, webvid-10m) for which it does not hold copyright.