Automated AI novel to video workflow
Top 65.2% on SourcePulse
This project provides an automated workflow for generating AI-powered novel promotional videos, transforming raw novel content into engaging video summaries. It targets content creators and enthusiasts looking to streamline the production of promotional material for novels, leveraging multiple AI models for various stages of the pipeline.
How It Works
The workflow orchestrates a series of Python scripts, each responsible for a specific task: fetching novel content, generating scene storyboards using Gemini, refining prompts with DeepSeek, creating images with Stable Diffusion (via aaaki forge), synthesizing audio with CosyVoice2, generating subtitles with Whisper, and finally assembling video clips using FFmpeg with GPU acceleration. This modular approach allows for flexibility and the integration of different AI models at each step.
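The stage-by-stage flow described above can be pictured as a list of functions applied in order. The following is only a minimal sketch under stated assumptions: the README does not show the project's actual code, so `PipelineState`, `make_stage`, `run_pipeline`, and every stage name are hypothetical, and each stage is a placeholder that records what the real script would do instead of calling any model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Data handed from one stage to the next (hypothetical structure)."""
    novel_text: str = ""
    artifacts: dict[str, str] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


# A stage takes the current state and returns the updated state.
Stage = Callable[[PipelineState], PipelineState]


def make_stage(name: str, tool: str) -> Stage:
    """Build a placeholder stage; a real stage would call the named tool."""
    def stage(state: PipelineState) -> PipelineState:
        state.artifacts[name] = f"<{name} output from {tool}>"
        state.log.append(f"{name} ({tool})")
        return state
    return stage


# Stage order follows the description above; names are illustrative only.
STAGES: list[Stage] = [
    make_stage("fetch_novel", "novel source"),
    make_stage("storyboard", "Gemini"),
    make_stage("refine_prompts", "DeepSeek"),
    make_stage("generate_images", "Stable Diffusion"),
    make_stage("synthesize_audio", "CosyVoice2"),
    make_stage("generate_subtitles", "Whisper"),
    make_stage("assemble_video", "FFmpeg"),
]


def run_pipeline(novel_text: str, stages: list[Stage] = STAGES) -> PipelineState:
    """Run every stage in order, threading the state through."""
    state = PipelineState(novel_text=novel_text)
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    final = run_pipeline("Chapter 1 ...")
    for step in final.log:
        print(step)
```

Because each stage shares one signature, swapping a model means replacing a single entry in the list, which is the flexibility the modular design is meant to provide.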
Quick Start & Requirements
Uses uv for dependency management. Install it with pip install uv, create a virtual environment with uv venv --python 3.12, activate it, and install the requirements with uv add -r requirements.txt.
A .env file is required for configuration.
Highlighted Details
main.py orchestrates the entire process.
Maintenance & Community
No specific information on contributors, sponsorships, or community channels (like Discord/Slack) is provided in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial use or closed-source linking is not mentioned.
Limitations & Caveats
The project requires specific API keys and may need manual adjustments for high-concurrency Gemini usage. The README does not detail compatibility with different operating systems beyond the implied Linux/macOS shell commands and the Windows .venv\Scripts\activate command. The Whisper model size selection impacts VRAM requirements, with larger models needing up to 10GB.
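To make the Whisper trade-off concrete, here is a small sketch that picks the largest model fitting a given VRAM budget. The per-model figures are the approximate values listed in the openai/whisper README, not numbers taken from this project, which only confirms the roughly 10GB upper bound. The helper name `pick_whisper_model` is hypothetical.

```python
# Approximate VRAM needs in GB, as listed in the openai/whisper README.
# Ordered from smallest to largest model; treat the values as rough guides.
WHISPER_VRAM_GB = {
    "tiny": 1,
    "base": 1,
    "small": 2,
    "medium": 5,
    "large": 10,
}


def pick_whisper_model(available_vram_gb: float) -> str:
    """Return the largest Whisper model whose approximate need fits the budget."""
    fitting = [
        name
        for name, needed in WHISPER_VRAM_GB.items()
        if needed <= available_vram_gb
    ]
    if not fitting:
        raise ValueError(
            f"No Whisper model fits in {available_vram_gb} GB of VRAM"
        )
    # Dicts keep insertion order, so the last fitting entry is the largest.
    return fitting[-1]


if __name__ == "__main__":
    print(pick_whisper_model(6))
```

On a card shared with Stable Diffusion or CosyVoice2, the usable budget is smaller than the card's total memory, so passing a conservative number is the safer choice.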