ai-fusion-video by Stonewuu

Full-process AI video creation platform

Created 1 month ago
489 stars

Top 62.9% on SourcePulse

Project Summary

Summary

Rongguang is an agent-based, full-process AI video creation platform for content creators. It automates video production from script to final clips: it breaks a script down into storyboards, generates visuals with multiple AI models, and produces video segments, with the aim of making production more efficient and more automated.

How It Works

The platform is built around an agent pipeline. AI agents parse the user's script into a visual storyboard with a scene description for each panel. Integrated AI image generation engines (OpenAI, Gemini, Claude, etc.) create reference visuals per panel, and AI video generation models then produce clips from the storyboard and those images. The system also manages assets and orchestrates the underlying AI services.

Backend: Java 21, Spring Boot 3.5, Spring AI. Frontend: Next.js 16, React 19, TypeScript.
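The staged flow described above can be pictured as a chain of functions, each consuming the output of the previous stage. The sketch below is illustrative only and is written in Python for brevity, not in the project's Java backend. Every name in it (`Panel`, `parse_script`, `generate_image`, `generate_clip`, `run_pipeline`) is hypothetical, and the model calls are replaced by stubs that return placeholder paths.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Panel:
    """One storyboard panel (hypothetical structure, not the project's schema)."""
    index: int
    description: str
    image_ref: Optional[str] = None
    clip_ref: Optional[str] = None


def parse_script(script: str) -> List[Panel]:
    """Stand-in for the script-parsing agent: one panel per non-empty line."""
    lines = [line.strip() for line in script.splitlines() if line.strip()]
    return [Panel(index=i, description=text) for i, text in enumerate(lines, start=1)]


def generate_image(panel: Panel) -> Panel:
    """Stub for an image-model call; a real call would return an asset location."""
    panel.image_ref = f"images/panel-{panel.index}.png"
    return panel


def generate_clip(panel: Panel) -> Panel:
    """Stub for a video-model call; it depends on the reference image."""
    if panel.image_ref is None:
        raise ValueError("reference image must be generated before the clip")
    panel.clip_ref = f"clips/panel-{panel.index}.mp4"
    return panel


def run_pipeline(script: str) -> List[Panel]:
    """Script -> storyboard -> reference images -> video clips."""
    return [generate_clip(generate_image(p)) for p in parse_script(script)]


if __name__ == "__main__":
    for panel in run_pipeline("A lighthouse at dawn.\nWaves hit the rocks."):
        print(panel.index, panel.description, panel.image_ref, panel.clip_ref)
```

The ordering constraint is the point of the sketch: a clip cannot be produced for a panel until its reference image exists, which is why `generate_clip` raises if the image stage was skipped.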

Quick Start & Requirements

  • Docker: Clone the repo, optionally configure .env, then run docker compose up -d. Access the app at http://localhost:8080.
  • Source: Requires JDK 21+, Node.js 20+, pnpm 9+, and Docker. Start the middleware (docker compose -f docker-compose-middleware.yml up -d), then the backend (./mvnw spring-boot:run), then the frontend (cd ai-fusion-video-web && pnpm install && pnpm dev). The frontend is served at http://localhost:3000 and the backend API at http://localhost:18080.
  • Prerequisites: Java 21+, Node.js 20+, pnpm 9+, Docker, MySQL, Redis. AI model API keys or a local Ollama setup may be needed.
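After either start-up path, it can be handy to confirm that the listed ports are actually accepting connections. The following is a minimal sketch using only the Python standard library. The URLs are the ones given in the quick-start notes above, while the `ENDPOINTS` labels and the `is_listening` helper are hypothetical names chosen for this example. It checks only that a TCP connection succeeds, not that the application behind the port is healthy.

```python
import socket
from urllib.parse import urlsplit

# URLs as listed in the quick-start notes; the labels are illustrative.
ENDPOINTS = {
    "docker deployment": "http://localhost:8080",
    "frontend (source)": "http://localhost:3000",
    "backend API (source)": "http://localhost:18080",
}


def is_listening(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        status = "up" if is_listening(url) else "not reachable"
        print(f"{name}: {url} -> {status}")
```

Only the endpoints for the path actually used are expected to respond: port 8080 for the Docker deployment, ports 3000 and 18080 for a source build.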

Highlighted Details

  • Broad AI Model Support: Integrates OpenAI, Anthropic, Google, Tongyi Qianwen, DeepSeek, and local Ollama.
  • End-to-End AI Video Pipeline: Automates script-to-storyboard-to-image-to-video generation.
  • Flexible Storage: Supports local and S3-compatible object storage (OSS, COS, MinIO).
  • Agent Pipeline Visualization: Visual interface for AI workflow orchestration.

Maintenance & Community

Maintained by Stonewuu. No dedicated community channels or detailed roadmap are listed beyond the project's TODO items.

Licensing & Compatibility

MIT License, which permits commercial use and modification provided the copyright and license notice are retained.

Limitations & Caveats

Team management and multi-user collaboration are not yet available (planned). Advanced features such as a "Global intelligent Agent" are still pending. The current focus is on individual creator workflows.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 7
  • Issues (30d): 5

Star History

492 stars in the last 30 days
