lumenx by alibaba

AI platform for generating animated comic dramas from text

Created 2 months ago
319 stars

Top 85.2% on SourcePulse

Project Summary

Summary

LumenX Studio is an AI-native platform that turns novel text into short animated comic dramas (漫剧). It targets creators who want to streamline the pipeline of script analysis, asset generation, storyboarding, and video synthesis, and it consolidates those steps into one integrated AI-powered workflow.

How It Works

LumenX implements a full Standard Operating Procedure (SOP) for comic drama creation, from asset extraction and style definition through final video synthesis. It uses Alibaba's Qwen LLM for script analysis and prompt refinement, and Wanx multimodal models for image and video generation. Character asset generation keeps visuals consistent across shots, and AI-assisted prompt polishing gives creators a controllable, efficient production workflow.
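The stage ordering described above can be pictured as a chain of functions that each enrich a shared project object. The following is only an illustrative sketch: the Project class, the stage functions, and their toy logic are hypothetical stand-ins, not LumenX code, and no Qwen or Wanx API is called.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical container for the artifacts each stage produces."""
    novel_text: str
    entities: list[str] = field(default_factory=list)
    style_prompt: str = ""
    storyboard: list[str] = field(default_factory=list)
    clips: list[str] = field(default_factory=list)


def extract_assets(p: Project) -> Project:
    # Stand-in for LLM script analysis: treat capitalized words as entities.
    words = {w.strip(".,!?") for w in p.novel_text.split()}
    p.entities = sorted(w for w in words if w.istitle())
    return p


def define_style(p: Project) -> Project:
    # Stand-in for style definition: one style string shared by all shots.
    p.style_prompt = "flat-color anime style"
    return p


def build_storyboard(p: Project) -> Project:
    # Stand-in for storyboarding: one shot per sentence.
    p.storyboard = [s.strip() for s in p.novel_text.split(".") if s.strip()]
    return p


def synthesize_video(p: Project) -> Project:
    # Stand-in for video generation: only names the clips it would render.
    p.clips = [f"clip_{i:02d}.mp4" for i, _ in enumerate(p.storyboard, start=1)]
    return p


# Order mirrors the SOP: assets, style, storyboard, then video synthesis.
STAGES = [extract_assets, define_style, build_storyboard, synthesize_video]


def run_pipeline(p: Project) -> Project:
    for stage in STAGES:
        p = stage(p)
    return p
```

Calling run_pipeline(Project("Mira met Jun at the harbor. Mira waved.")) fills in two entities, two storyboard shots, and two clip names. In a real system each stub would be replaced by a model call.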

Quick Start & Requirements

Setup requires cloning the repository and configuring an Alibaba Cloud API key (DASHSCOPE_API_KEY). Prerequisites are Python 3.11+, Node.js 18+, and an FFmpeg installation. Start the backend (FastAPI) with ./start_backend.sh and the frontend (Next.js) with npm run dev from the frontend directory. See the User Manual (USER_MANUAL.md) for details. OSS object storage is recommended for production media.
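Before starting the backend and frontend, it can help to verify the prerequisites listed above. The script below is a minimal sketch of such a check. It is not part of LumenX, the function names are hypothetical, and it assumes node and ffmpeg are discoverable on PATH and that DASHSCOPE_API_KEY is supplied as an environment variable.

```python
import os
import shutil
import subprocess
import sys


def parse_major(version_text: str) -> int:
    """Extract the major version from strings like 'v18.19.0' or '20.1.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def check_environment(env=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Python 3.11+ is required for the backend.
    if sys.version_info < (3, 11):
        problems.append("Python 3.11+ required, found " + sys.version.split()[0])

    # Node.js 18+ is required for the frontend.
    node = shutil.which("node")
    if node is None:
        problems.append("Node.js not found on PATH")
    else:
        try:
            result = subprocess.run(
                [node, "--version"], capture_output=True, text=True, timeout=10
            )
            if parse_major(result.stdout) < 18:
                problems.append("Node.js 18+ required, found " + result.stdout.strip())
        except (ValueError, OSError, subprocess.SubprocessError):
            problems.append("Could not determine the Node.js version")

    # FFmpeg must be installed system-wide.
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg not found on PATH")

    # The Alibaba Cloud API key must be configured.
    if not env.get("DASHSCOPE_API_KEY"):
        problems.append("DASHSCOPE_API_KEY is not set")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("MISSING:", issue)
    sys.exit(1 if issues else 0)
```

Run it from the repository root before ./start_backend.sh. A zero exit code means all four checks passed.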

Highlighted Details

  • AI Script Analysis & Storyboarding: LLM-driven extraction of narrative entities and generation of storyboard scripts.
  • Controllable Art Direction: Custom visual styles defined via prompt engineering ensure stylistic consistency.
  • Visual Storyboarding Editor: Drag-and-drop interface for composing scenes.
  • Multi-Modal Generation: Integrates Tongyi Wanxiang (Wanx) for text-to-image and image-to-video synthesis.
  • Automated Audio-Visual Synthesis: Generates character voiceovers (TTS) and sound effects (SFX), culminating in final video assembly.
  • Character Consistency: Employs techniques like generating base character assets for visual coherence.
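One common way to get the stylistic and character consistency mentioned in the list above is to reuse the same style text and the same character descriptions in every shot prompt. The sketch below illustrates that idea only. CharacterAsset and build_shot_prompt are hypothetical names, and the summary does not say whether LumenX assembles prompts this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterAsset:
    """Hypothetical base asset: a fixed appearance description per character."""
    name: str
    description: str


def build_shot_prompt(style: str, characters: list[CharacterAsset], action: str) -> str:
    # Reusing the identical style and character text in every prompt is what
    # keeps shots visually aligned; only the action changes between shots.
    cast = "; ".join(f"{c.name}: {c.description}" for c in characters)
    return f"{style}. Characters: {cast}. Shot: {action}"


mira = CharacterAsset("Mira", "short silver hair, red scarf")
style = "flat-color anime style"

shot_1 = build_shot_prompt(style, [mira], "Mira waits at the harbor")
shot_2 = build_shot_prompt(style, [mira], "Mira boards the ferry")
```

Both prompts share the prefix "flat-color anime style. Characters: Mira: short silver hair, red scarf." and differ only in the shot description.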

Maintenance & Community

Led by 星莲 (StarLotus), the project encourages community interaction via GitHub Issues (bugs) and Discussions (features). Contribution guidelines are in CONTRIBUTING.md. Contact: zhangjunhe.zjh@alibaba-inc.com.

Licensing & Compatibility

Released under the permissive MIT License, allowing broad usage, modification, and distribution, including integration within commercial and closed-source projects.

Limitations & Caveats

The platform depends on Alibaba Cloud services, so it requires API key configuration and may incur usage costs. It needs Python 3.11+, Node.js 18+, and a system-wide FFmpeg installation, which may complicate setup. The README does not mention an alpha status, known bugs, or unsupported platforms.

Health Check

Last Commit: 5 days ago
Responsiveness: Inactive
Pull Requests (30d): 7
Issues (30d): 6

Star History

189 stars in the last 30 days
