openOii  by Xeron2000

AI agent-based comic drama generation platform

Created 2 months ago
255 stars

Top 98.8% on SourcePulse

Project Summary

This project is an AI-powered platform for generating comic dramas. It turns user-written stories into complete video productions through a multi-agent collaborative workflow. It targets creators and developers who want to automate the path from a creative idea to animated visual content, with the aim of cutting production time and effort.

How It Works

The platform uses a multi-agent system of eight specialized AI Agents, each responsible for a distinct stage of the production pipeline, including scriptwriting, character design, storyboard creation, and video generation. The workflow starts from user input with the Director and Scriptwriter agents, moves to visual elements with the Character Artist and Storyboard Artist, and ends with video output from the Video Generator and Merger agents. A Review Agent handles user feedback for iterative refinement. Because each agent owns one stage, a stage can be regenerated without redoing the whole production.
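
The stage ordering described above can be pictured as a simple sequential pipeline. The Python sketch below is illustrative only: `Project`, `STAGES`, the artifact keys, and `run_pipeline` are hypothetical names rather than anything taken from the openOii codebase, and each stage is a stub that records a placeholder instead of calling a model. Review-driven refinement is modeled as re-running the pipeline from a chosen stage, which is an assumption about how feedback could be applied.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Holds the user's story plus whatever each stage has produced so far."""
    story: str
    artifacts: dict[str, str] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


# Stage order follows the summary; the artifact keys are invented for illustration.
STAGES: list[tuple[str, str]] = [
    ("Director", "plan"),
    ("Scriptwriter", "script"),
    ("Character Artist", "characters"),
    ("Storyboard Artist", "storyboard"),
    ("Video Generator", "clips"),
    ("Merger", "final_video"),
]


def run_pipeline(project: Project, start_at: str = "Director") -> Project:
    """Run every stage from `start_at` onward, overwriting earlier artifacts."""
    names = [name for name, _ in STAGES]
    start = names.index(start_at)  # raises ValueError for an unknown stage
    for name, key in STAGES[start:]:
        # A real agent would call an LLM, image, or video service here.
        project.artifacts[key] = f"{key} from {name}"
        project.log.append(name)
    return project


if __name__ == "__main__":
    project = run_pipeline(Project(story="A cat detective solves a mystery"))
    # Simulated review feedback: redo everything from the storyboard onward.
    run_pipeline(project, start_at="Storyboard Artist")
    print(project.log)
```

Restarting from a named stage keeps the earlier artifacts (plan, script, characters) untouched, which is the practical benefit of splitting the work into separate agents.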

Quick Start & Requirements

  • Docker Deployment (Recommended): Run docker-compose up -d to start the pre-built GitHub images. Requires Docker and Docker Compose. API keys for the LLM (Anthropic), Image (OpenAI compatible, ModelScope), and Video (Doubao, OpenAI compatible) services must be configured first, using backend/.env.example as the template. Access the frontend at http://localhost:15173 and the API docs at http://localhost:18765/docs.
  • Local Development: Requires Python 3.10+, Node.js 18+, PostgreSQL 14+, Redis 6+, and FFmpeg 4.0+. Backend setup involves cloning, installing dependencies (uv sync or pip install -e .), configuring .env with API keys, and running with uvicorn. Frontend setup involves navigating to frontend, installing dependencies (pnpm install), and running the dev server (pnpm dev). Access the application at http://localhost:15173.
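
Before a local setup, it can help to confirm that installed tools meet the minimum versions listed above. The following is a minimal Python sketch of such a check. The `MINIMUMS` table mirrors the requirements in this section, while `parse_version` and `meets_minimum` are hypothetical helpers written for this example and are not part of the project.

```python
import re

# Minimum versions as listed in the requirements above.
MINIMUMS: dict[str, tuple[int, ...]] = {
    "Python": (3, 10),
    "Node.js": (18,),
    "PostgreSQL": (14,),
    "Redis": (6,),
    "FFmpeg": (4, 0),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted number from a string such as 'v18.17.1'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(tool: str, version_text: str) -> bool:
    """Return True if the reported version is at least the listed minimum."""
    found = parse_version(version_text)
    required = MINIMUMS[tool]
    # Pad with zeros so that (4,) and (4, 0) compare as equal.
    width = max(len(found), len(required))
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required


if __name__ == "__main__":
    # Example strings only; real values would come from each tool's --version output.
    for tool, reported in [("Python", "Python 3.11.4"), ("Node.js", "v16.20.0")]:
        status = "ok" if meets_minimum(tool, reported) else "too old"
        print(f"{tool}: {reported} -> {status}")
```

Comparing versions as integer tuples avoids the string-comparison trap where "3.9" sorts after "3.10".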

Highlighted Details

  • Multi-Agent Architecture: Employs 8 specialized AI Agents (e.g., Scriptwriter, Character Artist, Video Generator) for a structured content creation pipeline.
  • Flexible AI Service Integration: Supports various LLM (Claude, Zhipu), Image (OpenAI compatible, ModelScope), and Video (Doubao, OpenAI compatible) providers via configurable API endpoints.
  • Advanced Generation Modes: Offers Text-to-Image/Video and Image-to-Image/Video options, with reference modes for enhanced character consistency.
  • Interactive Workflow: Features a real-time feedback system via WebSocket and an infinite canvas (tldraw) for dynamic project management and content regeneration.
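
The provider options listed above can be thought of as a small lookup table keyed by service type. The Python sketch below is an assumption-laden illustration: the lowercase identifiers (`claude`, `openai-compatible`, and so on), the `SUPPORTED` table, and `validate_selection` are invented for this example and do not reflect the project's actual configuration keys.

```python
# Provider names come from the summary; the identifiers themselves are hypothetical.
SUPPORTED: dict[str, set[str]] = {
    "llm": {"claude", "zhipu"},
    "image": {"openai-compatible", "modelscope"},
    "video": {"doubao", "openai-compatible"},
}


def validate_selection(selection: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the selection is usable."""
    problems: list[str] = []
    for kind, allowed in SUPPORTED.items():
        chosen = selection.get(kind)
        if chosen is None:
            problems.append(f"missing provider for {kind}")
        elif chosen.lower() not in allowed:
            options = ", ".join(sorted(allowed))
            problems.append(f"unsupported {kind} provider {chosen!r} (expected one of: {options})")
    return problems


if __name__ == "__main__":
    for problem in validate_selection({"llm": "Claude", "image": "ModelScope"}):
        print(problem)
```

Collecting every problem before reporting, rather than stopping at the first, suits a setup with several independent services to configure.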

Maintenance & Community

The project welcomes contributions via pull requests and issues. Direct contact is available via Telegram and email. Specific details on active contributors, sponsorships, or community forums like Discord/Slack are not provided in the README.

Licensing & Compatibility

The project is released under the MIT License, which is permissive and generally allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • API Key Management: Requires configuration of multiple third-party API keys (Anthropic, OpenAI, Doubao, etc.), potentially incurring costs and setup complexity.
  • Image Consistency with ModelScope: When using ModelScope for image generation, character consistency may be reduced due to the lack of a dedicated image-to-image model.
  • Active Development: While functional, the project appears to be under active development, with features like the infinite canvas recently added.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 61 stars in the last 30 days
