CVPR 2025 paper implementation for customized manga generation
DiffSensei enables customized black-and-white manga generation by bridging multi-modal large language models (MLLMs) and diffusion models. It lets users generate varied-resolution manga panels with flexible character adaptation from a single input image, targeting researchers and artists interested in controllable AI-powered comic creation.
How It Works
DiffSensei employs a diffusion model architecture enhanced with an IP-Adapter for character consistency and an LLM for text-to-image conditioning. This approach allows for precise control over character appearance across different panels and supports flexible text prompts, enabling the generation of diverse manga scenes with specific character traits.
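The conditioning mechanism described above can be illustrated with a minimal sketch. This is not the authors' code: in IP-Adapter-style conditioning, image-derived "character tokens" are appended to the text tokens so the diffusion U-Net's cross-attention attends to both. All names and shapes below are hypothetical, and projections are omitted for brevity (the context serves as both keys and values).

```python
# Illustrative sketch (not DiffSensei's implementation): single-head
# cross-attention over concatenated text and character tokens.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, d_k):
    """Image latents (queries) attend to the conditioning tokens (context)."""
    scores = queries @ context.T / np.sqrt(d_k)   # (n_q, n_ctx)
    return softmax(scores) @ context              # (n_q, d_k)

rng = np.random.default_rng(0)
d = 64
latents = rng.normal(size=(16, d))       # U-Net spatial tokens (queries)
text_tokens = rng.normal(size=(8, d))    # prompt embeddings from the language model
char_tokens = rng.normal(size=(4, d))    # character features from the input image

# Character adaptation: condition jointly on text AND character tokens.
context = np.concatenate([text_tokens, char_tokens], axis=0)  # (12, d)
out = cross_attention(latents, context, d)
print(out.shape)  # (16, 64)
```

Because the character tokens sit in the same attention context as the text tokens, the model can keep a character's appearance consistent across panels while still following a per-panel prompt.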
Quick Start & Requirements
Requires Python with diffusers, transformers, accelerate, and xformers installed. A local demo can be launched via the provided Gradio app (gradio, or gradio_wo_mllm to run without the MLLM component).
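Before launching the demo, it can help to verify that the core dependencies listed above are importable. The package names below are taken from the requirements list; consult the repository's own requirements file for the authoritative set.

```python
# Hedged sketch: check that the core dependencies are importable before
# launching the Gradio demo. Package names are assumptions based on the
# requirements listed above.
import importlib.util

REQUIRED = ["diffusers", "transformers", "accelerate", "xformers"]

def missing_packages(names):
    """Return the subset of `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Install before running the demo: pip install " + " ".join(missing))
    else:
        print("All core dependencies found.")
```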
Maintenance & Community
The project is associated with CVPR 2025 and has released checkpoints, datasets, and inference code. Further community engagement details are not explicitly provided in the README.
Licensing & Compatibility
The project's license is not explicitly stated in the README. The MangaZero dataset is provided via URLs and annotations due to potential licensing issues with direct image sharing.
Limitations & Caveats
The MangaZero dataset is a partial release (about three-quarters of the full dataset) because some image URLs are no longer available. The provided reference training code is still in a testing phase and may require adjustments for specific datasets and requirements.