mmagic by open-mmlab

AIGC toolbox for image/video editing and generation

Created 6 years ago
7,358 stars

Top 6.9% on SourcePulse

Project Summary

MMagic is a toolkit for AI-generated content (AIGC) and multimodal creation. It bundles state-of-the-art generative models for image and video synthesis, editing, and restoration. It targets researchers and AIGC enthusiasts who want a flexible platform for tasks such as text-to-image generation, image/video enhancement, and 3D-aware generation.

How It Works

MMagic is built upon the OpenMMLab 2.0 framework, leveraging MMEngine and MMCV for a modular and efficient design. It supports a wide array of generative models, including diffusion models (Stable Diffusion, ControlNet, DreamBooth) and GANs (StyleGAN, BigGAN), enabling flexible experimentation and customization through a Lego-like component-based approach. This architecture facilitates easy integration of new algorithms and supports distributed training for dynamic architectures.
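The component-based idea above can be illustrated with a minimal sketch. It assumes only the general registry pattern, where a config dict names a component through a `type` key and the remaining keys become constructor arguments. The `Registry` class, `MODELS` instance, `ToyDenoiser`, and `ToyDiffusion` below are hypothetical stand-ins written in plain Python. They are not MMEngine's or MMagic's actual classes, and the real API differs in detail.

```python
from typing import Any, Callable, Dict


class Registry:
    """Tiny stand-in for a component registry (not MMEngine's real class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, Callable[..., Any]] = {}

    def register_module(self, cls: type) -> type:
        """Class decorator that records the class under its own name."""
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        """Instantiate the class named by cfg['type'] with the remaining keys."""
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop("type")]
        # Nested dicts that carry a 'type' key are built recursively.
        kwargs = {
            key: self.build(value)
            if isinstance(value, dict) and "type" in value
            else value
            for key, value in args.items()
        }
        return cls(**kwargs)


MODELS = Registry("models")  # hypothetical registry instance


@MODELS.register_module
class ToyDenoiser:
    """Hypothetical sub-component; holds a single hyperparameter."""

    def __init__(self, channels: int) -> None:
        self.channels = channels


@MODELS.register_module
class ToyDiffusion:
    """Hypothetical top-level model assembled from a sub-component."""

    def __init__(self, denoiser: ToyDenoiser, steps: int = 50) -> None:
        self.denoiser = denoiser
        self.steps = steps


# Swapping a component means editing the config, not the model code.
config = dict(
    type="ToyDiffusion",
    steps=20,
    denoiser=dict(type="ToyDenoiser", channels=64),
)
model = MODELS.build(config)
print(type(model).__name__, model.steps, model.denoiser.channels)
```

In this sketch a different denoiser could be plugged in by registering another class and changing the `type` string in the config, which is the kind of mix-and-match the "Lego-like" description refers to.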

Highlighted Details

  • Supports 11 new models across 4 new tasks, including Text2Image (ControlNet, DreamBooth, Stable Diffusion), 3D-aware Generation (EG3D), Image Restoration (NAFNet, Restormer), and Image Colorization.
  • Offers "magic" diffusion features like Stable Diffusion/Disco Diffusion support, finetuning (DreamBooth, LoRA), ControlNet integration, xFormers acceleration, and MultiFrame Render for video generation.
  • Upgraded framework with MMEngine/MMCV 2.0, supporting unified data formats, flexible evaluation loops for various metrics, and visualization via TensorBoard/WandB.
  • Extensive Model Zoo covering GANs, Image Restoration, Super-Resolution, Video tasks, Inpainting, Matting, Text-to-Image, and 3D-aware generation.

Maintenance & Community

  • Actively maintained with recent releases (v1.2.0 in Dec 2023) and community contributions.
  • Part of the larger OpenMMLab ecosystem.
  • Issue reporting and ongoing projects are tracked on GitHub.

Licensing & Compatibility

  • Released under the Apache 2.0 license.
  • Permissive for commercial use, but users are advised to review LICENSES for specifics.

Limitations & Caveats

The project requires specific versions of PyTorch and Python, and installation involves multiple steps using MIM. While comprehensive, the vast number of models and features may present a learning curve for new users.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 21 stars in the last 30 days
