mmagic by open-mmlab

AIGC toolbox for image/video editing and generation

Created 6 years ago
7,411 stars

Top 7.0% on SourcePulse

Project Summary

MMagic is an advanced AIGC toolkit for multimodal creation, offering a comprehensive suite of state-of-the-art generative models for image and video synthesis, editing, and restoration. It targets researchers and AIGC enthusiasts seeking a flexible and powerful platform for tasks like text-to-image generation, image/video enhancement, and 3D-aware generation.

How It Works

MMagic is built upon the OpenMMLab 2.0 framework, leveraging MMEngine and MMCV for a modular and efficient design. It supports a wide array of generative models, including diffusion models (Stable Diffusion, ControlNet, DreamBooth) and GANs (StyleGAN, BigGAN), enabling flexible experimentation and customization through a Lego-like component-based approach. This architecture facilitates easy integration of new algorithms and supports distributed training for dynamic architectures.
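
The component-based approach described above can be illustrated with a small, self-contained sketch of a registry that builds objects from config dictionaries. This is only an illustration of the general pattern, not MMagic's or MMEngine's actual code: the `Registry` class, `MODELS`, `ToyUpsampler`, and `ToyDenoiser` are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Type


class Registry:
    """Hypothetical minimal registry; a stand-in, not MMEngine's own class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, Type] = {}

    def register_module(self) -> Callable[[Type], Type]:
        """Return a decorator that records a class under its own name."""
        def decorator(cls: Type) -> Type:
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg: dict):
        """Instantiate the class named by cfg['type'] with the remaining keys."""
        cfg = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[cfg.pop("type")]
        return cls(**cfg)


MODELS = Registry("models")


@MODELS.register_module()
class ToyUpsampler:
    """Toy component: scales an image size by an integer factor."""

    def __init__(self, scale: int = 2) -> None:
        self.scale = scale

    def __call__(self, width: int, height: int):
        return width * self.scale, height * self.scale


@MODELS.register_module()
class ToyDenoiser:
    """Toy component: reduces a noise level by a fixed strength."""

    def __init__(self, strength: float = 0.5) -> None:
        self.strength = strength

    def __call__(self, noise_level: float) -> float:
        return noise_level * (1.0 - self.strength)


# Swapping components only requires changing the config, not the calling code.
config = {"type": "ToyUpsampler", "scale": 4}
model = MODELS.build(config)
print(model(64, 48))
```

Under this pattern, a new algorithm is added by registering one more class, and an experiment selects it by name in its config.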

Quick Start & Requirements

Highlighted Details

  • Supports 11 new models across 4 new tasks, including Text2Image (ControlNet, DreamBooth, Stable Diffusion), 3D-aware Generation (EG3D), Image Restoration (NAFNet, Restormer), and Image Colorization.
  • Offers "magic" diffusion features like Stable Diffusion/Disco Diffusion support, finetuning (DreamBooth, LoRA), ControlNet integration, xFormers acceleration, and MultiFrame Render for video generation.
  • Upgraded framework with MMEngine/MMCV 2.0, supporting unified data formats, flexible evaluation loops for various metrics, and visualization via TensorBoard/WandB.
  • Extensive Model Zoo covering GANs, Image Restoration, Super-Resolution, Video tasks, Inpainting, Matting, Text-to-Image, and 3D-aware generation.

Maintenance & Community

  • Most recent release listed: v1.2.0 (Dec 2023), with community contributions.
  • Part of the larger OpenMMLab ecosystem.
  • Issue reporting and ongoing projects are tracked on GitHub.

Licensing & Compatibility

  • Released under the Apache 2.0 license.
  • The license permits commercial use; review the repository's LICENSES file for specifics.

Limitations & Caveats

The project requires specific versions of PyTorch and Python, and installation involves multiple steps using MIM. The toolkit is comprehensive, but its large number of models and features may present a learning curve for new users.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 24 stars in the last 30 days
