Generative-AI by fnzhan

Survey paper for multimodal image synthesis/editing & visual AIGC

created 3 years ago
758 stars

Top 46.8% on sourcepulse

Project Summary

This repository is a survey and curated collection of research papers, code, and projects on Multimodal Image Synthesis and Editing (MISE) in the generative AI era. It gives researchers and practitioners a structured overview of the field, organized by data modality and model architecture.

How It Works

The project classifies MISE papers and their associated resources along two axes: data modality (e.g., text, audio) and model architecture (e.g., diffusion-based, GAN-based, autoregressive). This structure lets readers find relevant work, trace how techniques have evolved, and identify key contributions in areas such as text-to-image synthesis, image editing, and 3D generation.
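The two-axis classification described above can be pictured as a small lookup structure. The following is a minimal sketch only: the repository itself is a list of links and ships no such code, and the `Entry` class, the `build_index` function, and the placeholder titles are all hypothetical names introduced here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One listed resource. All field values below are illustrative placeholders."""
    title: str         # placeholder title, not a real paper
    modality: str      # guidance modality, e.g. "text" or "audio"
    architecture: str  # model family, e.g. "diffusion", "gan", "autoregressive"


def build_index(entries):
    """Group entry titles by their (modality, architecture) pair."""
    index = defaultdict(list)
    for entry in entries:
        index[(entry.modality, entry.architecture)].append(entry.title)
    return dict(index)


# Hypothetical entries; they do not correspond to papers in the repository.
entries = [
    Entry("Placeholder paper A", "text", "diffusion"),
    Entry("Placeholder paper B", "text", "gan"),
    Entry("Placeholder paper C", "audio", "gan"),
    Entry("Placeholder paper D", "text", "diffusion"),
]

index = build_index(entries)
print(index[("text", "diffusion")])
```

Keying on the pair keeps both axes queryable at once, which matches how a reader might look for, say, text-guided diffusion work without scanning the whole list.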

Quick Start & Requirements

This repository is primarily a curated list of resources, not a runnable software package. No installation or execution commands are provided. The "quick start" involves browsing the categorized links to papers and projects.

Highlighted Details

  • Extensive categorization of over 150 research papers and projects spanning various MISE techniques.
  • Detailed sections on Neural Rendering, Diffusion-based, Autoregressive, GAN-based, and GAN-Inversion methods.
  • Includes links to datasets, text/audio encoding models, and related survey papers.
  • Accepts pull requests that add new research in the field.

Maintenance & Community

The project is maintained by fnzhan and is open for community contributions through pull requests. The primary paper associated with the survey is "Multimodal Image Synthesis and Editing: The Generative AI Era" by Zhan et al. (TPAMI 2023).

Licensing & Compatibility

The repository itself does not specify a license. Linked papers and projects carry their own licenses, which vary. Users should consult each project's license for compatibility and usage restrictions.

Limitations & Caveats

As a curated list, this repository provides neither executable code nor direct access to the models. Users must follow the external links to find, and potentially run, the associated research projects. Because it lists so many papers, it offers a broad overview rather than a deep dive into any single technique.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days
