World-Simulator by ALEEEHU

Multimodal generative model resources

Created 2 years ago

379 stars

Top 74.8% on SourcePulse

Project Summary

This repository pairs a survey paper with a curated collection of resources on multimodal generative models that simulate the real world across 2D, video, 3D, and 4D representations. It aims to unify the study of these modalities and to track advances in Text-to-X (Text2X) generation, and it is intended for researchers and practitioners working on AGI and generative AI.

How It Works

The repository is structured into two main parts: a survey paper that systematically unifies the study of 2D, video, 3D, and 4D generation within a single framework, and an extensive collection of "Awesome Text2X Resources." This collection curates papers, code, and datasets for state-of-the-art Text-to-X methods, covering a wide range of generative tasks.

Quick Start & Requirements

This repository is a curated collection of research papers and resources, not a runnable software package, so there is nothing to install or execute.

Highlighted Details

Unified Framework: The survey paper treats 2D, video, 3D, and 4D generation within a single framework, bridging research that is otherwise conducted separately for each modality.
Extensive Resource Curation: The "Awesome Text2X Resources" section is a continuously updated, community-driven collection of papers, code, and datasets across numerous generative tasks.
Broad Coverage: The repository covers a wide spectrum of generative AI, from foundational 2D image generation to complex 4D scene simulation and human motion synthesis.
Future Directions: The survey paper includes insights and future directions to guide ongoing research in multimodal generative models.

Maintenance & Community

The repository is actively maintained, with frequent updates to include the latest research papers and accepted conference works. Community contributions via pull requests or issues are encouraged.

Licensing & Compatibility

The repository is released under the MIT license, allowing for broad use and adaptation of the curated information.

Limitations & Caveats

As a curated list of research, this repository does not provide executable code or pre-trained models for direct use. Users must refer to individual linked projects for implementation details and requirements.

Explore Similar Projects

Awesome-3D-AIGC by mdyao

Autoregressive-Models-in-Vision-Survey by ChaofanTao

awesome-conditional-content-generation by haofanwang

Awesome-AIGC-3D by hitcslj

Awesome-Text-to-3D by yyeboah

awesome-video-generation by AlonzoLeeeooo

Generative-AI by fnzhan

T2M-GPT by Mael-zys

Awesome-Text-to-Image by Yutong-Zhou-cv

Bagel by ByteDance-Seed

threestudio by threestudio-project

mmagic by open-mmlab