World-Simulator by ALEEEHU

Multimodal generative model resources

Created 1 year ago
294 stars

Top 89.9% on SourcePulse

Project Summary

This repository provides a comprehensive survey and collection of resources for multimodal generative models, focusing on simulating the real world across 2D, video, 3D, and 4D representations. It aims to unify the study of these modalities and track advancements in Text-to-X (Text2X) generation, serving researchers and practitioners in AGI and generative AI.

How It Works

The repository is structured into two main parts: a survey paper that systematically unifies the study of 2D, video, 3D, and 4D generation within a single framework, and an extensive collection of "Awesome Text2X Resources." This collection curates papers, code, and datasets for state-of-the-art Text-to-X methods, covering a wide range of generative tasks.

Quick Start & Requirements

This repository is a curated collection of research papers and resources, not a runnable software package. Installation or execution instructions are not applicable.

Highlighted Details

  • Unified Framework: The survey paper systematically unifies the study of 2D, video, 3D, and 4D generation within a single framework, connecting research that has largely treated each modality in isolation.
  • Extensive Resource Curation: The "Awesome Text2X Resources" section is a continuously updated, community-driven collection of papers, code, and datasets across numerous generative tasks.
  • Broad Coverage: The repository covers a wide spectrum of generative AI, from foundational 2D image generation to complex 4D scene simulation and human motion synthesis.
  • Future Directions: The survey paper includes insights and future directions to guide ongoing research in multimodal generative models.

Maintenance & Community

The repository is actively maintained and frequently updated with the latest research papers, including those newly accepted at conferences. Community contributions via pull requests or issues are encouraged.

Licensing & Compatibility

The repository is released under the MIT license, allowing for broad use and adaptation of the curated information.

Limitations & Caveats

As a curated list of research, this repository does not provide executable code or pre-trained models for direct use. Users must refer to individual linked projects for implementation details and requirements.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days
