Awesome-Unified-Multimodal-Models by showlab

Paper list for unified multimodal models

created 11 months ago
643 stars

Top 52.7% on sourcepulse

Project Summary

This repository curates papers, code, and resources on unified multimodal models, a line of work often described as "Any-to-Any" generation. It is aimed at researchers and developers who want to combine multimodal understanding and generation in a single framework, and it serves as a central index of progress in a fast-moving field.

How It Works

Unified multimodal models aim to replace the traditional split between separate models for multimodal understanding and for generation. A single framework both processes and generates content across modalities (text, image, audio, video, etc.), so one model can take input in one data type and produce output in another.
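
To make the "single framework" idea concrete, here is a minimal sketch of one possible design: mapping each modality's discrete tokens into a shared ID space so that one sequence model could read and emit all of them. This is an illustration under stated assumptions, not the method of any specific paper in the list. The codebook sizes and the helper names (`build_offsets`, `to_unified`, `from_unified`) are hypothetical, and models based on diffusion or other approaches work differently.

```python
# Hypothetical per-modality codebook sizes (assumed for illustration only).
CODEBOOK_SIZES = {"text": 1000, "image": 512, "audio": 256}


def build_offsets(sizes):
    """Assign each modality a contiguous, non-overlapping ID range."""
    offsets = {}
    start = 0
    for name, size in sizes.items():
        offsets[name] = (start, start + size)
        start += size
    return offsets


OFFSETS = build_offsets(CODEBOOK_SIZES)


def to_unified(modality, local_ids):
    """Shift modality-local token IDs into the shared ID space."""
    lo, hi = OFFSETS[modality]
    unified = []
    for local_id in local_ids:
        if not 0 <= local_id < hi - lo:
            raise ValueError(f"{local_id} is out of range for {modality}")
        unified.append(lo + local_id)
    return unified


def from_unified(token_id):
    """Recover (modality, local ID) from a shared-space token ID."""
    for name, (lo, hi) in OFFSETS.items():
        if lo <= token_id < hi:
            return name, token_id - lo
    raise ValueError(f"{token_id} is outside the shared vocabulary")


# One interleaved sequence that mixes text, image, and audio tokens.
sequence = (
    to_unified("text", [5, 7])
    + to_unified("image", [0, 511])
    + to_unified("audio", [3])
)
decoded = [from_unified(t) for t in sequence]
```

Because every token carries its modality implicitly through its ID range, the same sequence can serve as input (understanding) or as a generation target, which is the property the paragraph above describes.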

Quick Start & Requirements

This repository is a curated list of research papers and associated code, so there is no installation or execution command. Requirements depend on the individual projects linked from the list.

Highlighted Details

  • Comprehensive listing of recent (late 2023 - early 2025) unified multimodal models.
  • Covers a wide range of modalities including vision, language, audio, video, and motion.
  • Includes models focusing on various architectural approaches like diffusion, autoregression, and state space models.
  • Provides links to arXiv preprints and some conference publications.

Maintenance & Community

This project is ongoing and welcomes suggestions, new papers, and corrections. Contributions can be made by submitting a pull request with the proposed edit, or by opening an issue. Users are encouraged to star the repository if they find it useful.

Licensing & Compatibility

The repository itself is not software and does not have a license. The licensing and compatibility of individual models and codebases listed within the repository will vary and must be checked on a per-project basis.

Limitations & Caveats

This is a curated list of research papers and not a runnable software project. The "code" mentioned refers to external repositories, which may have their own dependencies, licenses, and maintenance statuses. The list is actively growing, and some entries may represent very recent or experimental work.

Health Check
Last commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 3
Issues (30d): 1
Star History
113 stars in the last 90 days
