Curated list of self-supervised multimodal learning resources
This repository is a curated list of resources for self-supervised multimodal learning (SSML), aimed at AI researchers and practitioners. It offers a comprehensive overview of the field, covering challenges such as learning from unlabeled multimodal data, fusing heterogeneous modalities, and handling unaligned data, with the goal of pushing AI capabilities beyond the limits of supervised learning.
How It Works
The repository organizes SSML approaches into key learning paradigms: Instance Discrimination (contrastive learning or matching prediction to align representations across modalities), Clustering (deriving pseudo-labels from iterative clustering to supervise training), and Masked Prediction (auto-encoding or auto-regressive reconstruction of masked inputs). It also covers hybrid methods and applications across domains such as healthcare, remote sensing, and autonomous driving; a minimal sketch of the contrastive variant appears below.
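To make the instance-discrimination paradigm concrete, here is a minimal sketch of a CLIP-style contrastive alignment objective between two modalities. It assumes pre-computed encoder outputs; the function name, embedding dimension, and batch size are illustrative placeholders rather than details taken from any specific paper in the list.

```python
# Minimal sketch of multimodal instance discrimination (contrastive alignment).
# Hypothetical names and shapes; real methods in the list differ in encoders,
# augmentations, and negative sampling.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched image/text pairs are positives,
    all other pairs in the batch serve as negatives."""
    # L2-normalize each modality so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j).
    logits = image_emb @ text_emb.t() / temperature

    # The i-th image matches the i-th text, so targets lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy losses.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Example with random features standing in for encoder outputs.
image_emb = torch.randn(8, 512)   # batch of 8 "image" embeddings
text_emb = torch.randn(8, 512)    # batch of 8 matched "text" embeddings
print(contrastive_alignment_loss(image_emb, text_emb))
```

The symmetric cross-entropy over the similarity matrix is what pulls matched cross-modal pairs together while pushing apart every other pairing in the batch.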
Quick Start & Requirements
This is a curated list, not a runnable codebase. It requires no installation. The primary value is in the comprehensive collection of papers, datasets, and code repositories related to SSML.
Highlighted Details
Maintenance & Community
The project is maintained by Yongshuo Zong, Oisin Mac Aodha, and Timothy Hospedales, authors of the associated survey paper. Contributions are welcomed via Pull Requests.
Licensing & Compatibility
The repository itself is a list of links and does not have a specific license. Individual linked papers and code repositories will have their own licenses, which must be checked for compatibility with commercial or closed-source use.
Limitations & Caveats
As a curated list, it does not provide runnable code or pre-trained models. Because SSML is evolving rapidly, the list may not be exhaustive or fully up-to-date without ongoing community contributions.