MLLM resource list, covering papers, datasets, and benchmarks
This repository serves as a comprehensive, curated list of recent advancements in Multimodal Large Language Models (MLLMs). It aims to provide researchers and practitioners with an up-to-date overview of papers, datasets, and benchmarks in this rapidly evolving field.
How It Works
The repository organizes MLLM research into key areas: multimodal instruction tuning, hallucination mitigation, in-context learning, chain-of-thought reasoning, LLM-aided visual reasoning, foundation models, and evaluation benchmarks. Each paper entry links to its arXiv page and, where available, its GitHub repository; datasets and evaluation benchmarks are listed alongside the papers.
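Because entries follow a consistent pattern (a section heading followed by bullet items linking to arXiv and GitHub), the list lends itself to programmatic use. Below is a minimal sketch of extracting those links per section; the sample markdown, section names, and `index_by_section` helper are illustrative assumptions, not part of the actual repository.

```python
import re

# Hypothetical sample in the awesome-list style: "## " section headings
# followed by bullet entries linking to arXiv pages and GitHub repositories.
SAMPLE = """
## Multimodal Instruction Tuning
- **LLaVA** [[paper]](https://arxiv.org/abs/2304.08485) [[code]](https://github.com/haotian-liu/LLaVA)

## Evaluation Benchmarks
- **MME** [[paper]](https://arxiv.org/abs/2306.13394)
"""

def index_by_section(markdown: str) -> dict[str, list[str]]:
    """Group every arXiv/GitHub URL under the '## ' heading it appears beneath."""
    index: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            index[current] = []
        elif current is not None:
            # Capture URLs up to (but not including) the closing parenthesis.
            index[current].extend(
                re.findall(r"https://(?:arxiv\.org|github\.com)/\S+?(?=\))", line)
            )
    return index

index = index_by_section(SAMPLE)
```

A script like this could, for example, count papers per category or check links for rot; the real list's formatting may differ in details, so the regex would need adjusting against the actual README.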
Quick Start & Requirements
This repository is a curated list and does not require installation or specific software. It serves as a reference guide.
Maintenance & Community
The repository is actively maintained, with frequent updates reflecting the latest research. Community contributions are encouraged.
Licensing & Compatibility
The repository itself is a collection of links and descriptive information and does not impose a license on the works it references. Each linked paper and code repository carries its own license, which should be checked before reuse.
Limitations & Caveats
Because the list tracks a fast-moving field, many entries link to preprints on arXiv that have not yet undergone peer review. And despite its breadth, the sheer volume of MLLM research means some emerging papers will inevitably be missed.