DLLM-Survey by LiQiiiii

Survey of discrete diffusion models for LLMs and multimodal applications

Created 8 months ago

365 stars

Top 77.5% on SourcePulse

Project Summary

This repository provides a comprehensive survey of discrete diffusion models applied to large language models (LLMs) and multimodal models. It consolidates research in this rapidly evolving field, offering a structured overview of key concepts, techniques, and applications for researchers and practitioners interested in generative AI.

How It Works

The survey groups discrete diffusion models by core methodology, including discrete denoising diffusion probabilistic models, reparameterized discrete diffusion models, and concrete score matching. It covers training techniques such as initialization, masking strategies, and methods for addressing training-testing discrepancies, as well as inference techniques such as unmasking, remasking, prefilling, and caching. It also covers guidance techniques and organizes applications into text generation, editing, summarization, sentiment analysis, knowledge reasoning, and multimodal tasks.
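As a rough illustration of the masking and unmasking ideas named above, here is a toy Python sketch. It is not taken from the survey or its repository. It assumes an absorbing-state (mask) corruption process for training and a confidence-ranked unmasking schedule for inference. All names (`MASK`, `TARGET`, `forward_mask`, `toy_predict`, `unmask_step`, `generate`) are hypothetical, and `toy_predict` is a stand-in for a trained denoiser that simply looks up a fixed target sentence. Remasking, prefilling, and caching are not shown.

```python
import random

MASK = "[MASK]"  # hypothetical absorbing "mask" token
TARGET = ["discrete", "diffusion", "unmasks", "tokens", "in", "parallel"]


def forward_mask(tokens, t, rng):
    """Corrupt a sequence: each token independently becomes MASK with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def toy_predict(seq, i):
    """Stand-in for a trained denoiser (assumption, not a real model).

    Returns (predicted_token, confidence) for position i. The made-up
    confidence grows with the number of already revealed neighbours.
    """
    neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(seq)]
    revealed = sum(seq[j] != MASK for j in neighbours)
    return TARGET[i], (revealed + 1) / 3


def unmask_step(seq, predict, k):
    """Reveal the k masked positions whose predictions are most confident."""
    masked = [i for i, tok in enumerate(seq) if tok == MASK]
    preds = {i: predict(seq, i) for i in masked}
    # sorted() is stable, so ties are broken by position (left to right).
    chosen = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
    out = list(seq)
    for i in chosen:
        out[i] = preds[i][0]
    return out


def generate(length, predict, steps):
    """Start fully masked and unmask several tokens per step."""
    seq = [MASK] * length
    per_step = -(-length // steps)  # ceiling division
    for _ in range(steps):
        if MASK not in seq:
            break
        seq = unmask_step(seq, predict, per_step)
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    print(forward_mask(TARGET, 0.5, rng))   # a partially masked training input
    print(generate(len(TARGET), toy_predict, steps=3))
```

Unlike left-to-right autoregressive decoding, each call to `unmask_step` can reveal several positions at once, which is the property that the inference techniques listed above build on.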

Quick Start & Requirements

This is a survey paper, not a software repository. No installation or execution is required. The primary resource is the linked arXiv paper.

Highlighted Details

Comprehensive coverage of discrete diffusion models in LLMs and multimodal contexts.
Detailed breakdown of training and inference techniques.
Extensive categorization of applications, from text generation to biological discovery.
Includes a timeline of dLLMs and dMLLMs and links to relevant papers.

Maintenance & Community

The repository is maintained by the authors of the survey paper. Contributions for adding new papers or updating details are welcomed via Pull Requests or email.

Licensing & Compatibility

The repository content is provided for informational purposes. This summary does not confirm a license: the survey paper is likely available under a Creative Commons or similar license, as is common for academic preprints, but that should be checked against the paper's own listing. Licensing for any code snippets or linked resources depends on their original sources.

Limitations & Caveats

As a survey, this repository does not provide executable code or models. The field is advancing rapidly, so recent research may not yet be reflected. The citation gives 2025 as the publication year.
