PyTorch code for masked diffusion model research paper
This repository provides the official PyTorch implementation for "Scaling up Masked Diffusion Models on Text," a research paper exploring the scalability and effectiveness of Masked Diffusion Models (MDMs) in language tasks. It targets researchers and practitioners interested in advancing text generation and understanding beyond traditional autoregressive models, offering competitive performance and unique advantages in bidirectional reasoning and temporal adaptation.
How It Works
The project implements Masked Diffusion Models (MDMs) for text, a probabilistic approach whose scaling rate is comparable to that of autoregressive models (ARMs), with a relatively small compute gap between the two. It introduces unsupervised classifier-free guidance, which leverages unpaired data for conditional inference. The architecture is designed to handle bidirectional reasoning and temporal shifts in data, addressing limitations found in ARMs.
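To make the two ideas above concrete, here is a minimal, dependency-free sketch. It is not the repository's code: the function names, the `[MASK]` token string, and the guidance weight `w` are hypothetical choices for illustration. It assumes the usual masked-diffusion forward process (each token is masked independently with probability t) and the standard classifier-free guidance combination of conditional and unconditional log-probabilities. How the repository builds the unconditional branch from unpaired data is not shown here.

```python
import math
import random

MASK = "[MASK]"  # hypothetical mask token, used only in this sketch


def forward_mask(tokens, t, rng):
    """Replace each token with MASK independently with probability t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [MASK if rng.random() < t else tok for tok in tokens]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def guided_log_probs(cond_logits, uncond_logits, w):
    """Standard CFG mix: (1 + w) * log p(x | c) - w * log p(x), renormalized.

    w = 0 recovers the conditional distribution; larger w pushes the
    result further toward tokens the condition makes more likely.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    mixed = [(1.0 + w) * c - w * u for c, u in zip(cond, uncond)]
    return log_softmax(mixed)


if __name__ == "__main__":
    rng = random.Random(0)
    print(forward_mask(["the", "cat", "sat", "down"], t=0.5, rng=rng))
    print(guided_log_probs([2.0, 0.0], [0.0, 0.0], w=1.0))
```

In a real model the two logit vectors would come from two forward passes of the same network, one with the prompt and one without it. The sketch only shows how the two outputs are combined for a single masked position.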
Quick Start & Requirements
Install dependencies with pip:
pip install lm-eval==0.4.4 numpy==1.25.0 bitsandbytes==0.43.1 openai==0.28 fschat==0.2.34 anthropic
Conda installation commands are available in CONDA.md.
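Because most of these packages are pinned, it can help to confirm that an environment actually matches the pins before running anything. The following is a small sketch using only the standard library. The helper names are hypothetical and are not part of the repository. The version comparison is a deliberate simplification that only ignores trailing ".0" components, so that "0.28" and "0.28.0" compare equal.

```python
from importlib import metadata

# Pins copied from the pip command above (anthropic is unpinned there).
PINS = {
    "lm-eval": "0.4.4",
    "numpy": "1.25.0",
    "bitsandbytes": "0.43.1",
    "openai": "0.28",
    "fschat": "0.2.34",
}


def normalize(version):
    """Drop trailing '0' components so '0.28' and '0.28.0' compare equal."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return tuple(parts)


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package that is missing or off-pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found is None or normalize(found) != normalize(expected):
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script in the target environment prints one line per package that is missing or differs from its pin, and prints nothing when everything matches.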
Highlighted Details
Maintenance & Community
The project is associated with the ICLR 2025 paper "Scaling up Masked Diffusion Models on Text." Links to specific model checkpoints and evaluation scripts are provided.
Licensing & Compatibility
The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The setup requires significant data preprocessing and potentially complex environment management (e.g., a separate Anaconda environment for FineWeb dataset preprocessing). Several dependencies are pinned to exact versions (for example, lm-eval==0.4.4 and openai==0.28), so installing into an existing environment may lead to version conflicts.