kuleshov-group/mdlm: Research paper for the Masked Diffusion Language Model (MDLM)
Top 58.5% on SourcePulse
This repository introduces MDLM, a Masked Diffusion Language Model that achieves state-of-the-art perplexity among diffusion language models on standard language-modeling benchmarks by simplifying the diffusion loss to a weighted mixture of masked language modeling losses. It is aimed at researchers and practitioners in natural language processing and generative modeling.
How It Works
MDLM uses a substitution-based (SUBS) parameterization for discrete diffusion models. This parameterization reformulates the reverse diffusion process so that the absorbing-state diffusion loss can be written as a weighted mixture of classical masked language modeling losses, yielding simpler and more efficient training and inference than prior diffusion language models.
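To make the reformulation concrete, here is a minimal PyTorch sketch of such an objective under an assumed linear noise schedule (alpha_t = 1 - t); `model` and `mask_id` are placeholders, and the repository's actual implementation differs in schedule handling and loss-weighting details.

```python
import torch
import torch.nn.functional as F

def mdlm_loss(model, x, mask_id, eps=1e-3):
    """Masked diffusion loss as a weighted mixture of MLM losses.

    Sketch only: assumes a linear schedule alpha_t = 1 - t, under which
    the ELBO weight reduces to 1/t on masked positions.
    """
    b, seq_len = x.shape
    # Sample one diffusion time per sequence; eps keeps the 1/t weight finite.
    t = eps + (1 - eps) * torch.rand(b, device=x.device)
    # Absorbing-state forward process: mask each token independently w.p. t.
    masked = torch.rand(b, seq_len, device=x.device) < t[:, None]
    z_t = torch.where(masked, torch.full_like(x, mask_id), x)
    logits = model(z_t)  # (b, seq_len, vocab_size)
    ce = F.cross_entropy(logits.transpose(1, 2), x, reduction="none")
    # Only masked positions contribute, each weighted by 1/t.
    return (ce * masked.float() / t[:, None]).sum(dim=1).mean()
```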
Quick Start & Requirements
- Create the environment: `conda env create -f requirements.yaml`, then `conda activate mdlm`.
- Create output directories: `mkdir outputs` and `mkdir watch_folder`.
- Launch training: `sbatch scripts/train_owt_mdlm.sh`.
- A pretrained OpenWebText checkpoint is available on the Hugging Face Hub as `kuleshov-group/mdlm-owt` (see the loading sketch below).
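A minimal loading sketch for the pretrained checkpoint; `trust_remote_code=True` is assumed to be required because the checkpoint ships a custom model class.

```python
from transformers import AutoModelForMaskedLM

# Sketch: load the pretrained OpenWebText checkpoint from the Hugging Face Hub.
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/mdlm-owt", trust_remote_code=True
)
```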
Highlighted Details
Provides a cached sampler (`ddpm_cache`) that is ~3-4x faster than existing diffusion model samplers.
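The speedup comes from caching: whenever no masked token is revealed between ancestral-sampling steps, the denoiser's input is unchanged, so its output can be reused instead of recomputed. A hedged sketch of this idea (not the repository's `ddpm_cache` code; `model` and `mask_id` are placeholders, and a linear schedule is assumed):

```python
import torch

@torch.no_grad()
def cached_ancestral_sample(model, z, mask_id, num_steps):
    """Caching trick behind the fast sampler: skip the network forward
    pass whenever the sequence did not change at the previous step."""
    cached_probs = None
    for i in range(num_steps, 0, -1):
        t, s = i / num_steps, (i - 1) / num_steps
        if cached_probs is None:
            cached_probs = model(z).softmax(dim=-1)  # one forward pass
        # Under a linear schedule, each masked token is revealed w.p. (t - s) / t.
        reveal = (z == mask_id) & (torch.rand(z.shape, device=z.device) < (t - s) / t)
        if reveal.any():
            draws = torch.distributions.Categorical(probs=cached_probs).sample()
            z = torch.where(reveal, draws, z)
            cached_probs = None  # sequence changed; cache is stale
    return z
```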
Maintenance & Community
The project is associated with the Kuleshov Group at Cornell Tech (Cornell University). The README notes that an improved implementation is available in the DUO GitHub repo.
Licensing & Compatibility
The repository does not explicitly state a license. However, the project is presented as a research artifact from NeurIPS 2024, implying a focus on academic use. Commercial use would require clarification.
Limitations & Caveats
The project is research code accompanying a NeurIPS 2024 paper and may be subject to ongoing development and refinement. The authors note that an improved implementation is available in the DUO repository.