ZHZisZZ/dLLM: Framework for diffusion language modeling
Top 47.6% on SourcePulse
dLLM is a library that unifies the training and evaluation of diffusion language models, with the goal of improving transparency and reproducibility across the development pipeline. It targets researchers and engineers working with diffusion language models, offering scalable training and streamlined evaluation to simplify the development and deployment of models such as LLaDA and Dream, and enabling novel applications such as instruction-tuned BERT chatbots and edit-aware language generation.
How It Works
The library provides scalable training pipelines, drawing inspiration from transformers.Trainer, with support for LoRA-based parameter-efficient finetuning and distributed training frameworks such as DeepSpeed and FSDP. Its core contribution is a unified evaluation pipeline, modeled after lm-evaluation-harness, which abstracts complex inference details for easier customization and benchmarking. This integrated approach yields minimal pretraining, finetuning, and evaluation recipes for open-weight models and implements advanced training algorithms such as Edit Flows, letting researchers experiment with extensions to generative models.
Quick Start & Requirements
Installation involves creating a Python 3.10 Conda environment, installing PyTorch with CUDA 12.4 (other versions may be compatible), and then installing the dLLM package in editable mode (pip install -e .). Optional evaluation setup requires initializing the lm-evaluation-harness submodule and installing its dependencies. Slurm users need to configure scripts/train.slurm.sh for their specific cluster environment.
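The commands below sketch that setup. Exact package versions, the CUDA wheel index, and the submodule path are assumptions inferred from the description above, not the repository's verbatim instructions.

```bash
# Minimal setup sketch based on the steps described above; versions and paths
# are assumptions and may differ from the repository's own README.
conda create -n dllm python=3.10 -y
conda activate dllm

# PyTorch built against CUDA 12.4 (other CUDA versions may also be compatible).
pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install the dLLM package in editable mode from the repository root.
pip install -e .

# Optional: evaluation support via the lm-evaluation-harness submodule.
git submodule update --init --recursive
pip install -e ./lm-evaluation-harness   # submodule path is an assumption
```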
Highlighted Details
Maintenance & Community
The provided README does not contain specific details regarding maintainers, community channels (e.g., Discord, Slack), or project roadmaps.
Licensing & Compatibility
The README does not specify a software license. This absence requires clarification for any potential adoption, especially concerning commercial use or integration into closed-source projects.
Limitations & Caveats
The README does not explicitly state any limitations, alpha status, known bugs, or unsupported platforms. The EditFlow examples are described as an "educational reference," suggesting a focus on learning and experimentation rather than immediate production deployment.