LLM training code for Databricks foundation models
LLM Foundry provides a comprehensive toolkit for training, fine-tuning, evaluating, and deploying Large Language Models (LLMs) using the Composer library and the MosaicML platform. It's designed for researchers and engineers looking to rapidly experiment with LLM techniques, offering support for models ranging from 125M to 70B parameters, including the state-of-the-art DBRX and MPT series.
How It Works
The codebase is structured around modular scripts for data preparation, training, inference, and evaluation. It leverages Composer for efficient distributed training and integrates features like Flash Attention and ALiBi for performance and extended context lengths. The architecture supports customization through a registry system, allowing users to register new models, loggers, and callbacks without forking the repository.
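As a rough illustration of that extension flow, the sketch below implements a minimal decorator-style registry in plain Python; the `Registry` class, the `callbacks` instance, and the `print_loss` name are hypothetical stand-ins for illustration, not LLM Foundry's actual `llmfoundry.registry` objects.

```python
from typing import Callable, Dict, Type


class Registry:
    """Maps string names (as they might appear in a training config) to Python classes."""

    def __init__(self, kind: str):
        self.kind = kind
        self._entries: Dict[str, Type] = {}

    def register(self, name: str) -> Callable[[Type], Type]:
        # Decorator that records a class under a config-friendly name.
        def decorator(cls: Type) -> Type:
            self._entries[name] = cls
            return cls
        return decorator

    def build(self, name: str, **kwargs):
        # Resolve a registered name back to an instance, passing config kwargs through.
        if name not in self._entries:
            raise KeyError(f'Unknown {self.kind}: {name!r}')
        return self._entries[name](**kwargs)


callbacks = Registry('callback')


@callbacks.register('print_loss')
class PrintLossCallback:
    """Toy callback resolved by name from a training config."""

    def __init__(self, every_n_steps: int = 10):
        self.every_n_steps = every_n_steps


# A config entry referring to 'print_loss' is resolved to an instance
# without modifying library source code:
cb = callbacks.build('print_loss', every_n_steps=50)
print(type(cb).__name__, cb.every_n_steps)
```

The registry system in the repository follows this general shape, so custom models, loggers, and callbacks can be referenced from training configs by registered name instead of requiring a fork.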
Quick Start & Requirements
To get started, `git clone` the repo, `cd llm-foundry`, and `pip install -e ".[gpu]"`.
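As an optional smoke test after installation, one of the released MPT checkpoints can be loaded through Hugging Face transformers. This is a hedged sketch (it assumes the `transformers` package, network access, and enough memory for a 7B model); the checkpoint and tokenizer names follow the public MPT-7B model card rather than anything specific to this repo.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# MPT checkpoints ship custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')  # MPT reuses the GPT-NeoX tokenizer
model = AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)

inputs = tokenizer('MosaicML is', return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```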
Highlighted Details
Maintenance & Community
The project is actively maintained by Databricks Mosaic. Community support is available via GitHub issues; contact demo@mosaicml.com for MosaicML platform inquiries.
Licensing & Compatibility
The LLM Foundry code is released under the Apache 2.0 license; model weights carry their own licenses (e.g., the Databricks Open Model License for DBRX), most of which permit both research and commercial use. Some MPT models have specific commercial-use restrictions (e.g., MPT-30B-Chat, MPT-7B-8k-Chat).
Limitations & Caveats
Experimental support for AMD GPUs may require package version adjustments. Intel Gaudi support is also experimental and requires a specific branch. The README notes that non-Docker setups are not recommended.