Large language model for research/commercial use
Top 18.6% on sourcepulse
DBRX is an open-source Mixture-of-Experts (MoE) large language model developed by Databricks, with 132B total parameters of which 36B are active for any given input. It is intended for both research and commercial use and is released in two variants, the pre-trained DBRX Base and the instruction-finetuned DBRX Instruct, each with a 32K-token context length.
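As a minimal sketch (not part of this repo), the snippet below shows how a prompt is prepared for the instruction-following variant with Hugging Face transformers. It assumes the model IDs databricks/dbrx-base and databricks/dbrx-instruct; both repos are gated, so a Hugging Face access token is required.

```python
# Sketch only: preparing a chat prompt for DBRX Instruct with transformers.
# Assumed model IDs: databricks/dbrx-base (pre-trained), databricks/dbrx-instruct
# (instruction-following). Both are gated; run `huggingface-cli login` first.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct", trust_remote_code=True)

messages = [{"role": "user", "content": "Summarize the DBRX architecture in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# The prompt plus any generated tokens must fit inside the 32K-token context window.
print(f"{input_ids.shape[-1]} prompt tokens (context limit: 32768)")
```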
How It Works
DBRX uses a Mixture-of-Experts (MoE) transformer architecture with 16 experts per MoE layer, of which 4 are activated for each token. This keeps per-token inference compute close to that of a ~36B dense model while the full 132B parameters provide additional capacity. The model was trained on 12 trillion tokens of text and code using Databricks' open-source training libraries: Composer, LLM Foundry, and MegaBlocks.
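For illustration only (this is not DBRX's actual implementation, which relies on MegaBlocks' optimized MoE kernels), the PyTorch sketch below shows the top-4-of-16 routing idea: a router scores all 16 experts per token, only the 4 highest-scoring experts run, and their outputs are mixed with the renormalized router weights.

```python
import torch
import torch.nn.functional as F

def moe_forward(x, router, experts, k=4):
    """Route each token to its top-k experts and mix their outputs.

    x       : (tokens, d_model) token representations
    router  : linear layer producing one logit per expert
    experts : list of per-expert feed-forward modules
    """
    logits = router(x)                                     # (tokens, n_experts)
    weights, idx = torch.topk(logits.softmax(dim=-1), k, dim=-1)
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over the top-k
    out = torch.zeros_like(x)
    for slot in range(k):
        for e in range(len(experts)):
            mask = idx[:, slot] == e                       # tokens whose slot-th pick is expert e
            if mask.any():
                out[mask] += weights[mask, slot, None] * experts[e](x[mask])
    return out

# Toy instantiation: 16 experts, 4 active per token, as in DBRX (dimensions are illustrative).
d_model, n_experts = 64, 16
router = torch.nn.Linear(d_model, n_experts, bias=False)
experts = [torch.nn.Sequential(torch.nn.Linear(d_model, 4 * d_model),
                               torch.nn.GELU(),
                               torch.nn.Linear(4 * d_model, d_model))
           for _ in range(n_experts)]
tokens = torch.randn(8, d_model)
print(moe_forward(tokens, router, experts).shape)  # torch.Size([8, 64])
```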
Quick Start & Requirements
pip install -r requirements.txt   # or requirements-gpu.txt for GPU inference
huggingface-cli login             # the model weights are gated on Hugging Face and require a token
python generate.py
Suggested Docker image: mosaicml/llm-foundry:2.2.1_cu121_flash2-latest
A minimal Python sketch of the generation flow follows.
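The sketch below approximates the generation step (roughly what generate.py does; the exact prompt and settings live in the script itself). It assumes the databricks/dbrx-instruct checkpoint and enough GPU memory to hold about 264 GB of bf16 weights across one or more devices.

```python
# Hedged sketch of DBRX Instruct generation with transformers (not a copy of generate.py).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"  # gated repo; run `huggingface-cli login` first
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # shard the weights across available GPUs
    torch_dtype=torch.bfloat16,  # 132B params ~= 264 GB in bf16
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What is a Mixture-of-Experts model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```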
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Last updated about 1 year ago; the repository is currently inactive.