dbrx by databricks

Large language model for research/commercial use

Created 1 year ago
2,571 stars

Top 18.2% on SourcePulse

Project Summary

DBRX is an open-source Mixture-of-Experts (MoE) large language model developed by Databricks, offering 132B total parameters with 36B active parameters. It is designed for researchers and commercial entities, providing pre-trained and instruction-following variants with a 32K token context length, aiming to deliver high-quality LLM capabilities.

How It Works

DBRX utilizes a Mixture-of-Experts (MoE) architecture with 16 experts, 4 of which are activated for each input token. This approach allows for a larger total parameter count while maintaining computational efficiency during inference. The model was trained on 12 trillion tokens using Databricks' open-source libraries Composer, LLM Foundry, and MegaBlocks, which are optimized for performance and scalability.
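
A minimal, self-contained sketch of top-4 routing over 16 experts is shown below. The TinyMoELayer name, layer sizes, and gating details are illustrative assumptions and do not reproduce the actual DBRX implementation.

```python
# Illustrative sketch of top-k expert routing, loosely mirroring DBRX's
# 16-expert / 4-active MoE layer. Dimensions are toy-sized, not DBRX's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.router(x)                  # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the 4 chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # dispatch each token to its chosen experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

tokens = torch.randn(8, 64)
print(TinyMoELayer()(tokens).shape)              # torch.Size([8, 64])
```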

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt or requirements-gpu.txt.
  • Authenticate: huggingface-cli login.
  • Run inference: python generate.py (a minimal transformers-based sketch follows this list).
  • Recommended: ~320GB of memory and NVIDIA A100/H100 GPUs for optimized inference.
  • Docker image available: mosaicml/llm-foundry:2.2.1_cu121_flash2-latest.
  • Official Hugging Face page for weights and tokenizer: https://huggingface.co/collections/databricks/
  • LLM Foundry for advanced usage: https://github.com/mosaicml/llm-foundry
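
As a hedged, minimal sketch of the inference path, the following assumes the databricks/dbrx-instruct model ID on Hugging Face and illustrative generation settings; the repository's generate.py remains the reference entry point.

```python
# Sketch of loading DBRX Instruct via Hugging Face transformers.
# Model ID and settings are assumptions, not taken from the repo's script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"          # assumed Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",                         # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What is a Mixture-of-Experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```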

Highlighted Details

  • Mixture-of-Experts (MoE) architecture with 132B total / 36B active parameters.
  • 32K token context length.
  • Optimized inference supported via TensorRT-LLM and vLLM (see the vLLM sketch after this list).
  • Quantized versions available for MLX (Apple Silicon) and llama.cpp.
  • Full-parameter and LoRA finetuning supported via LLM Foundry.
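
For the vLLM path, a minimal offline-inference sketch is given below; the model ID, parallelism degree, and sampling settings are assumptions.

```python
# Sketch of serving DBRX Instruct with vLLM for optimized inference.
from vllm import LLM, SamplingParams

llm = LLM(
    model="databricks/dbrx-instruct",   # assumed Hugging Face repo ID
    tensor_parallel_size=8,             # e.g. shard across 8 A100/H100 GPUs
    trust_remote_code=True,
)
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain Mixture-of-Experts in one paragraph."], params)
print(outputs[0].outputs[0].text)
```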

Maintenance & Community

  • Model weights and code are available on Hugging Face.
  • Issues related to model output should be directed to Hugging Face community forums.
  • Issues with training libraries should be opened on the respective GitHub repositories.

Licensing & Compatibility

  • Databricks Open Model License: allows both research and commercial use, subject to its terms.
  • Acceptable Use Policy applies.

Limitations & Caveats

  • Access to the Base model requires manual approval.
  • LoRA finetuning currently cannot target the expert weights, since they are fused.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
