Training and inference code for BLOOMChat, a 176B multilingual chat model
BLOOMChat provides the code and methodology for instruction-tuning the 176 billion parameter BLOOM model into a multilingual conversational AI. It is targeted at researchers and developers interested in replicating or adapting large-scale chat model training and deployment. The project offers a path to a powerful, open-source conversational agent.
How It Works
BLOOMChat is instruction-tuned from the base BLOOM model on a curated mix of conversational datasets, including OpenChatKit's OIG, Dolly 2.0, and OASST1. The training process, detailed in the training directory, was performed on SambaNova DataScale systems leveraging their proprietary Reconfigurable Dataflow Unit (RDU) hardware. While the training code is specific to that hardware, the inference code is adapted for standard GPU setups using Hugging Face's transformers-bloom-inference repository.
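As a concrete illustration of what "multilingual conversational AI" means at the interface level, the sketch below builds a dialogue prompt using the "<human>:" / "<bot>:" turn markers that BLOOMChat's model card uses for chat formatting. This is a hedged sketch: the helper function and its exact newline layout are assumptions, so verify the template against the released model card before relying on it.

```python
# Hedged sketch of BLOOMChat-style prompt assembly. The "<human>:" / "<bot>:"
# markers follow the model card's examples; build_prompt itself is illustrative.
def build_prompt(turns: list[tuple[str, str]], user_message: str) -> str:
    """Assemble a chat prompt from (user, bot) history plus a new user turn."""
    parts = []
    for user, bot in turns:
        parts.append(f"<human>: {user}")
        parts.append(f"<bot>: {bot}")
    parts.append(f"<human>: {user_message}")
    parts.append("<bot>:")  # the model generates its reply after this marker
    return "\n".join(parts)

print(build_prompt([], "What is the capital of France?"))
# → <human>: What is the capital of France?
#   <bot>:
```

The completed prompt is then fed to the model as plain text; generation is stopped when the model emits the next "<human>:" marker.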
Quick Start & Requirements
Clone huggingface/transformers-bloom-inference, install dependencies via pipenv, and modify specific files (hf_accelerate.py, cli.py) as described in the README. Requirements: pipenv, deepspeed (for inference), and potentially multiple A100 GPUs (80 GB recommended) for efficient inference.
Highlighted Details
Inference is adapted from Hugging Face's transformers-bloom-inference repository.
Maintenance & Community
The project acknowledges contributions from SambaNova Systems and Together Computer. Further details on community engagement or roadmap are not explicitly provided in the README.
Licensing & Compatibility
The model weights are available via Hugging Face. The code repository itself does not explicitly state a license, but it is associated with SambaNova Systems. Compatibility for commercial use or closed-source linking would require clarification on the licensing terms of the code and model weights.
Limitations & Caveats
The training code is specific to SambaNova's RDU hardware and is not directly usable on standard GPUs. The README notes that the training data mix may not be exactly reproducible with the current OIG dataset from OpenChatKit, with updates promised. Inference with int8 quantization is noted as suboptimal compared to bf16.