Toolkit for inference/evaluation of Mistral AI's 'mixtral-8x7b-32kseqlen'
MixtralKit provides a toolkit for efficient inference and evaluation of the Mixtral-8x7B-32Kseqlen model. It is designed for researchers and developers working with large language models, offering a streamlined way to deploy and benchmark this specific Mixture-of-Experts (MoE) architecture.
How It Works
MixtralKit leverages a Mixture-of-Experts (MoE) architecture, where the Feed-Forward Network (FFN) layer in standard transformer blocks is replaced by an MoE FFN. This MoE FFN uses a gating layer to select the top-k out of 8 experts for each token, enabling sparse activation and potentially more efficient computation. The model utilizes RMSNorm, similar to LLaMA, and features specific QKV matrix shapes for its attention layers.
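To make the routing concrete, the following is a minimal PyTorch sketch of a top-k gated MoE FFN in the style described above (8 experts, top-2 routing over SwiGLU experts). It is illustrative only and does not reproduce MixtralKit's actual code; class names, parameter names, and sizes (SwiGLUExpert, MoEFFN, hidden_dim) are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUExpert(nn.Module):
    """One expert: a LLaMA-style SwiGLU feed-forward block."""
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class MoEFFN(nn.Module):
    """Replaces the dense FFN: a gating layer routes each token to top_k of num_experts."""
    def __init__(self, dim, hidden_dim, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(SwiGLUExpert(dim, hidden_dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x):
        tokens = x.reshape(-1, x.shape[-1])                 # (num_tokens, dim)
        logits = self.gate(tokens)                          # (num_tokens, num_experts)
        weights, selected = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                # normalize over the chosen experts only
        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = selected[:, slot] == e               # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(tokens[mask])
        return out.view_as(x)

Only the selected experts run for a given token, which is why the per-token compute stays close to that of a dense model with a single FFN of the same expert size.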
Quick Start & Requirements
Create a conda environment with PyTorch:

conda create --name mixtralkit python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y

Activate the environment, clone the repository, and install the package:

pip install -r requirements.txt
pip install -e .

Link the downloaded checkpoints into the working directory and run the example script:

ln -s path/to/checkpoints_folder/ ckpts
python tools/example.py -m ./ckpts -t ckpts/tokenizer.model --num-gpus 2
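For repeated runs it can be convenient to script the documented command. The sketch below simply wraps the quick-start invocation with the standard-library subprocess module; the default paths and GPU count mirror the example above and are placeholders, and this helper is not part of MixtralKit itself.

import subprocess

def run_example(ckpt_dir="./ckpts", tokenizer="ckpts/tokenizer.model", num_gpus=2):
    """Invoke tools/example.py exactly as shown in the quick start."""
    cmd = [
        "python", "tools/example.py",
        "-m", ckpt_dir,
        "-t", tokenizer,
        "--num-gpus", str(num_gpus),
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_example()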
Maintenance & Community
The project is associated with the OpenCompass initiative. Links to relevant resources, such as MoE blog posts and papers, are provided in the README.
Licensing & Compatibility
The repository is licensed under Apache 2.0, allowing for commercial use and integration with closed-source projects.
Limitations & Caveats
This is described as an experimental implementation. The README focuses on Mixtral-8x7B-32Kseqlen, and compatibility with other Mixtral variants or models is not explicitly stated. Evaluation setup requires a separate installation of OpenCompass.