Chinese-Mixtral by ymcui

Chinese-Mixtral: MoE LLMs for the Chinese language

Created 1 year ago · 604 stars · Top 55.0% on sourcepulse

Project Summary

This project provides Chinese-language versions of Mistral AI's Mixtral models, specifically a base model (Chinese-Mixtral) and an instruction-tuned variant (Chinese-Mixtral-Instruct). It targets researchers and developers who need high-performance LLMs for Chinese text processing, offering strong long-context understanding, mathematical reasoning, and code generation, along with efficient deployment options.

How It Works

The project leverages Mistral AI's sparse Mixture-of-Experts (MoE) architecture, in which a router activates 2 of 8 experts for each token. Chinese-Mixtral is an incremental pre-training of Mixtral-8x7B-v0.1 on large-scale unlabeled Chinese data; Chinese-Mixtral-Instruct is further fine-tuned on instruction datasets. The MoE design gives the model a large total parameter count while activating only about 13B parameters per token, keeping inference efficient, and the models natively support a 32K context, tested up to 128K. A conceptual sketch of top-2 expert routing follows.
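
The routing idea can be illustrated with a small, self-contained sketch. This is not the project's or Mixtral's actual implementation (Mixtral's experts are gated SwiGLU feed-forward blocks and its router includes additional details); it only shows how a top-2 gate dispatches each token to two of eight expert MLPs and mixes their outputs.

```python
# Conceptual top-2 MoE routing sketch (illustrative only, not the
# Chinese-Mixtral/Mixtral implementation). Requires PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model=4096, d_ff=14336, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        # Simplified experts: plain 2-layer MLPs (Mixtral uses gated SwiGLU blocks).
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                          # x: (num_tokens, d_model)
        logits = self.gate(x)                      # (num_tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                # each token visits only top_k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e              # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out
```

Only the two selected experts run for a given token, which is why per-token compute stays close to that of a ~13B-parameter dense model even though the full parameter count is much larger.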

Quick Start & Requirements

  • Installation: Download pre-trained models (full, LoRA, or GGUF formats) from Hugging Face, ModelScope, or Baidu Netdisk. Deploy via llama.cpp, transformers, vLLM, text-generation-webui, etc. (a minimal transformers loading sketch follows this list).
  • Prerequisites: Vary by deployment method; llama.cpp needs as little as 16GB of RAM/VRAM for quantized models, and GPU acceleration is recommended for best performance.
  • Resources: Full models are ~87GB, LoRA weights ~2.4GB; quantized GGUF versions significantly reduce the memory footprint.
  • Docs: project documentation and Model Arena.
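
For deployment via Hugging Face transformers, the sketch below shows one plausible loading path. The repo id, chat-style prompt format, and dtype settings are assumptions; check the project's download tables and documentation for the exact names and recommended settings.

```python
# Hedged sketch: loading an instruction-tuned Chinese-Mixtral model with
# Hugging Face transformers. The repo id below is an assumption; use the
# id listed in the project's model download table.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-mixtral-instruct"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # or use 4-bit loading (bitsandbytes) to cut memory further
    device_map="auto",           # requires the accelerate package
)

# Assumes the Mixtral [INST] ... [/INST] chat format; the prompt asks
# (in Chinese) for a three-sentence introduction to MoE models.
prompt = "[INST] 请用三句话介绍混合专家（MoE）模型。 [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```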

Highlighted Details

  • Native 32K context support, tested up to 128K.
  • Quantized versions (e.g., Q4_0) require as little as 16GB RAM/VRAM for inference via llama.cpp (see the sketch after this list).
  • Achieves competitive scores on Chinese benchmarks like C-Eval and CMMLU, and performs well on LongBench for long-context tasks.
  • Provides training and fine-tuning scripts for users to adapt or further train the models.
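
As a concrete illustration of the low-resource path, below is a sketch using the llama-cpp-python bindings to run a Q4_0 GGUF file. The file name and generation settings are placeholders; point model_path at whichever quantized GGUF you downloaded.

```python
# Hedged sketch: CPU/GPU inference on a Q4_0 GGUF quantization via the
# llama-cpp-python bindings (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="chinese-mixtral-instruct.Q4_0.gguf",  # placeholder path to your GGUF file
    n_ctx=4096,        # context window; can be raised toward 32K if memory allows
    n_gpu_layers=-1,   # offload all layers to GPU; set to 0 for CPU-only inference
)

# Prompt asks (in Chinese) for a one-sentence summary of what an MoE model is,
# again assuming the Mixtral [INST] ... [/INST] chat format.
result = llm("[INST] 用一句话总结什么是混合专家模型。 [/INST]", max_tokens=128)
print(result["choices"][0]["text"])
```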

Maintenance & Community

  • Active development with regular updates (e.g., GGUF quantization, API deployment).
  • Community support via GitHub Issues and Discussions.
  • Related projects include Chinese-LLaMA-Alpaca series.

Licensing & Compatibility

  • Based on Mistral AI's Mixtral model (Apache 2.0); users must adhere to its license.
  • Third-party code licenses must also be followed.
  • Commercial use requires adherence to local laws and ensuring output compliance; no liability is assumed by the project.

Limitations & Caveats

The project builds on the base Mixtral model and inherits its architectural characteristics. Model output accuracy is not guaranteed and can be affected by computational precision, sampling randomness, and quantization. Users are responsible for ensuring that model outputs comply with applicable requirements in commercial applications.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 90 days
