huozi by HIT-SCIR

LLM for research, focused on Chinese language understanding and generation

created 2 years ago
393 stars

Top 74.3% on sourcepulse

Project Summary

HIT-SCIR/huozi provides the Huozi series of large language models, designed for research and practical applications in natural language processing. The latest version, Huozi 3.5, offers enhanced performance in Chinese and English knowledge, mathematical reasoning, code generation, and instruction following, targeting researchers and developers working with LLMs.

How It Works

Huozi 3.5 is a Sparse Mixture-of-Experts (SMoE) model with 46.7B total parameters, of which only about 13B are activated per forward pass, keeping inference efficient. Development proceeded in stages: Mixtral-8x7B was extended with a Chinese vocabulary and incrementally pre-trained, then instruction fine-tuned to produce Huozi 3.0; further fine-tuning on a proprietary dataset with BPE Dropout strengthened instruction following, and model fusion followed by a final round of instruction fine-tuning yielded Huozi 3.5. This multi-stage approach aims to balance broad knowledge with strong task-specific capabilities and safety.
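
To illustrate the SMoE idea, here is a minimal PyTorch sketch of the top-2 routing pattern popularized by Mixtral. The layer sizes, expert structure, and names are illustrative assumptions, not the actual Huozi/Mixtral implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Illustrative top-2 sparse mixture-of-experts layer.

    All experts are held in memory, but a learned router sends each
    token to only two of them, so only a fraction of the total
    parameters participate in any single forward pass.
    """

    def __init__(self, dim: int = 512, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (num_tokens, dim)
        gate_logits = self.router(x)                      # (num_tokens, num_experts)
        weights, expert_ids = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = expert_ids[:, slot] == e           # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

With 8 experts and top-2 routing, only a fraction of the expert parameters run per token, which is the same mechanism behind the 46.7B-total / ~13B-active ratio of the full model.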

Quick Start & Requirements

  • Install/Run: Use the Hugging Face transformers or modelscope libraries; the repository provides example Python code for inference (a minimal sketch follows this list).
  • Prerequisites: PyTorch, transformers, modelscope, and vLLM (for accelerated inference). A GPU with sufficient VRAM is recommended (roughly 88GB for the full model, less for quantized versions); CUDA support is beneficial.
  • Resources: Model weights are large (88GB+). Quantized versions (e.g., GGUF with q2_k) can reduce VRAM requirements significantly (tested down to 9.6GB with 16 layers offloaded); see the llama-cpp-python sketch after this list.
  • Links: Hugging Face, ModelScope, vLLM Inference, OpenAI API Deployment, llama.cpp GGUF
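
A minimal transformers inference sketch, assuming the Hugging Face model ID HIT-SCIR/huozi3.5 and that the tokenizer ships a chat template (verify both against the repository):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HIT-SCIR/huozi3.5"  # assumed model ID; check the repo for the exact name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # full-precision weights need ~88GB of VRAM
    device_map="auto",           # spread layers across available GPUs
)

messages = [{"role": "user", "content": "请用一句话介绍哈尔滨工业大学。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For the quantized GGUF path, a llama-cpp-python sketch; the file path is a placeholder, and n_gpu_layers=16 mirrors the partial-offload configuration tested at 9.6GB of VRAM:

```python
from llama_cpp import Llama

# Path is a placeholder; download a q2_k GGUF of Huozi 3.5 first.
llm = Llama(
    model_path="./huozi3.5-q2_k.gguf",
    n_gpu_layers=16,  # offload 16 layers to the GPU (~9.6GB VRAM per the README)
    n_ctx=4096,       # context window for this session; the model supports up to 32K
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "介绍一下活字大模型。"}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```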

Highlighted Details

  • Supports 32K context length.
  • Compatible with various inference frameworks: Transformers, vLLM, llama.cpp, Ollama, Text Generation Web UI.
  • Can be deployed as an OpenAI-compatible API server (see the client sketch after this list).
  • Performance benchmarks provided on C-Eval, CMMLU, GAOKAO, MMLU, HellaSwag, GSM8K, HumanEval, and MT-Bench (including a Chinese version, MT-Bench-zh).
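
For the OpenAI-compatible deployment, a client call might look like the following. The base URL, placeholder API key, and served model name are assumptions for a typical local vLLM-backed server:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local server; base_url, api_key
# placeholder, and model name are assumptions for a typical deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="HIT-SCIR/huozi3.5",  # must match the name the server was launched with
    messages=[{"role": "user", "content": "用中文写一首关于春天的短诗。"}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```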

Maintenance & Community

  • Developed by HIT-SCIR (Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology).
  • Active development with releases of Huozi 1.0, 2.0, 3.0, and 3.5.
  • Includes a Chinese MT-Bench dataset and a ChatGPT research report.

Licensing & Compatibility

  • Source code licensed under Apache 2.0.
  • Model usage for commercial purposes requires contacting the licensor (jngao@ir.hit.edu.cn) for registration and written authorization.

Limitations & Caveats

  • Models may still generate factually incorrect or biased/harmful content; users should exercise caution and not disseminate harmful outputs.
  • Specific performance claims are based on particular evaluation methodologies and frameworks (e.g., OpenCompass commit hash 4c87e77).

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days
