Chinese LLM for research, base and chat versions, 30B parameters
YAYI 2 is a 30B parameter multilingual large language model developed by wenge-research, designed to advance the Chinese LLM ecosystem. It offers Base and Chat versions: the Base model is pretrained on over 2 trillion tokens of high-quality multilingual data, and the Chat model is further fine-tuned with millions of instructions and RLHF for better alignment.
How It Works
YAYI 2 is a Transformer-based LLM. Its training corpus includes a significant portion of Chinese data, alongside other languages, processed through a rigorous data pipeline involving standardization, cleaning, deduplication, and toxicity filtering. The model utilizes a Byte-Pair Encoding (BPE) tokenizer trained on 500GB of multilingual data, with a vocabulary size of 81920, featuring digit splitting and manual addition of HTML identifiers for improved performance.
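A quick way to see these tokenizer properties is to load it through transformers and tokenize a few probe strings. This is an illustrative sketch, not code from the repository; the checkpoint id wenge-research/yayi2-30b is an assumption, so substitute whatever path the weights are actually downloaded to.

```python
# Illustrative tokenizer inspection; "wenge-research/yayi2-30b" is an assumed
# Hugging Face repo id -- point this at your local checkpoint if it differs.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("wenge-research/yayi2-30b", trust_remote_code=True)

print(len(tokenizer))                      # should report the 81920-entry vocabulary
print(tokenizer.tokenize("3.1415926"))     # digit splitting: each digit is its own token
print(tokenizer.tokenize("<div> </div>"))  # HTML identifiers were added to the vocabulary manually
```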
Quick Start & Requirements
Create a conda environment (`conda create --name yayi_inference_env python=3.8`), activate it, and install the dependencies (`pip install transformers==4.33.1 torch==2.0.1 sentencepiece==0.1.99 accelerate==0.25.0`). The provided inference example uses `transformers` and requires a CUDA-enabled GPU (e.g., A100/A800).
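A minimal inference sketch along those lines, assuming the base weights are published as wenge-research/yayi2-30b on Hugging Face (a chat variant would substitute its own repo id and prompt format); the generation settings are illustrative, not taken from the official example.

```python
# Minimal generation sketch; repo id and sampling settings are assumptions,
# not values from the official inference example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "wenge-research/yayi2-30b"   # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,           # 30B parameters; needs an A100/A800-class GPU
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer("The winter in Beijing is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```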
Fine-tuning additionally requires `deepspeed`, `transformers`, `accelerate`, `flash-attn`, and `triton`. Full-parameter fine-tuning is recommended on 16x A100 (80G) or higher; LoRA fine-tuning is also supported (see the sketch below).
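For the LoRA path, a hedged sketch of what an adapter setup could look like with the peft library; peft is not listed in the requirements above, and the repository's own scripts may configure LoRA differently, so treat the module names and hyperparameters as placeholders.

```python
# Illustrative LoRA configuration via peft; target module names and ranks are
# assumptions, not values from the YAYI 2 fine-tuning scripts.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "wenge-research/yayi2-30b",            # assumed Hugging Face repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=8,                                   # low-rank adapter dimension
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # placeholder attention projection names
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only the adapter weights remain trainable
```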
Highlighted Details
The provided fine-tuning scripts run distributed multi-GPU training through `deepspeed`.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats