Chinese Llama-2 enhances Llama-2 for Chinese language tasks
This project enhances Llama-2's capabilities for the Chinese language, targeting researchers and developers working with Chinese NLP. It offers improved comprehension, generation, and translation by applying parameter-efficient fine-tuning (LoRA), full-parameter instruction fine-tuning, and continued pre-training.
How It Works
The project leverages three primary methods to adapt Llama-2 for Chinese: LoRA fine-tuning for parameter efficiency, full-parameter fine-tuning on Chinese instruction datasets (like BAAI/COIG) for deeper adaptation, and continued pre-training on large Chinese and English corpora to capture linguistic nuances. This multi-pronged approach aims to significantly boost Llama-2's performance on Chinese language tasks.
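As a sketch of the LoRA route, the snippet below attaches low-rank adapters to a Llama-2 base model using Hugging Face transformers and peft. The base checkpoint name, adapter rank, and target modules are illustrative assumptions, not the project's actual training configuration.

```python
# Minimal LoRA sketch with transformers + peft.
# Checkpoint name and LoRA hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only these adapter
# weights are trained, which is what makes the update parameter-efficient.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 7B parameters
```

After this step the wrapped model can be passed to a standard training loop or the transformers Trainer on a Chinese instruction dataset; full-parameter fine-tuning and continued pre-training follow the usual training recipes instead, without the adapter wrapper.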
Quick Start & Requirements
Clone the repository (git clone https://github.com/longyuewangdcu/chinese-llama-2.git), navigate into the directory (cd chinese-llama-2), and install the dependencies (pip install -e ./transformers, pip install -r requirements.txt). Training requires GPUs with bf16 support and potentially multi-node setups with NCCL. Flash Attention (v1.0.4) is recommended for full-parameter fine-tuning to reduce memory usage.
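For a rough idea of what bf16 usage looks like in practice, the snippet below loads a checkpoint in bfloat16 with transformers. The model path is a placeholder for whichever released weights you download, not a path shipped by this repository.

```python
# Sketch: load a Chinese Llama-2 checkpoint in bf16 and run a translation prompt.
# The model path is a placeholder, not an official checkpoint name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/chinese-llama-2-checkpoint"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32 on bf16-capable GPUs
    device_map="auto",
)

prompt = "请把下面的句子翻译成英文：今天天气很好。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```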
Highlighted Details
bf16 is used for efficiency.
Maintenance & Community
The project is associated with researchers from the University of Macau and Monash University. Contributions are welcomed via issues and pull requests.
Licensing & Compatibility
The code is licensed under Apache 2.0. Model weights are available for use, but users should verify the specific license terms of the base Llama-2 model for commercial or closed-source applications.
Limitations & Caveats
The project is under active development, with a "TODO" section listing continued pre-training and further fine-tuning releases. Availability of specific model checkpoints may depend on external download links.