Chinese-LlaMA2 by michael-wzhu

Chinese adaptation of Meta's LLaMA2

Created 2 years ago

742 stars

Top 46.8% on SourcePulse

Project Summary

This repository provides Chinese-adapted versions of Meta's Llama 2 large language model, addressing its limited native Chinese capabilities. It targets researchers and developers seeking to deploy or fine-tune Llama 2 for Chinese language tasks, offering improved conversational abilities and specialized domain models.

How It Works

The project offers two primary approaches: supervised fine-tuning (SFT) on existing Chinese instruction datasets, and continued pre-training on large Chinese corpora. For SFT, it uses datasets such as UltraChat and Chinese Alpaca, with variants that either extend the Llama 2 vocabulary or keep the original one. Continued pre-training aims to give the model deeper Chinese knowledge. Specialized models for the medical and Traditional Chinese Medicine domains are also under development.
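
To make the vocabulary-extension option more concrete, here is a minimal sketch in plain Python. It is not the project's code: the function name `extend_vocab`, the toy vocabulary, and the choice to initialise new embedding rows with the mean of the existing rows are all assumptions made for illustration (mean initialisation is one common heuristic; the repository may do something different).

```python
from statistics import fmean


def extend_vocab(vocab, embeddings, new_tokens):
    """Return copies of (vocab, embeddings) with unseen tokens appended.

    vocab:      dict mapping token -> row index in `embeddings`
    embeddings: list of equal-length float lists, one row per token
    new_tokens: iterable of candidate tokens (e.g. Chinese characters)
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    # Assumed heuristic: new rows start at the per-dimension mean of the
    # rows that existed before extension.
    mean_row = [fmean(column) for column in zip(*embeddings)]
    for token in new_tokens:
        if token in vocab:
            continue  # already covered by the original vocabulary
        vocab[token] = len(embeddings)
        embeddings.append(mean_row[:])
    return vocab, embeddings


# Toy example: a 2-token vocabulary with 2-dimensional embeddings.
base_vocab = {"<s>": 0, "a": 1}
base_embeddings = [[0.0, 2.0], [2.0, 4.0]]
new_vocab, new_embeddings = extend_vocab(
    base_vocab, base_embeddings, ["你", "好", "a"]
)
print(len(new_vocab))      # 4: "a" was already present
print(new_embeddings[2])   # [1.0, 3.0]
```

The original-vocabulary variant skips this step entirely, so the embedding table keeps its original size and every row keeps its pre-trained value.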

Quick Start & Requirements

Model Weights: Requires obtaining Llama 2 weights from Meta's official website. Download script: src/further_ft/download_checkpoints.py.
Fine-tuning: Refer to SFT-README.md.
Serving: Deployment via vLLM is recommended, with a reported ~2.7x speedup. See vllm-serving-README.
Quantization: Code available, referencing ChatGLM's quantization methods.
Hardware: Training commands suggest multi-GPU usage (e.g., CUDA_VISIBLE_DEVICES="2,3").
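
The CUDA_VISIBLE_DEVICES setting mentioned above can also be applied from Python when launching a job. The sketch below is illustrative only: `gpu_env` and `run_with_gpus` are hypothetical helpers, and the child command is a stand-in that echoes the variable rather than one of the project's actual training commands.

```python
import os
import subprocess
import sys


def gpu_env(device_ids, base_env=None):
    """Copy an environment and restrict it to the given physical GPU ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


def run_with_gpus(command, device_ids):
    """Run `command` in a child process that only sees `device_ids`."""
    return subprocess.run(
        command,
        env=gpu_env(device_ids),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in for a real training command: the child echoes the variable.
    echo = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])",
    ]
    print(run_with_gpus(echo, [2, 3]).stdout.strip())  # prints 2,3
```

Inside the child process, the two selected GPUs are re-indexed as devices 0 and 1, so training code does not need to know the physical ids.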

Highlighted Details

Offers both extended and original vocabulary versions of Chinese-Llama 2.
Provides LoRA weights and fully merged models for ease of use.
Includes Gradio demo code for interactive deployment.
Developing specialized models for medical and Traditional Chinese Medicine domains.
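
For readers unfamiliar with the difference between LoRA weights and a fully merged model, the following sketch shows the standard LoRA merge rule, W_merged = W + (alpha / r) * B A, on toy matrices in plain Python. It is an illustration under assumptions, not the project's merge script: `merge_lora` and `matmul` are hypothetical helper names, and real checkpoints would be merged with a tensor library rather than nested lists.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
        for row in x
    ]


def merge_lora(weight, lora_a, lora_b, alpha):
    """Fold a LoRA update into a base weight matrix.

    weight: d_out x d_in base matrix
    lora_a: r x d_in down-projection (A)
    lora_b: d_out x r up-projection (B)
    alpha:  LoRA scaling hyperparameter
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # d_out x d_in low-rank update
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example with d_out = d_in = 2 and rank r = 1.
base = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]
b = [[0.5], [1.0]]
merged = merge_lora(base, a, b, alpha=2)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```

A merged model can be loaded like any ordinary checkpoint, whereas standalone LoRA weights are much smaller but must be applied on top of the base Llama 2 weights at load time.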

Maintenance & Community

Developed by the Intelligent Knowledge Management and Service team at East China Normal University.
Updates were frequent during active development, with model versions (e.g., Chinese-LlaMA2-chat-sft-v0.3) released regularly; the last commit was 2 years ago.
WeChat group available for technical exchange (QR code validity noted as July 23rd).

Licensing & Compatibility

Llama 2 is described as "fully open-source and commercially usable" by Meta; the weights themselves are distributed under Meta's Llama 2 Community License. The project aims to adhere to these terms.

Limitations & Caveats

Models are described as having "preliminary" Chinese communication and task capabilities, with limited Chinese knowledge that is continuously being improved.
The impact of vocabulary extension on English AIGC capabilities is an open question.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days