Chinese-LlaMA2 by michael-wzhu

Chinese adaptation of Meta's LLaMA2

created 2 years ago
742 stars

Top 47.7% on sourcepulse

Project Summary

This repository provides Chinese-adapted versions of Meta's Llama 2 large language model, addressing its limited native Chinese capabilities. It targets researchers and developers seeking to deploy or fine-tune Llama 2 for Chinese language tasks, offering improved conversational abilities and specialized domain models.

How It Works

The project takes two approaches: supervised fine-tuning (SFT) on existing Chinese instruction datasets, and continued pre-training on large Chinese corpora. SFT uses datasets such as UltraChat and Chinese Alpaca, and is offered both with an extended Llama 2 vocabulary and with the original vocabulary. Continued pre-training aims to give the model deeper Chinese knowledge. Specialized models for the medical and Traditional Chinese Medicine domains are also under development.
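To illustrate why the choice between the extended and original vocabulary matters, here is a minimal sketch. It is not the project's tokenizer: it is a toy greedy longest-match tokenizer that falls back to one token per UTF-8 byte when a character is not in the vocabulary, similar in spirit to the byte fallback used by SentencePiece-style tokenizers. Both vocabularies and the `tokenize` function are hypothetical and exist only for this example.

```python
from typing import List, Set


def tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Greedy longest-match tokenization with UTF-8 byte fallback (toy example)."""
    max_len = max((len(tok) for tok in vocab), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Nothing matched: emit one token per UTF-8 byte of this character.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical vocabularies, for illustration only.
original_vocab = {"hello", " ", "world"}
extended_vocab = original_vocab | {"中文", "模型"}

sample = "中文模型"
print(len(tokenize(sample, original_vocab)))  # 12 (4 characters x 3 bytes each)
print(len(tokenize(sample, extended_vocab)))  # 2
```

In this toy setting the extended vocabulary encodes the same Chinese text in far fewer tokens, which is the usual motivation for vocabulary extension. The new tokens need newly trained embeddings, which is one reason the effect on existing English capabilities is worth checking.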

Quick Start & Requirements

  • Model Weights: Requires obtaining Llama 2 weights from Meta's official website. Download script: src/further_ft/download_checkpoints.py.
  • Fine-tuning: Refer to SFT-README.md.
  • Serving: Deployment via vLLM is recommended, for a reported ~2.7x speedup. See vllm-serving-README.
  • Quantization: Code available, referencing ChatGLM's quantization methods.
  • Hardware: Training commands suggest multi-GPU usage (e.g., CUDA_VISIBLE_DEVICES="2,3").
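The multi-GPU hint above can be sketched as follows. This is a minimal example, assuming a launcher written in Python: `build_training_env` is a hypothetical helper, and `train.py` is a placeholder, since the actual script and arguments are documented in SFT-README.md.

```python
import os
from typing import Dict, List


def build_training_env(gpu_ids: List[int]) -> Dict[str, str]:
    """Return a copy of the current environment with only the given GPUs visible."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return env


env = build_training_env([2, 3])
print(env["CUDA_VISIBLE_DEVICES"])  # 2,3

# Hypothetical launch; replace "train.py" with the command from SFT-README.md.
# import subprocess
# subprocess.run(["python", "train.py"], env=env, check=True)
```

Inside the launched process, CUDA renumbers the visible devices from 0, so physical GPUs 2 and 3 appear as devices 0 and 1.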

Highlighted Details

  • Offers both extended and original vocabulary versions of Chinese-Llama 2.
  • Provides LoRA weights and fully merged models for ease of use.
  • Includes Gradio demo code for interactive deployment.
  • Developing specialized models for medical and Traditional Chinese Medicine domains.
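For readers unfamiliar with the difference between LoRA weights and a merged model, the sketch below shows the standard LoRA merge, W' = W + (alpha / r) * B * A, on tiny matrices in plain Python. It illustrates the general technique only. The repository's own merge procedure is not described here, and all names and values in the example are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w: Matrix, lora_a: Matrix, lora_b: Matrix,
               alpha: float, rank: int) -> Matrix:
    """Fold a low-rank update into the base weight: W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # same shape as w
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy values: a 2x2 base weight and a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]     # shape (rank, d_in) = (1, 2)
lora_b = [[0.5], [0.25]]  # shape (d_out, rank) = (2, 1)

merged = merge_lora(w, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

A merged model stores only the resulting weights, so it loads like any ordinary checkpoint. Keeping the LoRA weights separate results in a much smaller download, but it requires the base Llama 2 weights at load time.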

Maintenance & Community

  • Developed by the Intelligent Knowledge Management and Service team at East China Normal University.
  • Updates are frequent, with model versions (e.g., Chinese-LlaMA2-chat-sft-v0.3) released regularly.
  • A WeChat group is available for technical exchange (the QR code's validity date is given as July 23rd).

Licensing & Compatibility

  • Llama 2 is described as "fully open-source and commercially usable" by Meta. The project aims to adhere to these terms.

Limitations & Caveats

  • Models are described as having "preliminary" Chinese communication and task capabilities, with limited Chinese knowledge that is continuously being improved.
  • The impact of vocabulary extension on English AIGC capabilities is an open question.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
