Chinese-LLaMA-Alpaca by ymcui

Chinese LLaMA & Alpaca: LLMs for Chinese NLP research

Created 2 years ago · 18,889 stars · Top 2.4% on sourcepulse

Project Summary

This project provides open-source Chinese LLaMA and Alpaca large language models for Chinese NLP research. It addresses the weak Chinese understanding and instruction-following of the original LLaMA by expanding its vocabulary with Chinese tokens, further pre-training on Chinese corpora, and fine-tuning on Chinese instruction datasets. The models are suitable for researchers and developers working on Chinese NLP tasks.

How It Works

The project extends the original LLaMA models with a larger Chinese vocabulary, making encoding and decoding of Chinese text more efficient. Chinese LLaMA models are further pre-trained on large Chinese corpora, while Chinese Alpaca models additionally undergo instruction fine-tuning on Chinese instruction datasets. This significantly improves the models' ability to understand and follow instructions in Chinese, yielding ChatGPT-style interactive behavior for Chinese language tasks.
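
To make the vocabulary expansion concrete, the sketch below compares how many tokens the same Chinese sentence costs under the original LLaMA tokenizer versus the extended Chinese tokenizer. The model paths are placeholders for wherever the weights live locally, not paths defined by the project.

```python
# Rough sketch: token-count comparison between the original LLaMA tokenizer
# and the extended Chinese tokenizer. Both paths below are placeholders.
from transformers import LlamaTokenizer

text = "人工智能是计算机科学的一个分支。"  # "AI is a branch of computer science."

original = LlamaTokenizer.from_pretrained("path/to/original-llama-7b")
extended = LlamaTokenizer.from_pretrained("path/to/chinese-llama-7b")

# The original tokenizer falls back to byte-level pieces for most Chinese
# characters, so the same sentence costs far more tokens.
print(len(original.tokenize(text)))
print(len(extended.tokenize(text)))
```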

Quick Start & Requirements

  • Installation: Models are released as LoRA weights and must be merged with the original LLaMA weights before use. Instructions for merging and deployment are in the project's wiki; a generic merging sketch follows this list.
  • Prerequisites: Requires original LLaMA model weights (application needed), Python, and potentially CUDA for GPU acceleration.
  • Resources: Merged models can be quantized to 4-bit for local CPU/GPU deployment, with sizes ranging from 3.9 GB (7B) to 17.2 GB (33B).
  • Links: the Model Downloads, Merging Models, and Local Deployment pages in the project wiki.
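
The project ships its own merge scripts (see the Merging Models wiki page). As a rough illustration of what the merge step does, here is a generic 🤗 peft sketch; all paths are placeholders, and the assumption that the extended tokenizer ships alongside the LoRA weights follows common LoRA-release convention rather than anything project-specific.

```python
# Generic illustration of merging LoRA weights into a base LLaMA model with
# 🤗 peft. Not the project's own script; all paths are placeholders.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained(
    "path/to/original-llama-7b", torch_dtype=torch.float16
)
# Assumption: the LoRA package also contains the extended Chinese tokenizer.
tokenizer = LlamaTokenizer.from_pretrained("path/to/chinese-alpaca-lora-7b")

merged = PeftModel.from_pretrained(base, "path/to/chinese-alpaca-lora-7b")
merged = merged.merge_and_unload()  # fold the LoRA deltas into the base weights

merged.save_pretrained("path/to/chinese-alpaca-7b-merged")
tokenizer.save_pretrained("path/to/chinese-alpaca-7b-merged")
```

The merged directory can then be loaded like any standalone model, or converted and quantized (e.g., to 4-bit via llama.cpp) for local CPU/GPU deployment.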

Highlighted Details

  • Offers multiple model sizes (7B, 13B, 33B) in both base (LLaMA) and instruction-tuned (Alpaca) variants, with "Pro" versions addressing the short-reply issue of earlier releases.
  • Supports integration with popular frameworks including 🤗transformers, llama.cpp, text-generation-webui, LangChain, and privateGPT; a minimal 🤗transformers inference sketch follows this list.
  • Provides training scripts for pre-training and instruction fine-tuning, allowing users to further train models.
  • Achieved competitive results on the C-Eval benchmark for Chinese language understanding.
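
For the 🤗transformers integration, a minimal inference sketch over a merged model might look like the following. The model path is a placeholder, and real Chinese-Alpaca usage typically wraps the prompt in an Alpaca-style instruction template (see the project's deployment docs); that template is omitted here for brevity.

```python
# Minimal inference sketch with 🤗 transformers over a merged Chinese-Alpaca
# model. Path is a placeholder; device_map="auto" requires accelerate.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "path/to/chinese-alpaca-7b-merged"
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(
    path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "请简要介绍中国的四大发明。"  # "Briefly introduce China's four great inventions."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```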

Maintenance & Community

  • The project is actively maintained, with recent releases including Llama-3-based models (Chinese-LLaMA-3-8B, Llama-3-Chinese-8B-Instruct).
  • Community discussions are available via GitHub Issues and Discussions.

Licensing & Compatibility

  • The project states that the original LLaMA weights are prohibited from commercial use and that its LoRA weights are for academic research only; neither may be used commercially.

Limitations & Caveats

  • Models may generate unpredictable, harmful, or biased content.
  • Due to compute and data constraints, training was not exhaustive, and the models' Chinese understanding still has room for improvement.
  • No interactive online demo is currently available; local deployment is necessary.
Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
Star History

  • 150 stars in the last 90 days
