Chinese-LLaMA-Alpaca-3 by ymcui

Chinese Llama-3, instruction-tuned LLMs

created 1 year ago
1,928 stars

Top 23.1% on sourcepulse

Project Summary

This project provides Chinese-adapted versions of Meta's Llama-3 models, including base and instruction-tuned variants. It aims to enhance Chinese language understanding and instruction following capabilities for researchers and developers working with large language models in Chinese.

How It Works

The project builds upon Meta's Llama-3 architecture, leveraging its 8K context window and Grouped Query Attention (GQA). Instead of expanding the vocabulary, it uses Llama-3's native 128K BPE tokenizer, whose encoding efficiency on Chinese text proved comparable to the project's earlier Chinese-specific tokenizers. The instruction-tuned models are fine-tuned on curated Chinese instruction datasets, including Alpaca-style, STEM, and Q&A data, with successive releases (v1–v3) showing measurable gains on Chinese benchmarks.
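
Since the instruct models inherit Llama-3's chat format, prompts must follow its header-token layout. A minimal sketch of that layout (in practice, `tokenizer.apply_chat_template` in transformers produces this for you; the snippet below hand-assembles it only to make the structure visible):

```python
# Sketch of the single-turn Llama-3 chat prompt layout that the
# instruct models expect. The special tokens are Llama-3's documented
# chat-template markers; normally tokenizer.apply_chat_template emits them.

def build_llama3_prompt(system: str, user: str) -> str:
    """Hand-assemble a single-turn Llama-3 chat prompt."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "你好，请介绍一下你自己。")
```

Generation should stop on `<|eot_id|>`, which is why deployment tools (llama.cpp, Ollama, vLLM) are typically configured with it as a stop token.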

Quick Start & Requirements

  • Installation: Models are available via Hugging Face, ModelScope, and wisemodel. Deployment is supported through transformers, llama.cpp, text-generation-webui, vLLM, and Ollama.
  • Prerequisites: Varies by deployment method; llama.cpp supports CPU and GPU inference, while transformers and vLLM typically require a GPU. No specific CUDA version is mandated, though a recent CUDA toolkit is recommended for GPU acceleration.
  • Resources: Full models require significant VRAM (e.g., 8B models typically need >16GB VRAM for full precision). Quantized versions (GGUF) significantly reduce VRAM requirements for CPU/GPU inference.
  • Links: Hugging Face, ModelScope, Online Demo
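
The VRAM figures above follow from simple arithmetic on parameter count and precision. A back-of-envelope sketch (weights only; KV cache, activations, and framework overhead add more on top, and the 4.5 bits/weight figure is an illustrative assumption for a typical 4-bit GGUF quant):

```python
# Rough VRAM estimate for model weights alone.
# Activations, KV cache, and runtime overhead are NOT included.

def weight_vram_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB at a given precision."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

fp16 = weight_vram_gib(8, 16)   # ~14.9 GiB: why full-precision 8B wants a >16 GB card
q4   = weight_vram_gib(8, 4.5)  # ~4.2 GiB: a 4-bit GGUF fits modest GPUs or CPU RAM
```

This is why the quantized GGUF releases are the practical choice for consumer hardware.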

Highlighted Details

  • Performance: Instruct-v3 model achieves 83.6% win rate on the LLM Arena and scores 55.2/54.8 on C-Eval (valid/test), outperforming previous versions and Meta's Llama-3-8B-Instruct on Chinese benchmarks.
  • Long Context: Supports an 8K context window natively, with potential for further extension using methods like YaRN.
  • Training Scripts: Open-sourced pre-training and instruction fine-tuning scripts allow for further customization and development.
  • Quantization: Offers GGUF quantized models for efficient local deployment via llama.cpp and compatible tools.
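
For the Ollama deployment path, a GGUF quant can be registered with a Modelfile. A minimal hypothetical example (the filename is a placeholder for whichever quant you download; the template markers are Llama-3's standard chat tokens, which Ollama fills in at runtime):

```
# Hypothetical Ollama Modelfile for a local GGUF quant of the instruct model.
FROM ./llama-3-chinese-8b-instruct-q4_0.gguf

# Llama-3 chat template; Ollama substitutes .Prompt at runtime.
TEMPLATE """<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""
PARAMETER stop "<|eot_id|>"
```

Registered and run with `ollama create llama3-zh -f Modelfile` followed by `ollama run llama3-zh`.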

Maintenance & Community

The project is actively maintained, with regular updates to instruction-tuned models (v1, v2, v3). Community interaction is encouraged via GitHub Issues and Discussions.

Licensing & Compatibility

The models are developed based on Meta's Llama-3, and users must adhere to the Llama-3 license agreement. The project itself does not impose additional restrictions beyond those of the base Llama-3 model.

Limitations & Caveats

The project's models are fine-tuned versions of Llama-3 and inherit its base capabilities and limitations. Performance on specific downstream tasks may vary, and users are encouraged to evaluate the models on their target applications. The README notes that the instruction models may occasionally identify themselves as ChatGPT, a likely artifact of the instruction-tuning data.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 34 stars in the last 90 days
