Chinese-LLaMA-Alpaca-2 by ymcui

Chinese LLaMA/Alpaca-2: LLMs with long context for Chinese language

Created 2 years ago · 7,169 stars · Top 7.3% on sourcepulse

Project Summary

This project provides Chinese-centric Large Language Models (LLMs) built on Meta's Llama-2, offering both base (Chinese-LLaMA-2) and instruction-tuned (Chinese-Alpaca-2) variants. It targets developers and researchers who need stronger Chinese language understanding and generation, including support for extended context lengths of up to 64K tokens.

How It Works

The models are built upon Llama-2, with an expanded Chinese vocabulary and incremental pre-training on large-scale Chinese corpora. Key techniques include FlashAttention-2 for efficient training, plus context extension via Position Interpolation (PI) for the 16K models and YaRN for the 64K models. Instruction-tuned models are further refined with RLHF for better alignment with human preferences and values.
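
As a rough illustration of the context-extension mechanism, the snippet below sketches how linear RoPE scaling (the idea behind Position Interpolation) can be requested when loading a Llama-2-family checkpoint with Hugging Face transformers. The model ID and scaling factor are illustrative assumptions, not the project's published configuration; the released 16K/64K checkpoints ship with their own scaling settings, so consult the project Wiki before overriding anything.

    # Sketch only: linear RoPE scaling ("Position Interpolation") applied as a
    # transformers config override. Model ID and factor are assumptions.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "hfl/chinese-alpaca-2-7b"  # assumed Hugging Face model ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",
        # Stretch RoPE position indices by `factor`: a model pre-trained at a
        # 4K context can then attend over roughly 4 x 4K = 16K tokens.
        rope_scaling={"type": "linear", "factor": 4.0},
    )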

Quick Start & Requirements

  • Installation: Models are available via Hugging Face Transformers, with GGUF versions for llama.cpp (see the loading sketch after this list).
  • Prerequisites: Python, PyTorch, Hugging Face libraries. Specific hardware requirements depend on model size (e.g., 7B models require ~13GB VRAM for FP16).
  • Resources: Full models range from 2.4GB (1.3B) to 24.7GB (13B). LoRA weights are significantly smaller.
  • Documentation: Detailed guides for pre-training, fine-tuning, and deployment are available on the project's GitHub Wiki.
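
For orientation, here is a minimal inference sketch using Hugging Face transformers. The model ID follows the project's Hugging Face releases but should be verified against them, as should the recommended prompt template on the Wiki; device_map="auto" additionally requires the accelerate package.

    # Minimal inference sketch, assuming the hfl/chinese-alpaca-2-7b
    # checkpoint on Hugging Face; verify the exact ID and prompt template.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "hfl/chinese-alpaca-2-7b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # ~13GB VRAM for the 7B model in FP16
        device_map="auto",          # requires the accelerate package
    )

    # "List China's Four Great Inventions."
    prompt = "请列举中国的四大发明。"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))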

Highlighted Details

  • Offers models with standard 4K context and extended 16K/64K context lengths.
  • Includes RLHF-tuned variants for improved value alignment.
  • Supports integration with popular tools such as transformers, llama.cpp, text-generation-webui, and LangChain (see the GGUF sketch after this list).
  • Provides extensive benchmarks on C-Eval, CMMLU, and LongBench, showcasing performance across various tasks.
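
To illustrate the llama.cpp route, the sketch below uses llama-cpp-python, the Python bindings for llama.cpp, to run a quantized GGUF build. The file name is a placeholder assumption; download an actual GGUF file from the project's releases and adjust the path.

    # Sketch: local inference on a quantized GGUF build via llama-cpp-python.
    # The model file name below is a placeholder assumption.
    from llama_cpp import Llama

    llm = Llama(
        model_path="chinese-alpaca-2-7b.Q4_K_M.gguf",  # assumed file name
        n_ctx=4096,  # context window; pick a 16K/64K variant for long inputs
    )
    # "Describe the giant panda in one sentence."
    out = llm("请用一句话介绍大熊猫。", max_tokens=128)
    print(out["choices"][0]["text"])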

Maintenance & Community

The project is actively maintained, with recent updates including support for Llama-3 based models (Chinese-LLaMA-Alpaca-3). Community interaction is encouraged via GitHub Issues and Discussions.

Licensing & Compatibility

The models are based on Llama-2 and inherit the Llama 2 Community License, which permits commercial use subject to its conditions. Users must adhere to the Llama-2 license terms.

Limitations & Caveats

The models may generate unpredictable or undesirable content. Because of compute and data constraints, training is not exhaustive, and Chinese understanding still has room for improvement. No interactive online demo is provided, so testing requires local deployment.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 47 stars in the last 90 days

Explore Similar Projects

LLaMA-Adapter by OpenGVLab (6k stars)
Efficient fine-tuning for instruction-following LLaMA models. Created 2 years ago, updated 1 year ago. Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 3 more.

alpaca-lora by tloen (19k stars)
LoRA fine-tuning for LLaMA. Created 2 years ago, updated 1 year ago. Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (author of SGLang), and 9 more.