Chinese LLaMA/Alpaca-2: long-context LLMs for the Chinese language
This project provides Chinese-centric Large Language Models (LLMs) based on Meta's Llama-2, offering both foundational (Chinese-LLaMA-2) and instruction-tuned (Chinese-Alpaca-2) variants. It targets developers and researchers who need stronger Chinese language understanding and generation, including long-context variants supporting up to 64K tokens.
How It Works
The models are built upon Llama-2, featuring an expanded Chinese vocabulary and incremental pre-training on large-scale Chinese datasets. Training uses FlashAttention-2 for efficiency, while context-extension techniques, Position Interpolation (PI) for the 16K models and YaRN for the 64K models, stretch Llama-2's native 4K window. Instruction-tuned models are further refined with RLHF for better alignment with human preferences and values.
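A rough sketch of what Position Interpolation looks like at inference time, using the linear RoPE scaling option in Hugging Face transformers; the checkpoint id and scaling factor below are illustrative assumptions rather than the project's documented settings:

```python
# Minimal sketch, not the project's documented recipe: load a long-context
# checkpoint with linear RoPE scaling (Position Interpolation).
# The model id and scaling factor are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-llama-2-7b-16k"  # assumed 16K checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    # Position Interpolation stretches RoPE positions by a constant factor;
    # a factor of 4.0 maps Llama-2's native 4K window to roughly 16K.
    rope_scaling={"type": "linear", "factor": 4.0},
)
```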
Quick Start & Requirements
Highlighted Details
Compatible with downstream ecosystems including transformers, llama.cpp, text-generation-webui, and LangChain (see the sketch below).
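As a quick illustration of the transformers path (the checkpoint id and the Llama-2-chat-style prompt wrapping are assumptions; the project ships its own inference scripts and documented prompt template):

```python
# Illustrative inference sketch; the checkpoint id and prompt wrapping are
# assumptions, so prefer the project's own scripts and template for real use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-alpaca-2-7b"  # assumed instruction-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Llama-2-chat-style instruction wrapping (assumed to match Alpaca-2).
# Prompt: "Introduce large language models in three sentences."
prompt = "[INST] 请用三句话介绍一下大语言模型。 [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```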
Maintenance & Community
The project is actively maintained, with recent updates including the Llama-3-based follow-up project, Chinese-LLaMA-Alpaca-3. Community interaction happens via GitHub Issues and Discussions.
Licensing & Compatibility
The models inherit the Llama 2 Community License, which permits commercial use subject to Meta's terms and acceptable-use policy; users must comply with those terms when deploying or redistributing the models.
Limitations & Caveats
The models may still generate unpredictable or undesirable content, and because of compute and data constraints their Chinese understanding leaves room for improvement. No interactive online demo is provided, so evaluation requires deploying the models locally.