Chinese Llama-3, instruction-tuned LLMs
Top 23.1% on sourcepulse
This project provides Chinese-adapted versions of Meta's Llama-3 models, including base and instruction-tuned variants. It aims to improve Chinese language understanding and instruction-following for researchers and developers working with large language models in Chinese.
How It Works
The project builds upon Meta's Llama-3 architecture, leveraging its 8K context window and Grouped Query Attention (GQA). Instead of expanding the vocabulary, it utilizes Llama-3's native 128K BPE tokenizer, finding its encoding efficiency comparable to previous Chinese-specific tokenizers. The instruction-tuned models are fine-tuned on curated Chinese instruction datasets, including Alpaca-style, STEM, and Q&A data, with recent versions showing significant improvements in benchmarks.
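The encoding-efficiency claim above can be checked directly. The sketch below is illustrative, not from the project: the helper counts how many Chinese characters each token covers on average, and the commented usage assumes Meta's reference checkpoint name (access to it is gated by the Llama-3 license).

```python
def chars_per_token(tokenizer, text: str) -> float:
    """Average number of characters covered by one token; a higher
    value means the tokenizer encodes the text more compactly."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    return len(text) / len(ids)

# Usage with the Llama-3 BPE tokenizer (requires `transformers`,
# network access, and acceptance of the Llama-3 license):
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
# print(chars_per_token(tok, "大型语言模型的中文理解能力"))
```

Comparing this ratio against a Chinese-specific tokenizer on the same text is one simple way to reproduce the project's finding that vocabulary expansion is unnecessary.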
Quick Start & Requirements
The models can be deployed with transformers, llama.cpp, text-generation-webui, vLLM, and Ollama.
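For the transformers path, a minimal generation sketch looks like the following. The prompt-building helper hand-rolls Llama-3's instruct chat format for illustration; in practice `tokenizer.apply_chat_template` handles this, and the checkpoint path below is a placeholder for whichever Chinese-adapted instruct model you download.

```python
def build_llama3_prompt(user_message: str) -> str:
    """Wrap a single user turn in Llama-3's instruct chat format.
    (In practice, prefer tokenizer.apply_chat_template.)"""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Usage (requires `transformers`, `torch`, a GPU, and a downloaded
# instruct checkpoint -- the name below is a placeholder):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# path = "path/to/chinese-llama-3-instruct"
# tok = AutoTokenizer.from_pretrained(path)
# model = AutoModelForCausalLM.from_pretrained(path, device_map="auto")
# inputs = tok(build_llama3_prompt("用中文介绍一下你自己。"),
#              return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=256)
# print(tok.decode(out[0][inputs["input_ids"].shape[-1]:],
#                  skip_special_tokens=True))
```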
llama.cpp supports both CPU and GPU inference, while transformers and vLLM typically require GPUs. No specific CUDA version is mandated, but CUDA is generally recommended for GPU acceleration.
Highlighted Details
Quantized model files are available for llama.cpp and compatible tools.
Maintenance & Community
The project is actively maintained, with regular updates to instruction-tuned models (v1, v2, v3). Community interaction is encouraged via GitHub Issues and Discussions.
Licensing & Compatibility
The models are developed based on Meta's Llama-3, and users must adhere to the Llama-3 license agreement. The project itself does not impose additional restrictions beyond those of the base Llama-3 model.
Limitations & Caveats
The project's models are fine-tuned versions of Llama-3 and inherit its base capabilities and limitations. Performance on specific downstream tasks may vary, and users are encouraged to test the models on their target applications. The README notes that the instruction-tuned models may occasionally identify themselves as ChatGPT, a known fine-tuning data artifact.