Large language model for Chinese language tasks
Llama3-Chinese is a large language model fine-tuned from Meta's Llama-3-8B base model, specifically for Chinese language understanding and generation. It targets researchers and developers working with Chinese NLP tasks, offering improved performance on Chinese conversational data.
How It Works
The model is trained with DoRA and LoRA+ techniques on a substantial dataset comprising 500k high-quality Chinese multi-turn SFT examples, 100k English multi-turn SFT examples, and 2k single-turn self-cognition examples. This approach aims to enhance the model's proficiency in Chinese dialogue and self-awareness while building upon the robust foundation of Llama-3.
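Both DoRA and LoRA+ build on the LoRA idea of expressing the weight update as a low-rank product that is added to the frozen base weights. As a rough illustration (not the project's actual training code), a minimal pure-Python sketch of the LoRA merge W' = W + (alpha/r)·(B @ A):

```python
def matmul(A, B):
    # Naive matrix multiply over nested lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_merge(W, A, B, alpha, r):
    # W' = W + (alpha / r) * (B @ A), where B is d×r and A is r×k.
    # In real fine-tuning only A and B are trained; W stays frozen.
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: 2×2 base weight with rank-1 adapters.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]       # d×r
A = [[3.0, 4.0]]         # r×k
W_adapted = lora_merge(W, A, B, alpha=2.0, r=1)
# delta = [[3, 4], [6, 8]], scale = 2 → W' = [[7, 8], [12, 17]]
```

DoRA additionally decomposes each weight into magnitude and direction components, and LoRA+ uses separate learning rates for the A and B matrices; both keep this same low-rank update structure.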
Quick Start & Requirements
Requires transformers, torch, and git-lfs. A GPU with sufficient VRAM is recommended for inference.
Highlighted Details
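Since the model is fine-tuned from Llama-3-8B, inference follows Llama-3's chat format. A minimal sketch of that prompt layout (the helper name is illustrative; in practice you would load the model with transformers' AutoTokenizer/AutoModelForCausalLM and let tokenizer.apply_chat_template assemble this for you):

```python
def build_llama3_prompt(messages):
    """Assemble a Llama-3-style chat prompt by hand.

    messages: list of (role, content) pairs, e.g. [("user", "你好")].
    This mirrors what tokenizer.apply_chat_template produces for Llama-3.
    """
    parts = ["<|begin_of_text|>"]
    for role, content in messages:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([("user", "你好")])
```

The resulting string is what gets tokenized and fed to the model for generation.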
Inference is supported via transformers, the CLI, and vLLM.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The model weights and data are strictly for research purposes and cannot be used commercially. Users must adhere to the licensing agreement and provide proper attribution.