llama3-chinese by seanzhang-zhichen

Large language model for Chinese language tasks

Created 1 year ago

295 stars

Top 89.8% on SourcePulse

View on GitHub

1 Expert Loves This Project

Yaowei Zheng

Author of LLaMA-Factory

Project Summary

Llama3-Chinese is a large language model fine-tuned from Meta's Llama-3-8B base model, specifically for Chinese language understanding and generation. It targets researchers and developers working with Chinese NLP tasks, offering improved performance on Chinese conversational data.

How It Works

The model leverages DORA and LORA+ training techniques on a substantial dataset comprising 500k high-quality Chinese multi-turn SFT data, 100k English multi-turn SFT data, and 2k single-turn self-cognition data. This approach aims to enhance the model's proficiency in Chinese dialogue and self-awareness while building upon the robust foundation of Llama-3.

Quick Start & Requirements

Installation: Download model weights from HuggingFace or ModelScope. Merging LoRA weights is optional.
Prerequisites: Python, transformers, torch, git-lfs. GPU with sufficient VRAM is recommended for inference.
Resources: Requires downloading base model (Llama-3-8B) and fine-tuned weights.
Links: HuggingFace, ModelScope

Highlighted Details

Fine-tuned on 500k Chinese multi-turn SFT data.
Utilizes DORA + LORA+ training methods.
Offers merged and LoRA-only model weights.
Supports inference via transformers, CLI, and vLLM.

Maintenance & Community

Developed by seanzhang-zhichen.
Based on Meta's Llama-3.
Acknowledgement of hiyouga/LLaMA-Factory.

Licensing & Compatibility

Code License: Apache License 2.0 (commercial use permitted).
Model/Data License: Research purposes only. Commercial use of model weights and data is restricted. Requires attribution.

Limitations & Caveats

The model weights and data are strictly for research purposes and cannot be used commercially. Users must adhere to the licensing agreement and provide proper attribution.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

0 stars in the last 30 days