llama3-chinese  by seanzhang-zhichen

Large language model for Chinese language tasks

created 1 year ago
294 stars

Top 90.9% on sourcepulse

GitHubView on GitHub
Project Summary

Llama3-Chinese is a large language model fine-tuned from Meta's Llama-3-8B base model, specifically for Chinese language understanding and generation. It targets researchers and developers working with Chinese NLP tasks, offering improved performance on Chinese conversational data.

How It Works

The model leverages DORA and LORA+ training techniques on a substantial dataset comprising 500k high-quality Chinese multi-turn SFT data, 100k English multi-turn SFT data, and 2k single-turn self-cognition data. This approach aims to enhance the model's proficiency in Chinese dialogue and self-awareness while building upon the robust foundation of Llama-3.

Quick Start & Requirements

  • Installation: Download model weights from HuggingFace or ModelScope. Merging LoRA weights is optional.
  • Prerequisites: Python, transformers, torch, git-lfs. GPU with sufficient VRAM is recommended for inference.
  • Resources: Requires downloading base model (Llama-3-8B) and fine-tuned weights.
  • Links: HuggingFace, ModelScope

Highlighted Details

  • Fine-tuned on 500k Chinese multi-turn SFT data.
  • Utilizes DORA + LORA+ training methods.
  • Offers merged and LoRA-only model weights.
  • Supports inference via transformers, CLI, and vLLM.

Maintenance & Community

  • Developed by seanzhang-zhichen.
  • Based on Meta's Llama-3.
  • Acknowledgement of hiyouga/LLaMA-Factory.

Licensing & Compatibility

  • Code License: Apache License 2.0 (commercial use permitted).
  • Model/Data License: Research purposes only. Commercial use of model weights and data is restricted. Requires attribution.

Limitations & Caveats

The model weights and data are strictly for research purposes and cannot be used commercially. Users must adhere to the licensing agreement and provide proper attribution.

Health Check
Last commit

1 year ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
2 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
3 more.

LLaMA-Adapter by OpenGVLab

0.0%
6k
Efficient fine-tuning for instruction-following LLaMA models
created 2 years ago
updated 1 year ago
Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng Ying Sheng(Author of SGLang), and
9 more.

alpaca-lora by tloen

0.0%
19k
LoRA fine-tuning for LLaMA
created 2 years ago
updated 1 year ago
Feedback? Help us improve.