llama3-Chinese-chat by CrazyBoyM

Chinese Llama3 fine-tunes for chat, tutorials, and deployment

Created 1 year ago

4,158 stars

Top 11.7% on SourcePulse

Project Summary

This repository provides fine-tuned versions of Llama 3 and Llama 3.1 models specifically for Chinese language tasks. It caters to researchers and developers looking to leverage or build upon Llama 3 for Chinese NLP applications, offering pre-trained weights, tutorials for training, inference, evaluation, and deployment.

How It Works

The project fine-tunes Llama 3 base models using large, high-quality Chinese conversational datasets. It employs various techniques including Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to enhance the models' performance in Chinese dialogue, instruction following, and specific tasks. The approach prioritizes leveraging the existing strong multilingual capabilities of Llama 3 and augmenting them with targeted Chinese data, rather than expanding the vocabulary.

Quick Start & Requirements

Installation: Primarily through Hugging Face transformers library, or via Ollama (ollama run shareai/llama3.1-dpo-zh).
Prerequisites: Python, transformers, torch, peft, bitsandbytes (for quantization). GPU recommended for inference (e.g., 10GB VRAM for 4-bit quantization, 24GB for FP16).
Resources: FP16 inference requires ~16GB VRAM; 4-bit quantization ~8GB VRAM. Training costs vary significantly based on method (QLoRA 4-bit: ~6GB VRAM for 7B models).
Links:
- Ollama: https://ollama.com/
- LMStudio: https://github.com/CrazyBoyM/llama3-Chinese-chat/blob/main/deploy/LMStudio/README.md
- vLLM: https://github.com/CrazyBoyM/llama3-Chinese-chat/tree/main/deploy/vLLM
- API Deployment: https://github.com/CrazyBoyM/llama3-Chinese-chat/tree/main/deploy/API

Highlighted Details

Offers multiple fine-tuned versions: SFT, DPO, and specialized models for coding, agents, and longer contexts.
Provides comprehensive tutorials for training, inference (local CPU/GPU, vLLM, LMStudio, Ollama), and deployment.
Includes a curated list of Chinese NLP datasets and training tools (Firefly, LLaMA-Factory, Unsloth).
Demonstrates methods for extending context length (e.g., 32K, 96K) with minimal performance degradation.

Maintenance & Community

Active development with frequent updates, including Llama 3.1 Chinese DPO versions.
Community engagement encouraged via GitHub Issues and QQ groups for data sharing and technical discussion.
Bilibili channel for video tutorials.

Licensing & Compatibility

Models are typically released under permissive licenses allowing commercial use, but specific model cards should be checked. The base Llama 3 license applies.

Limitations & Caveats

The project focuses on fine-tuning existing Llama 3 models; it does not modify the base model's architecture or vocabulary.
Performance claims for specific fine-tuned versions are based on community benchmarks and self-evaluation.
Some specialized models (e.g., NSFW, role-playing) are in development or experimental stages.

Health Check

Last Commit

5 days ago

Responsiveness

1 day

Pull Requests (30d)

Issues (30d)

Star History

5 stars in the last 30 days