LLM-Tuning by beyondguo

SDK for LLM tuning and Sample Design Engineering (SDE)

created 2 years ago
1,009 stars

Top 37.7% on sourcepulse

View on GitHub
Project Summary

This repository provides tools and tutorials for efficiently fine-tuning Large Language Models (LLMs) using Sample Design Engineering (SDE). It targets developers and researchers aiming to improve LLM performance on downstream tasks with minimal data and computational resources. The core contribution is the SDE methodology, which empirically identifies effective sample design strategies for fine-tuning.

How It Works

The project introduces Sample Design Engineering (SDE) as a systematic approach to optimize fine-tuning datasets. It explores various sample design strategies, uncovering patterns consistent across different LLMs. The ES-SDE approach integrates the most effective options, demonstrating superiority over baseline methods in empirical studies. This method focuses on the quality and structure of training samples rather than solely on model architecture or training algorithms.
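
To make the idea concrete, the sketch below shows two ways the same annotated example might be packaged into a training sample, varying instruction placement and target format. The function and option names are illustrative only; they are not the repo's actual implementation.

```python
# Illustrative sketch of the kind of sample-design choices SDE studies.
# Option names and formats are hypothetical, not the repo's exact identifiers.
import json

def build_sample(instruction, text, labels, instruction_first=True, output_format="lines"):
    """Package one annotated example into a prompt/target pair under a chosen design."""
    prompt = f"{instruction}\n{text}" if instruction_first else f"{text}\n{instruction}"
    if output_format == "json":
        target = json.dumps({"labels": labels}, ensure_ascii=False)
    else:  # plain text, one label per line
        target = "\n".join(labels)
    return {"prompt": prompt, "target": target}

example = ("Extract aspect-sentiment pairs.",
           "The screen is great but the battery is poor.",
           ["screen: positive", "battery: negative"])

design_a = build_sample(*example)  # instruction first, plain-text output
design_b = build_sample(*example, instruction_first=False, output_format="json")  # instruction last, JSON output
```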

Quick Start & Requirements

  • Install: pip install transformers datasets accelerate sentencepiece tensorboard peft
  • Prerequisites: Python 3.9.16, PyTorch 2.0.1, transformers 4.29.1, datasets 2.12.0, accelerate 0.19.0, peft 0.3.0, sentencepiece 0.1.99, tensorboard 2.13.0. CUDA is recommended for training.
  • Setup: Tokenize the dataset via tokenize.sh, then launch training via train.sh. Model-specific Python scripts are provided (e.g., chatglm_lora_tuning.py, baichuan_lora_tuning.py); see the sketch after this list.
  • Docs: LLM-Tuning
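
For orientation, the model-specific *_lora_tuning.py scripts boil down to a standard peft LoRA training loop over the tokenized dataset. The sketch below is a rough approximation under that assumption; the base-model name, paths, and hyperparameters are placeholders rather than the repo's actual arguments.

```python
# Minimal LoRA fine-tuning sketch with transformers + peft.
# Model name, paths, and hyperparameters are placeholders.
from datasets import load_from_disk
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"          # placeholder; any supported base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only the LoRA adapter weights are trainable.
model = get_peft_model(model, LoraConfig(task_type=TaskType.CAUSAL_LM,
                                         r=8, lora_alpha=32, lora_dropout=0.1))

train_dataset = load_from_disk("data/tokenized")   # output of the tokenization step
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="weights/my-lora",
                           per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("weights/my-lora")   # saves only the adapter weights
```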

Highlighted Details

  • Supports LoRA fine-tuning for multiple LLMs including LLaMA2, Qwen1.5, Chinese-LLaMA-Alpaca, InternLM-7B, Baichuan-7B/2, ChatGLM2-6B, and ChatGLM-6B.
  • Includes a tutorial for Reinforcement Learning from Human Feedback (RLHF) based on LoRA for Baichuan models.
  • Demonstrates a two-line code approach to start LoRA training after dataset tokenization.
  • Provides clear instructions for loading and applying LoRA weights for inference.
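
A typical way to attach a trained adapter for inference with peft looks like the following; the model name and adapter path are placeholders, and the exact loading code in the repo's README may differ.

```python
# Sketch of loading a trained LoRA adapter for inference (paths are placeholders).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"                       # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, "weights/my-lora")   # attach the adapter weights
model.eval()

inputs = tokenizer("Extract aspect-sentiment pairs: the screen is great.",
                   return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```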

Maintenance & Community

  • The project acknowledges contributions from Hugging Face's peft library and references the ChatGLM-Tuning and LLaMA-Efficient-Tuning projects.
  • Offers a WeChat discussion group for community support.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • The README mentions that loading multiple LoRA models simultaneously for mixed capabilities is not well-supported, potentially leading to overwriting or forgetting effects.
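
Recent peft releases do expose a multi-adapter API, sketched below with hypothetical adapter names and paths; given the caveat above, it is safer to switch adapters explicitly than to expect blended capabilities from loading several at once.

```python
# Hedged sketch of juggling two LoRA adapters with peft's multi-adapter API.
# Adapter names and paths are hypothetical; the repo warns that mixing adapters
# can overwrite or "forget" abilities, so activate one at a time.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", device_map="auto")
model = PeftModel.from_pretrained(base, "weights/task-a-lora", adapter_name="task_a")
model.load_adapter("weights/task-b-lora", adapter_name="task_b")

model.set_adapter("task_a")   # use the task-A adapter for this generation
# ... run inference ...
model.set_adapter("task_b")   # then switch, rather than relying on both at once
```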

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
17 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Ying Sheng (author of SGLang), and 9 more.

alpaca-lora by tloen

LoRA fine-tuning for LLaMA
19k stars
created 2 years ago, updated 1 year ago