LLM-Tuning by beyondguo

SDK for LLM tuning and Sample Design Engineering (SDE)

created 2 years ago
1,009 stars

Top 37.7% on sourcepulse

View on GitHub
Project Summary

This repository provides tools and tutorials for efficiently fine-tuning Large Language Models (LLMs) using Sample Design Engineering (SDE). It targets developers and researchers aiming to improve LLM performance on downstream tasks with minimal data and computational resources. The core contribution is the SDE methodology, which empirically identifies effective sample design strategies for fine-tuning.

How It Works

The project introduces Sample Design Engineering (SDE) as a systematic approach to optimize fine-tuning datasets. It explores various sample design strategies, uncovering patterns consistent across different LLMs. The ES-SDE approach integrates the most effective options, demonstrating superiority over baseline methods in empirical studies. This method focuses on the quality and structure of training samples rather than solely on model architecture or training algorithms.
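
To make the idea concrete, the sketch below shows two ways the same annotated example might be packaged into a training sample, varying instruction placement and target format. The function and option names are illustrative only; they are not the repo's actual implementation.

```python
# Illustrative sketch of the kind of sample-design choices SDE studies.
# Option names and formats are hypothetical, not the repo's exact identifiers.
import json

def build_sample(instruction, text, labels, instruction_first=True, output_format="lines"):
    """Package one annotated example into a prompt/target pair under a chosen design."""
    prompt = f"{instruction}\n{text}" if instruction_first else f"{text}\n{instruction}"
    if output_format == "json":
        target = json.dumps({"labels": labels}, ensure_ascii=False)
    else:  # plain text, one label per line
        target = "\n".join(labels)
    return {"prompt": prompt, "target": target}

example = ("Extract aspect-sentiment pairs.",
           "The screen is great but the battery is poor.",
           ["screen: positive", "battery: negative"])

design_a = build_sample(*example)  # instruction first, plain-text output
design_b = build_sample(*example, instruction_first=False, output_format="json")  # instruction last, JSON output
```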

Quick Start & Requirements

  • Install: pip install transformers datasets accelerate sentencepiece tensorboard peft
  • Prerequisites: Python 3.9.16, PyTorch 2.0.1, transformers 4.29.1, datasets 2.12.0, accelerate 0.19.0, peft 0.3.0, sentencepiece 0.1.99, tensorboard 2.13.0. CUDA is recommended for training.
  • Setup: Tokenize the dataset via tokenize.sh, then launch training via train.sh. Model-specific Python scripts are provided (e.g., chatglm_lora_tuning.py, baichuan_lora_tuning.py); see the sketch after this list.
  • Docs: LLM-Tuning
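
For orientation, the model-specific *_lora_tuning.py scripts boil down to a standard peft LoRA training loop over the tokenized dataset. The sketch below is a rough approximation under that assumption; the base-model name, paths, and hyperparameters are placeholders rather than the repo's actual arguments.

```python
# Minimal LoRA fine-tuning sketch with transformers + peft.
# Model name, paths, and hyperparameters are placeholders.
from datasets import load_from_disk
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"          # placeholder; any supported base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only the LoRA adapter weights are trainable.
model = get_peft_model(model, LoraConfig(task_type=TaskType.CAUSAL_LM,
                                         r=8, lora_alpha=32, lora_dropout=0.1))

train_dataset = load_from_disk("data/tokenized")   # output of the tokenization step
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="weights/my-lora",
                           per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("weights/my-lora")   # saves only the adapter weights
```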

Highlighted Details

  • Supports LoRA fine-tuning for multiple LLMs including LLaMA2, Qwen1.5, Chinese-LLaMA-Alpaca, InternLM-7B, Baichuan-7B/2, ChatGLM2-6B, and ChatGLM-6B.
  • Includes a tutorial for Reinforcement Learning from Human Feedback (RLHF) based on LoRA for Baichuan models.
  • Demonstrates a two-line code approach to start LoRA training after dataset tokenization.
  • Provides clear instructions for loading and applying LoRA weights for inference.
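
A typical way to attach a trained adapter for inference with peft looks like the following; the model name and adapter path are placeholders, and the exact loading code in the repo's README may differ.

```python
# Sketch of loading a trained LoRA adapter for inference (paths are placeholders).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"                       # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, "weights/my-lora")   # attach the adapter weights
model.eval()

inputs = tokenizer("Extract aspect-sentiment pairs: the screen is great.",
                   return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```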

Maintenance & Community

  • The project acknowledges contributions from Hugging Face's peft library and references the ChatGLM-Tuning and LLaMA-Efficient-Tuning projects.
  • Offers a WeChat discussion group for community support.

Licensing & Compatibility

  • The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • The README mentions that loading multiple LoRA models simultaneously for mixed capabilities is not well-supported, potentially leading to overwriting or forgetting effects.
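
Recent peft releases do expose a multi-adapter API, sketched below with hypothetical adapter names and paths; given the caveat above, it is safer to switch adapters explicitly than to expect blended capabilities from loading several at once.

```python
# Hedged sketch of juggling two LoRA adapters with peft's multi-adapter API.
# Adapter names and paths are hypothetical; the repo warns that mixing adapters
# can overwrite or "forget" abilities, so activate one at a time.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", device_map="auto")
model = PeftModel.from_pretrained(base, "weights/task-a-lora", adapter_name="task_a")
model.load_adapter("weights/task-b-lora", adapter_name="task_b")

model.set_adapter("task_a")   # use the task-A adapter for this generation
# ... run inference ...
model.set_adapter("task_b")   # then switch, rather than relying on both at once
```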

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
17 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Ying Sheng (author of SGLang), and 9 more.

alpaca-lora by tloen

LoRA fine-tuning for LLaMA
19k stars
created 2 years ago, updated 1 year ago