LLM-Dojo by mst272

LLM training framework covering SFT and RLHF

Created 1 year ago
892 stars

Top 40.6% on SourcePulse

View on GitHub
Project Summary

LLM-Dojo is an open-source framework designed for learning and experimenting with large language models (LLMs) and vision-language models (VLMs). It offers a clean, readable codebase for training, including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methods like DPO, CPO, and KTO. The project targets researchers and developers looking for a flexible platform to customize and test various LLM training techniques.

How It Works

LLM-Dojo is built upon the Hugging Face ecosystem, providing modular components for SFT, VLM, and RLHF. Its SFT framework supports training strategies such as LoRA, QLoRA, and full-parameter tuning, with automatic chat-template adaptation. The RLHF module integrates multiple reinforcement learning algorithms and supports efficient training on a single A100 GPU using DeepSpeed and LoRA. The VLM module enables multimodal training for tasks like Visual Question Answering.
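The README does not document the framework's internal API here, but the pattern it wraps is the standard Hugging Face one. Below is a minimal, illustrative sketch of LoRA fine-tuning with automatic chat-template handling; the model name, target modules, and hyperparameters are assumptions, not LLM-Dojo's defaults.

```python
# Illustrative only: the Hugging Face LoRA pattern that frameworks like
# LLM-Dojo build on. Model name and hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2-7B-Instruct"  # assumed; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA: train small low-rank adapters instead of all base weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of base parameters

# Chat-template adaptation: each tokenizer ships its own template, so a
# framework can format conversations without per-model string hacking.
messages = [{"role": "user", "content": "Explain LoRA in one sentence."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```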

Quick Start & Requirements

  • Installation/Execution: Primarily uses deepspeed for multi-GPU training or standard Python commands for single-GPU training. Example commands are provided in run_example.sh.
  • Prerequisites: Python, Hugging Face libraries, and DeepSpeed; a CUDA-capable GPU is recommended for efficient training.
  • Resources: Claims a single A100 GPU is sufficient for RLHF training; LoRA on Qwen-7B uses ~16GB VRAM with DeepSpeed ZeRO-3 (a hedged config sketch follows this list).
  • Documentation: Detailed explanations for RLHF, SFT, and VLM are available within their respective directories. Technical tricks and explanations are in the llm_tricks folder.
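For context on the ZeRO-3 setting mentioned above, here is a hedged sketch of what such a DeepSpeed configuration typically looks like, written as a Python dict; the repository's actual configs and launch flags live alongside run_example.sh and may differ.

```python
# Hypothetical ZeRO-3 config of the kind the VRAM figure above refers to;
# the project's real config files (referenced from run_example.sh) may differ.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,  # ZeRO-3 shards parameters, gradients, and optimizer state
        "offload_param": {"device": "none"},      # set "cpu" to trade speed for VRAM
        "offload_optimizer": {"device": "none"},
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "gradient_clipping": 1.0,
}

# Typical launch shape (script name and flags are assumptions, not the
# repo's actual commands; those are in run_example.sh):
#   deepspeed --num_gpus=2 train.py --deepspeed ds_config.json
# Single-GPU runs use a plain `python train.py ...` invocation instead.
```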

Highlighted Details

  • Supports a wide range of LLMs including Qwen, Llama, Yi, Gemma, and Phi-3.
  • RLHF framework includes Knowledge Distillation, DPO, CPO, SimPO, KTO, and Rejected Sampling (a minimal DPO-loss sketch follows this list).
  • VLM module supports Qwen2-VL and LLaVA for Visual Question Answering.
  • Includes a llm_tricks section for implementing and explaining advanced LLM techniques from scratch.
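To make the DPO entry above concrete, here is a from-scratch sketch of the standard DPO objective (Rafailov et al., 2023) in PyTorch, in the spirit of the llm_tricks folder; it is illustrative only and not LLM-Dojo's implementation.

```python
# Sketch of the standard DPO objective; not LLM-Dojo's code.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Each input is the summed log-probability of a chosen/rejected
    response under the trained policy or the frozen reference model."""
    # How strongly the policy prefers the chosen response over the rejected one.
    policy_logratio = policy_chosen_logps - policy_rejected_logps
    # The same preference measured under the frozen reference model.
    ref_logratio = ref_chosen_logps - ref_rejected_logps
    # Push the policy's margin beyond the reference's via a logistic loss.
    return -F.logsigmoid(beta * (policy_logratio - ref_logratio)).mean()

# Toy usage: fake log-probs for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.5, -10.9]))
print(loss)  # scalar; shrinks as the policy's preference margin grows
```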

Maintenance & Community

The project actively lists recent updates and new feature implementations, indicating ongoing development. Contributions via Issues and Pull Requests are encouraged. Links to related technical articles on Zhihu and Medium are provided for deeper understanding.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is under active development; documentation for some features, such as Rejected Sampling, is still pending. While it aims for ease of use and modification, the rapid addition of new techniques may lead to occasional breaking changes or incomplete documentation for the latest additions.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 30 days

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Yaowei Zheng (Author of LLaMA-Factory), and 2 more.

Explore Similar Projects

tunix by google

Top 1.4% on SourcePulse · 2k stars
JAX-native library for efficient LLM post-training
Created 7 months ago · Updated 19 hours ago
Starred by Jeff Huber (Cofounder of Chroma), Vincent Weisser (Cofounder of Prime Intellect), and 31 more.

trl by huggingface

Top 0.6% on SourcePulse · 16k stars
Library for transformer reinforcement learning (RL)
Created 5 years ago · Updated 1 day ago