LLM-Dojo by mst272

An LLM training framework covering model training and RLHF

created 1 year ago · 814 stars · Top 44.4% on sourcepulse

Project Summary

LLM-Dojo is an open-source framework designed for learning and experimenting with large language models (LLMs) and vision-language models (VLMs). It offers a clean, readable training codebase covering Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methods such as DPO, CPO, and KTO. The project targets researchers and developers looking for a flexible platform to customize and test various LLM training techniques.

How It Works

LLM-Dojo is built on the Hugging Face ecosystem and provides modular components for SFT, VLM training, and RLHF. Its SFT framework supports training strategies such as LoRA, QLoRA, and full-parameter tuning, with automatic chat-template adaptation. The RLHF module integrates multiple reinforcement learning algorithms and supports efficient training on a single A100 GPU using DeepSpeed and LoRA. The VLM module enables multimodal training for tasks such as Visual Question Answering.
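To make the SFT side concrete, here is a minimal sketch of LoRA fine-tuning on the Hugging Face stack, in the spirit of what such a framework wraps. The model name, LoRA hyperparameters, and example messages are illustrative assumptions, not LLM-Dojo's actual defaults.

```python
# Minimal LoRA SFT sketch on the Hugging Face stack.
# Model name and hyperparameters are illustrative assumptions,
# not LLM-Dojo's shipped configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2-7B-Instruct"  # hypothetical choice
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Wrap the base model with low-rank adapters; only the adapter
# weights are trained, which is what keeps VRAM usage low.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# "Automatic chat template adaptation" in practice: let the
# tokenizer format a conversation with its built-in template.
messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```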

Quick Start & Requirements

  • Installation/Execution: Primarily uses deepspeed for multi-GPU training or standard Python commands for single-GPU training. Example commands are provided in run_example.sh.
  • Prerequisites: Python, Hugging Face libraries, DeepSpeed. A CUDA-capable GPU is recommended for efficient training.
  • Resources: Claims a single A100 GPU is sufficient for RLHF training; LoRA with Qwen (7B) uses ~16 GB VRAM under DeepSpeed ZeRO-3 (a sketch of such a config follows this list).
  • Documentation: Detailed explanations for RLHF, SFT, and VLM are available within their respective directories. Technical tricks and explanations are in the llm_tricks folder.
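The ~16 GB figure above assumes ZeRO-3 sharding. Below is a minimal sketch of such a DeepSpeed configuration, written as a Python dict for readability; the values and the main_train.py script name in the launch comments are assumptions, not the project's shipped files.

```python
# Hypothetical minimal DeepSpeed ZeRO-3 config; LLM-Dojo's own
# ds_config may differ in values and options.
import json

zero3_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        # Stage 3 shards parameters, gradients, and optimizer
        # states across GPUs.
        "stage": 3,
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
}

with open("ds_zero3.json", "w") as f:
    json.dump(zero3_config, f, indent=2)

# Typical launch patterns (script name is illustrative):
#   multi-GPU:  deepspeed --num_gpus=2 main_train.py --deepspeed ds_zero3.json
#   single-GPU: python main_train.py
```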

Highlighted Details

  • Supports a wide range of LLMs including Qwen, Llama, Yi, Gemma, and Phi-3.
  • RLHF framework includes Knowledge Distillation, DPO, CPO, SimPO, KTO, and Rejected Sampling (a sketch of the DPO objective follows this list).
  • VLM module supports Qwen2-VL and LLaVA for Visual Question Answering.
  • Includes a llm_tricks section for implementing and explaining advanced LLM techniques from scratch.
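To make the preference-optimization entries concrete, here is a hedged sketch of the standard DPO objective; the function and variable names are illustrative, not LLM-Dojo's code.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: push the policy to prefer the chosen
    response over the rejected one, measured relative to a frozen
    reference model. Inputs are per-sequence summed log-probs."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_margin - rejected_margin)
    return -F.logsigmoid(logits).mean()
```

Methods such as CPO and SimPO follow the same pairwise chosen-vs-rejected pattern but drop or replace the reference-model terms.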

Maintenance & Community

The project actively lists recent updates and new feature implementations, indicating ongoing development. Contributions via Issues and Pull Requests are encouraged. Links to related technical articles on Zhihu and Medium are provided for deeper understanding.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is under active development; documentation for some features, such as Rejected Sampling, is still pending. While it aims for ease of use and modification, the rapid addition of new techniques may lead to occasional breaking changes or incomplete documentation for the latest additions.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 115 stars in the last 90 days
