unsloth-zoo by unslothai

Accelerate LLM finetuning with reduced VRAM usage

Created 1 year ago

287 stars

Top 91.3% on SourcePulse

Project Summary

Unsloth Zoo provides optimized utilities for finetuning large language models, significantly reducing training time and VRAM requirements. It targets engineers and researchers needing to efficiently adapt LLMs for various tasks, enabling finetuning on more accessible hardware and accelerating development cycles.

How It Works

Unsloth employs custom kernels written in OpenAI's Triton language and a manual backpropagation engine. This approach allows for highly optimized computations, achieving substantial speedups and memory reductions without sacrificing model accuracy. It integrates advanced techniques like dynamic 4-bit quantization and optimized LoRA implementations to maximize efficiency.
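LoRA cuts the number of trainable parameters by learning two small low-rank factors in place of a full weight update. The arithmetic below is a minimal sketch of generic LoRA, not Unsloth's implementation; the layer size (4096 x 4096), the rank (16), and the function names are illustrative assumptions.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed example values, chosen only for illustration.
d_in = d_out = 4096
rank = 16

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {fraction:.2%}")
```

With these assumed values the adapter trains 1/128 of the parameters of the layer it wraps, which is why gradients and optimizer state for the adapted layers take far less memory.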

Quick Start & Requirements

Install with pip on Linux or WSL: pip install unsloth. On Windows, PyTorch must be installed first. An official Docker image (unsloth/unsloth) is also available. An NVIDIA GPU with CUDA Capability 7.0 or higher is required (e.g., RTX 20-series and newer, A100, H100). Python 3.10 through 3.14 is supported. Detailed installation guides and documentation are available.
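The requirements above can be checked before installing. This is a rough sketch under stated assumptions: the thresholds are copied from this summary, the helper names are hypothetical, and the GPU check runs only if PyTorch is already installed. torch.cuda.is_available() and torch.cuda.get_device_capability() are standard PyTorch calls; the latter returns a (major, minor) tuple.

```python
import sys

# Thresholds taken from the requirements listed in this summary.
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 14)
MIN_CUDA_CAPABILITY = (7, 0)


def python_supported(version) -> bool:
    """True if the (major, minor) version falls in the supported range."""
    return MIN_PYTHON <= tuple(version[:2]) <= MAX_PYTHON


def gpu_supported(capability) -> bool:
    """True if a (major, minor) CUDA capability meets the stated minimum."""
    return tuple(capability) >= MIN_CUDA_CAPABILITY


def check_environment() -> dict:
    """Report Python and GPU status; 'gpu' is None if PyTorch is missing."""
    report = {"python": python_supported(sys.version_info), "gpu": None}
    try:
        import torch  # optional: only needed for the GPU check
    except ImportError:
        return report
    if torch.cuda.is_available():
        report["gpu"] = gpu_supported(torch.cuda.get_device_capability())
    else:
        report["gpu"] = False
    return report


if __name__ == "__main__":
    print(check_environment())
```

The pure helper functions take plain tuples, so they can be exercised without a GPU, for example gpu_supported((6, 1)) for a GTX 10-series card.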

Highlighted Details

Supports full-finetuning, pretraining, and 4/8/16-bit training across a vast array of models including Llama (3.3, 3.2, 3.1), Gemma, Mistral, Phi, Qwen, and multimodal/TTS models.
Achieves up to 2.2x faster training and over 80% VRAM reduction compared to standard Hugging Face implementations, with 0% loss in accuracy.
Enables much longer context windows, e.g., 342K tokens for Llama 3.1 (8B) and 89K tokens for Llama 3.3 (70B) on high-end GPUs.
Integrates seamlessly with Hugging Face's TRL library for Reinforcement Learning tasks like DPO and GRPO.
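To see why the training bit width matters for VRAM, the sketch below estimates the memory needed just to hold model weights at 16, 8, and 4 bits. It is back-of-the-envelope arithmetic, not a measurement of Unsloth: the nominal parameter count of 8 billion is an assumption, the function name is hypothetical, and the estimate ignores activations, gradients, optimizer state, and quantization metadata.

```python
def weight_memory_gib(n_params: int, bits_per_param: int) -> float:
    """Memory in GiB to store n_params weights at the given bit width."""
    total_bytes = n_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumed nominal size for an "8B" model; real checkpoints differ slightly.
N_PARAMS = 8_000_000_000

for bits in (16, 8, 4):
    gib = weight_memory_gib(N_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {gib:6.2f} GiB")
```

Weight storage scales linearly with bit width, so 4-bit weights occupy a quarter of the space of 16-bit weights. Total training memory depends on the other terms this sketch leaves out.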

Maintenance & Community

The project shows active development with frequent updates on new model support, optimizations, and features. Community engagement is fostered through Twitter (X) and Reddit. Notable collaborations include work with Apple on specific optimizations.

Licensing & Compatibility

The repository's README does not explicitly state a software license. This absence creates ambiguity regarding usage rights, particularly for commercial applications or integration into closed-source projects. Compatibility is primarily for NVIDIA GPUs.

Limitations & Caveats

Windows installation can be complex, requiring careful setup of PyTorch, CUDA, and Triton. Older NVIDIA GPUs below the stated CUDA Capability 7.0 minimum (e.g., GTX 10-series) reportedly work but with limited performance. The most significant caveat is the lack of a clear license, which poses a risk for adoption in production or commercial environments.

Health Check

Last Commit

18 hours ago

Responsiveness

Inactive

Pull Requests (30d)

128

Issues (30d)

Star History

15 stars in the last 30 days