unsloth-studio by unslothai

Accelerate LLM finetuning with reduced memory usage

Created 1 year ago
374 stars

Top 75.8% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary


Unsloth Studio accelerates LLM finetuning and inference while drastically reducing memory requirements. By optimizing LLM training, it lets researchers and engineers iterate faster and run larger models on more modest hardware.

How It Works

The core innovation lies in custom Triton kernels and manual backpropagation, enabling exact finetuning with zero accuracy loss. Unsloth leverages optimized 4-bit quantization (QLoRA/LoRA) and specialized gradient checkpointing for significant VRAM reduction and speedups, outperforming standard Hugging Face implementations.
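To see why 4-bit quantization alone buys so much VRAM headroom, here is a back-of-envelope weight-memory calculation in plain Python. This is an illustrative sketch, not Unsloth's internal accounting: it counts only frozen base weights (LoRA adapters and activations add a comparatively small amount on top).

```python
# Illustrative arithmetic only: weight-storage memory at different bit
# widths for a 7B-parameter model. Real finetuning memory also includes
# LoRA adapters, activations, and optimizer state, which QLoRA keeps small.

def weights_gb(n_params: float, bits: int) -> float:
    """Memory needed to store n_params weights at the given bit width, in GB."""
    return n_params * bits / 8 / 1e9

n = 7e9  # a 7B-parameter model (hypothetical example size)
fp16 = weights_gb(n, 16)  # 14.0 GB
int4 = weights_gb(n, 4)   # 3.5 GB
print(f"fp16 weights: {fp16:.1f} GB, 4-bit weights: {int4:.1f} GB "
      f"({1 - int4 / fp16:.0%} smaller)")
```

The 75% reduction in weight storage, combined with gradient checkpointing for activations, is roughly where headline figures like "up to 80% less memory" come from.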

Quick Start & Requirements

  • Install: pip install unsloth (Linux recommended). Advanced installation steps for Windows and for specific CUDA/PyTorch versions are detailed in the official documentation.
  • Prerequisites: NVIDIA GPUs (CUDA Capability >= 7.0), Python 3.10-3.12, compatible PyTorch and CUDA Toolkit, NVIDIA drivers. Windows requires Visual Studio with C++ development tools.
  • Links: Official Documentation, Hugging Face Docs, Kaggle Notebooks, Blog.
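The version constraints above can be checked before installing. The snippet below is a minimal pre-flight sketch (not an official Unsloth utility): the Python-version check uses only the stdlib, and the CUDA compute-capability check runs only if torch happens to be installed.

```python
# Hypothetical pre-flight check for the prerequisites listed above:
# Python 3.10-3.12 and an NVIDIA GPU with CUDA capability >= 7.0.
import sys

def python_supported(major: int, minor: int) -> bool:
    """Unsloth supports Python 3.10-3.12; 3.13 is not supported."""
    return (3, 10) <= (major, minor) <= (3, 12)

def cuda_capability_ok(min_capability=(7, 0)) -> bool:
    """Return True if the first CUDA device meets the minimum capability.

    Skips the check (returns True) when torch is not installed.
    """
    try:
        import torch
    except ImportError:
        return True
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability(0) >= min_capability

if __name__ == "__main__":
    print("Python OK:", python_supported(sys.version_info.major,
                                         sys.version_info.minor))
```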

Highlighted Details

  • Achieves up to 2x faster finetuning and inference speeds.
  • Reduces memory usage by up to 80% compared to standard methods.
  • Supports a broad range of LLMs including Llama 3.x, Mistral, Phi-4, Gemma 2, and Qwen 2.5.
  • Enables significantly extended context windows (e.g., 342K for Llama 3.1 (8B), 89K for Llama 3.3 (70B)).
  • Features Dynamic 4-bit Quantization for enhanced accuracy with minimal VRAM overhead.
  • Models can be exported to GGUF, Ollama, and vLLM formats.
  • Full support for Reinforcement Learning techniques like DPO, GRPO, and PPO.
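Unsloth's RL support builds on trainers from Hugging Face's TRL library. As a reminder of what, for example, DPO actually optimizes, here is the standard per-pair DPO loss in plain Python. This is a textbook illustration (Rafailov et al.'s formulation), not Unsloth or TRL code.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Pushes the policy to widen the log-probability margin of the chosen
    response over the rejected one, relative to a frozen reference model.
    """
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log(sigmoid)
```

When the policy and reference agree exactly, the margin is 0 and the loss is -log(0.5); raising the chosen response's likelihood relative to the reference drives the loss down.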

Maintenance & Community

Developed by Daniel Han, Michael Han, and the Unsloth team, with acknowledged contributions from the community and thanks to Hugging Face's TRL library. Community engagement happens via Twitter (X) and Reddit.

Licensing & Compatibility

The specific open-source license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore undetermined without license clarification.

Limitations & Caveats

Python 3.13 is not supported. Windows installation is complex, requiring manual setup of several dependencies and potential workarounds. Older GPUs (e.g., GTX 1070/1080) are functional but significantly slower. The absence of a stated license poses a potential adoption blocker for commercial applications.

Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
9
Star History
245 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems") and Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

rtp-llm by alibaba

0.3%
1k
LLM inference engine for diverse applications
Created 2 years ago
Updated 21 hours ago
Starred by Tobi Lutke (cofounder of Shopify), Andrej Karpathy (founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 41 more.

unsloth by unslothai

2.6%
61k
Finetuning tool for LLMs, targeting speed and memory efficiency
Created 2 years ago
Updated 1 day ago