FineTuningLLMs by dvgodoy

Hands-on guide to fine-tuning LLMs

Created 1 year ago
546 stars

Top 58.4% on SourcePulse

View on GitHub
Project Summary

This repository provides practical guidance and code examples for fine-tuning Large Language Models (LLMs) using PyTorch and the Hugging Face ecosystem. It targets data scientists and engineers seeking to adapt LLMs for specific tasks, focusing on key techniques like quantization and Low-Rank Adaptation (LoRA) for efficient single-GPU fine-tuning.

How It Works

The project demystifies LLM fine-tuning by breaking complex concepts down into manageable steps, mirroring the structure of its companion book. It emphasizes practical implementation using Hugging Face's transformers and peft libraries, demonstrating techniques such as 8-bit and 4-bit quantization with bitsandbytes, and LoRA for parameter-efficient adaptation. The approach prioritizes efficient training on consumer-grade GPUs, addressing the common challenge of limited hardware resources.
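As a rough illustration of that workflow (a sketch, not code from the repository; the model ID, target modules, and hyperparameters below are placeholders), a minimal quantize-then-adapt setup with transformers, bitsandbytes, and peft might look like this:

```python
# Minimal sketch (not code from the repository): load a causal LM in 4-bit with
# bitsandbytes, then attach LoRA adapters with peft. The model name, target
# modules, and hyperparameters are placeholders and vary by architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "your-org/your-base-model"  # placeholder model ID

# Quantize the frozen base weights to 4-bit NF4 at load time so they fit a single consumer GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    # attn_implementation="sdpa",  # or "flash_attention_2", depending on transformers version
)
model = prepare_model_for_kbit_training(model)  # stabilizes training of quantized models

# LoRA: train small low-rank adapter matrices instead of the full weight matrices.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # module names differ between architectures
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of parameters are trainable
```

With this setup only the low-rank adapter matrices are trained while the 4-bit base weights stay frozen, which is what keeps memory usage manageable on a single GPU.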

Quick Start & Requirements

Notebooks can be run directly from GitHub via Google Colab, requiring a Google account and GPU access. Key dependencies include PyTorch, Hugging Face transformers, peft, and bitsandbytes. Detailed setup instructions and troubleshooting are available in the book's appendices and FAQ.
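For readers who prefer a local environment over Colab, a quick sanity check of the stack could look like the following (an assumption sketched here, not the book's official setup instructions, which live in the appendices):

```python
# Sanity-check sketch for a local environment (assumption; the book's appendices
# contain the official setup instructions). Install the stack first, for example:
#   pip install torch transformers peft bitsandbytes trl
import torch
import transformers
import peft
import bitsandbytes
import trl  # provides SFTTrainer, used in the sketch further below

print("CUDA available:", torch.cuda.is_available())  # fine-tuning really wants a GPU
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
print("bitsandbytes:", bitsandbytes.__version__)
print("trl:", trl.__version__)
```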

Highlighted Details

  • Covers quantization (8-bit, 4-bit) and Low-Rank Adaptation (LoRA).
  • Demonstrates fine-tuning with Hugging Face's SFTTrainer (from the trl library; see the sketch after this list).
  • Explores attention mechanisms like Flash Attention and SDPA.
  • Includes local deployment methods using GGUF, Ollama, and llama.cpp.
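To give a flavor of the SFTTrainer workflow mentioned above (a sketch under assumptions, not the repository's code: the dataset, output directory, and hyperparameters are placeholders, and SFTTrainer's argument names have shifted across trl versions):

```python
# Sketch of supervised fine-tuning with trl's SFTTrainer, continuing from the
# quantized, LoRA-wrapped `model` in the earlier sketch. Dataset and
# hyperparameters are placeholders; argument locations vary across trl versions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Any dataset with a plain-text column works; "imdb" is used here purely as a stand-in.
dataset = load_dataset("imdb", split="train[:1%]")

training_args = SFTConfig(
    output_dir="finetuned-adapter",
    dataset_text_field="text",   # in older trl versions this is an SFTTrainer argument
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=2e-4,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,                 # the peft-wrapped model from the earlier sketch
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("finetuned-adapter")  # with a LoRA-wrapped model, this saves the adapter weights
```

After training, the usual local-deployment path referenced in the last bullet is to merge the adapter back into the base model and convert the result (for example to GGUF) for use with llama.cpp or Ollama.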

Maintenance & Community

This repository is the official companion to a published book, indicating a stable and well-documented resource. Further community interaction or support channels are not explicitly mentioned in the README.

Licensing & Compatibility

The repository's licensing is not specified in the provided README. Compatibility for commercial use or closed-source linking would depend on the specific license chosen for the code and any underlying libraries.

Limitations & Caveats

The content is geared towards an intermediate audience with prior knowledge of deep learning fundamentals, Transformers, and PyTorch. Because the material focuses on single-GPU fine-tuning, advanced users may find the scope limited for distributed training scenarios.

Health Check

  • Last Commit: 4 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 24 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 34 more.

flash-attention by Dao-AILab

  • 0.6% · 20k stars
  • Fast, memory-efficient attention implementation
  • Created 3 years ago · Updated 23 hours ago
  • Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 40 more.

unsloth by unslothai

  • 0.6% · 48k stars
  • Finetuning tool for LLMs, targeting speed and memory efficiency
  • Created 1 year ago · Updated 10 hours ago