llm-finetuning by modal-labs

LLM fine-tuning guide using Modal and Axolotl

created 1 year ago
613 stars

Top 54.5% on sourcepulse

Project Summary

This repository provides a guide and tooling for fine-tuning large language models (LLMs) like Llama, Mistral, and CodeLlama using the axolotl library on Modal's serverless GPU infrastructure. It targets developers and researchers aiming for efficient, scalable LLM fine-tuning without managing underlying hardware.

How It Works

The project leverages axolotl for its comprehensive LLM fine-tuning capabilities, including support for DeepSpeed ZeRO, LoRA adapters, and Flash Attention. Modal provides a serverless execution environment, abstracting away Docker image management and GPU provisioning. This allows users to scale training jobs across multiple GPUs and deploy inference endpoints easily.
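As a rough illustration of Modal's programming model (not this repository's actual training entrypoint), a GPU function can be declared in a few lines of Python; the image contents, GPU string, and function body below are assumptions made for the sketch:

    import modal

    # Container image with the fine-tuning dependencies; exact packages and pins are assumptions.
    image = modal.Image.debian_slim().pip_install("axolotl", "torch")

    app = modal.App("llm-finetuning-sketch")

    # Request a GPU and attach the Hugging Face token stored as a Modal secret.
    @app.function(
        gpu="A100-80GB",  # the repo configures this via the GPU_CONFIG environment variable
        image=image,
        secrets=[modal.Secret.from_name("my-huggingface-secret")],
        timeout=60 * 60,
    )
    def train(config_path: str):
        # Placeholder body: the real project hands the YAML config to axolotl's training loop here.
        print(f"would launch axolotl training with {config_path}")

    @app.local_entrypoint()
    def main():
        train.remote("config/mistral-memorize.yml")

Running such an app with modal run builds the image and provisions the GPU on demand, which is what lets the same code scale to multiple GPUs or be deployed as an inference endpoint.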

Quick Start & Requirements

  • Install: pip install modal
  • Authentication: Requires a Modal account and token (python3 -m modal setup), a Hugging Face API token stored as a Modal secret named my-huggingface-secret, and optionally Weights & Biases credentials.
  • Model access: Accept the license terms for gated Hugging Face models (e.g., Llama 3) before training.
  • Launch: modal run --detach src.train --config=config/mistral-memorize.yml --data=data/sqlqa.subsample.jsonl (a quick data sanity check is sketched after this list)
  • Docs: Modal Docs, Axolotl Config
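Before launching a run, it can help to confirm that the training file is valid JSONL. This helper is not part of the repository and assumes nothing about the record schema beyond one JSON object per line:

    import json

    # Hypothetical helper: verify every line of the dataset parses as JSON
    # and show the keys of the first record.
    def check_jsonl(path: str) -> None:
        count = 0
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)  # raises if a line is not valid JSON
                if count == 0:
                    print("first record keys:", sorted(record))
                count += 1
        print(f"parsed {count} records from {path}")

    check_jsonl("data/sqlqa.subsample.jsonl")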

Highlighted Details

  • Integrates axolotl with Modal for serverless LLM fine-tuning.
  • Supports state-of-the-art optimizations: DeepSpeed ZeRO, LoRA, Flash Attention.
  • Enables easy multi-GPU training configuration via environment variables (e.g., GPU_CONFIG=a100-80gb:4; see the parsing sketch after this list).
  • Provides serverless inference deployment with modal deploy src.inference.
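The GPU_CONFIG value above packs a GPU type and count into one string; here is a minimal sketch of how such a value could be parsed on the Python side (the helper is hypothetical, not the repository's code):

    import os

    # Hypothetical parser for strings like "a100-80gb:4" -> ("a100-80gb", 4).
    def parse_gpu_config(value: str) -> tuple[str, int]:
        gpu_type, _, count = value.partition(":")
        return gpu_type, int(count) if count else 1

    gpu_type, gpu_count = parse_gpu_config(os.environ.get("GPU_CONFIG", "a100-80gb:1"))
    print(f"requesting {gpu_count} x {gpu_type}")

The count portion is what enables the multi-GPU training mentioned in the bullet above.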

Maintenance & Community

  • Developed by Modal Labs.
  • Links to Modal documentation and community resources are available.

Licensing & Compatibility

  • The repository's own license is not stated in this summary; a permissive license (e.g., MIT or Apache 2.0) is likely but should be verified on GitHub. The project also depends on axolotl and on Hugging Face models, each of which carries its own license. Users must additionally comply with the terms of service for Modal, Hugging Face, and Weights & Biases.

Limitations & Caveats

  • Configuration is primarily managed through YAML files, differing from axolotl's CLI-centric approach.
  • CUDA Out of Memory (OOM) errors can occur if GPU resources are insufficient or batch sizes/sequence lengths are too high.
  • Training on very small datasets may lead to a ZeroDivisionError, typically because the computed number of optimization steps rounds down to zero (see the sketch below).
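A back-of-the-envelope illustration of how a tiny dataset can yield zero optimizer steps (the variable names and formula are illustrative, not the library's exact code):

    # With a tiny dataset, the effective batch size can exceed the number of
    # examples, so steps per epoch floors to zero and any later division by
    # that step count raises ZeroDivisionError.
    num_examples = 20
    micro_batch_size = 4
    gradient_accumulation_steps = 4
    num_gpus = 2

    effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus  # 32
    steps_per_epoch = num_examples // effective_batch  # 0 for this dataset
    print(steps_per_epoch)  # 0 -> any later "x / steps_per_epoch" raises ZeroDivisionError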

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 28 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM

  • 1.0% | 402 stars
  • Lightweight training framework for model pre-training
  • created 1 year ago, updated 1 week ago

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 2 more.

xTuring by stochasticai

  • 0.0% | 3k stars
  • SDK for fine-tuning and customizing open-source LLMs
  • created 2 years ago, updated 10 months ago

Starred by Tobi Lutke (Cofounder of Shopify), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 13 more.

axolotl by axolotl-ai-cloud

  • 0.6% | 10k stars
  • CLI tool for streamlined post-training of AI models
  • created 2 years ago, updated 1 day ago