axolotl by axolotl-ai-cloud

CLI tool for streamlined post-training of AI models

created 2 years ago
10,074 stars

Top 5.1% on sourcepulse

View on GitHub
Project Summary

Axolotl is a comprehensive toolkit for streamlining the post-training of large language models, aimed at AI researchers and engineers. It simplifies complex fine-tuning workflows such as LoRA, QLoRA, and full-parameter fine-tuning, enabling efficient customization of pre-trained models.

How It Works

Axolotl leverages a YAML configuration system to manage the entire model training pipeline, from data preprocessing to inference and evaluation. This approach offers a unified and reproducible workflow. It integrates with performance-enhancing libraries such as xformers, Flash Attention, and various multi-GPU strategies (FSDP, DeepSpeed), aiming to maximize training speed and efficiency.
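
For concreteness, a minimal LoRA-style config might look like the sketch below. Key names follow Axolotl's published example configs, but exact fields, defaults, and supported values vary by Axolotl version and base model, so treat it as illustrative rather than a working recipe:

    # Illustrative LoRA config; values are placeholders, not recommendations
    base_model: meta-llama/Llama-3.2-1B
    load_in_8bit: true
    adapter: lora
    lora_r: 16
    lora_alpha: 32
    lora_dropout: 0.05
    lora_target_linear: true

    # dataset preprocessing: source path plus prompt format
    datasets:
      - path: teknium/GPT4-LLM-Cleaned
        type: alpaca
    val_set_size: 0.05

    # packing and attention optimizations
    sequence_len: 2048
    sample_packing: true
    flash_attention: true

    # optimizer and schedule
    micro_batch_size: 2
    gradient_accumulation_steps: 4
    num_epochs: 1
    learning_rate: 2e-4
    optimizer: adamw_torch
    lr_scheduler: cosine
    bf16: auto

    output_dir: ./outputs/lora-out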

Quick Start & Requirements

  • Installation: pip install --no-build-isolation axolotl[flash-attn,deepspeed]
  • Prerequisites: NVIDIA GPU (Ampere+ recommended for bf16/Flash Attention), Python 3.11, PyTorch ≥2.4.1.
  • Examples: Fetch example configurations via axolotl fetch examples.
  • First Fine-tune: Run axolotl train examples/llama-3/lora-1b.yml (a combined session sketch follows this list).
  • Documentation: Getting Started Guide
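
Taken together, a first run is roughly the shell session below (the commands mirror the bullets above; quoting the pip extras avoids globbing in some shells):

    # install with the optional Flash Attention and DeepSpeed extras
    pip install --no-build-isolation "axolotl[flash-attn,deepspeed]"

    # download the bundled example configs (into ./examples by default)
    axolotl fetch examples

    # launch a small LoRA fine-tune from one of those configs
    axolotl train examples/llama-3/lora-1b.yml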

Highlighted Details

  • Supports a wide range of Hugging Face models including LLaMA, Mistral, Mixtral, Falcon, and Pythia.
  • Offers multiple training methods: full fine-tuning, LoRA, QLoRA, ReLoRA, and GPTQ (see the config sketch after this list).
  • Features performance optimizations like Flash Attention, xformers, and multi-packing.
  • Integrates with logging platforms like Weights & Biases, MLflow, and Comet.
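
As a sketch of how these options surface in the same YAML config (key names are taken from Axolotl's example configs; the project and run names below are placeholders):

    # switch the LoRA sketch above to QLoRA by quantizing the base model to 4-bit
    adapter: qlora
    load_in_4bit: true

    # multipacking and Flash Attention are single-flag toggles
    sample_packing: true
    flash_attention: true

    # Weights & Biases logging (MLflow and Comet are configured with analogous keys)
    wandb_project: my-axolotl-runs
    wandb_name: llama3-qlora-test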

Maintenance & Community

The project is sponsored by Modal. Community support is available via Discord.

Licensing & Compatibility

Licensed under the Apache 2.0 License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Per the project's model-compatibility table, certain combinations of models and optimizations (e.g., Mixtral-MoE, Falcon, or Pythia with GPTQ or Flash Attention) are marked as untested or unsupported.

Health Check

  • Last commit: 21 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 123
  • Issues (30d): 17

Star History

878 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

InternEvo by InternLM

Lightweight training framework for model pre-training

Top 1.0% on sourcepulse · 402 stars
created 1 year ago · updated 1 week ago