xlora by EricLBuehler

Mixture of LoRA Experts for efficient LLM adaptation

Created 2 years ago
257 stars

Top 98.3% on SourcePulse

Project Summary

X-LoRA introduces a Mixture of Experts (MoE) approach to efficiently fine-tune large language models by dynamically combining multiple LoRA adapters. It targets researchers and practitioners seeking flexible, parameter-efficient adaptation of LLMs for complex tasks. The primary benefit is enabling the reuse and sophisticated mixing of existing fine-tuned models without retraining the base LLM, leading to significant computational savings.

How It Works

The framework learns scaling values that act as gates for the individual LoRA experts. These learned scalings are applied densely, so multiple experts can contribute to the model's output token by token. A key design choice is freezing both the base LLM and all LoRA adapters; only the gating mechanism is trainable. This drastically reduces the number of trainable parameters, enabling efficient adaptation and a hierarchical, encapsulated strategy for decomposing complex tasks.
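For intuition, here is a minimal conceptual sketch in PyTorch of dense, token-wise gating over frozen LoRA experts. It is not X-LoRA's implementation; the class and function names (DenseLoRAGate, mix_lora_outputs) are invented for illustration.

    # Conceptual sketch, not the X-LoRA implementation: a small trainable gate
    # predicts per-token scalings, and frozen LoRA deltas are mixed densely.
    import torch
    import torch.nn as nn

    class DenseLoRAGate(nn.Module):
        """The only trainable part: predicts one scaling per expert for every token."""
        def __init__(self, hidden_size: int, n_experts: int):
            super().__init__()
            self.proj = nn.Linear(hidden_size, n_experts)

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # hidden_states: (batch, seq_len, hidden_size)
            # returns scalings: (batch, seq_len, n_experts)
            return torch.softmax(self.proj(hidden_states), dim=-1)

    def mix_lora_outputs(base_out, expert_deltas, scalings):
        # Dense mixing: every frozen expert's LoRA delta contributes, weighted per token.
        mixed = base_out
        for i, delta in enumerate(expert_deltas):
            mixed = mixed + scalings[..., i:i + 1] * delta
        return mixed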

Quick Start & Requirements

Installation is available via pip: pip install git+https://github.com/EricLBuehler/xlora.git. Practical usage requires a CUDA-enabled GPU and standard deep learning libraries (PyTorch, HuggingFace Transformers). Examples demonstrate integration with models like mistralai/Mistral-7B-Instruct-v0.1.
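A hedged usage sketch follows, based on the API surface mentioned in this summary (add_xlora_to_model) and the Mistral example; the xLoRAConfig parameters and adapter paths shown are assumptions, so check the repository's examples for exact signatures.

    # Hedged sketch: convert a HuggingFace Transformers model to X-LoRA.
    # xLoRAConfig arguments and adapter paths are assumptions; see the repo's examples.
    import torch
    import xlora
    from transformers import AutoConfig, AutoModelForCausalLM

    base_id = "mistralai/Mistral-7B-Instruct-v0.1"
    model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,
        device_map="cuda:0",
    )
    config = AutoConfig.from_pretrained(base_id)

    xlora_model = xlora.add_xlora_to_model(
        model=model,
        xlora_config=xlora.xLoRAConfig(
            config.hidden_size,
            base_model_id=base_id,
            xlora_depth=4,                 # gating-network depth (assumed parameter name)
            device=torch.device("cuda"),
            adapters={                     # pre-trained LoRA checkpoints to mix (placeholder paths)
                "adapter_1": "./path/to/adapter_1/",
                "adapter_2": "./path/to/adapter_2/",
            },
        ),
        verbose=True,
    )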

Highlighted Details

  • Provides an easy-to-use API (add_xlora_to_model, from_pretrained) for seamless integration with HuggingFace Transformers models.
  • Supports efficient inference through frameworks like Mistral.rs.
  • Enables dynamic, layer-wise mixing of LoRA adapters for sophisticated task composition.
  • Offers features for logging and analyzing adapter scaling behavior (see the sketch after this list).
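Continuing from the conversion sketch in Quick Start, the snippet below shows how scalings logging might be used to inspect which experts the gate activates. The method names (enable_scalings_logging, flush_log_scalings) are assumptions and should be verified against the README's utility API.

    # Hedged sketch: record and dump per-token adapter scalings during generation.
    # Method names are assumptions; verify against the xlora README.
    xlora_model.enable_scalings_logging()             # start recording scalings

    # ... run generation with xlora_model as usual ...

    xlora_model.flush_log_scalings("./scalings_log")  # write recorded scalings to disk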

Maintenance & Community

Contribution guidelines are present (e.g., running make style before submitting PRs), but community channels (Discord, Slack) and roadmap details are not mentioned in the provided README excerpt.

Licensing & Compatibility

The license type is not specified in the provided README excerpt, which may be a blocker for adoption in commercial or sensitive use cases.

Limitations & Caveats

Installation is via a direct Git repository link, which suggests there is not yet a stable, versioned release. Specific Python version requirements and hardware details beyond the need for a GPU are not explicitly listed.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Yaowei Zheng (Author of LLaMA-Factory), and 1 more.

DoRA by NVlabs (1.1%, 913 stars)
PyTorch code for weight-decomposed low-rank adaptation (DoRA)
Created 1 year ago · Updated 1 year ago
Starred by Ying Sheng (Coauthor of SGLang), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 10 more.

adapters by adapter-hub (0.1%, 3k stars)
Unified library for parameter-efficient transfer learning in NLP
Created 5 years ago · Updated 3 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 15 more.

LoRA by microsoft (0.2%, 13k stars)
PyTorch library for low-rank adaptation (LoRA) of LLMs
Created 4 years ago · Updated 1 year ago