LongLoRA by dvlab-research

LongLoRA: Efficient fine-tuning for long-context LLMs

Created 2 years ago
2,686 stars

Top 17.6% on SourcePulse

View on GitHub
Project Summary

This repository provides LongLoRA, an efficient fine-tuning method for extending the context length of Large Language Models (LLMs). It addresses the challenge of processing long documents by enabling models to handle contexts up to 100k tokens, benefiting researchers and developers working with extensive text data.

How It Works

LongLoRA employs shifted sparse attention (S²-Attn; called "shift short attention" in early write-ups), which splits the long sequence into fixed-size groups, computes attention within each group, and shifts the group boundaries by half a group in half of the attention heads so that information still flows between neighboring groups. The pattern is compatible with Flash-Attention during training and is discarded at inference, where the model reverts to standard full attention with no modification needed. Combined with low-rank (LoRA) weight updates plus trainable embedding and normalization layers, this makes fine-tuning to much longer context windows far cheaper in compute and memory than full-attention fine-tuning.
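
The grouping-and-shifting pattern is compact enough to sketch directly. Below is a minimal PyTorch illustration (an illustrative sketch, not the repository's implementation: causal masking, padding, and the Flash-Attention kernel integration are omitted, and all names here are made up):

    import torch
    import torch.nn.functional as F

    def shifted_sparse_attention(q, k, v, group_size):
        # q, k, v: (batch, heads, seq_len, head_dim)
        B, H, N, D = q.shape
        assert N % group_size == 0, "sketch assumes seq_len % group_size == 0"
        shift = group_size // 2

        def shift_half_heads(x, offset):
            # roll the token axis for the second half of the heads only,
            # so those heads see groups offset by half a group size
            x = x.clone()
            x[:, H // 2:] = x[:, H // 2:].roll(offset, dims=2)
            return x

        q, k, v = (shift_half_heads(t, -shift) for t in (q, k, v))

        def to_groups(x):
            # fold non-overlapping token groups into the batch axis:
            # (B, H, N, D) -> (B * N // group_size, H, group_size, D)
            return (x.reshape(B, H, N // group_size, group_size, D)
                     .transpose(1, 2)
                     .reshape(B * N // group_size, H, group_size, D))

        out = F.scaled_dot_product_attention(to_groups(q), to_groups(k), to_groups(v))

        # undo the grouping, then shift the second half of the heads back
        out = (out.reshape(B, N // group_size, H, group_size, D)
                  .transpose(1, 2)
                  .reshape(B, H, N, D))
        return shift_half_heads(out, shift)

    # toy check: 16 tokens in 2 groups of 8, with 4 heads
    q = k = v = torch.randn(1, 4, 16, 32)
    print(shifted_sparse_attention(q, k, v, group_size=8).shape)  # (1, 4, 16, 32)

Because attention is computed per group, cost grows with group size rather than with the full sequence length, while the shifted heads keep neighboring groups connected.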

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt and pip install flash-attn --no-build-isolation.
  • Requires a Hugging Face account and acceptance of Meta's license for the pre-trained LLaMA2 weights.
  • Supports LLaMA2 and GPT-NeoX base models.
  • Fine-tuning and inference scripts are provided (see the inference sketch after this list).
  • Official documentation and examples are available within the repository.
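
For orientation, here is a hedged inference sketch using the standard Hugging Face transformers API; the checkpoint path is a placeholder (not a real model ID), and the repository's own scripts are the authoritative reference:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "path/to/longlora-checkpoint"  # placeholder; substitute a released LongLoRA checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,
        device_map="auto",  # requires the accelerate package
    )

    prompt = "Summarize the following document:\n..."  # paste the long document here
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))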

Highlighted Details

  • Accepted as an oral presentation at ICLR 2024.
  • Released models with context lengths up to 100k tokens (e.g., LLaMA2-LongLoRA-7B-100k).
  • Introduced LongAlpaca-12k, a long-context instruction-following dataset.
  • Supports QLoRA integration for further memory reduction during fine-tuning (see the sketch below).
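
To illustrate how a QLoRA-style setup plugs in, the sketch below uses the standard peft and bitsandbytes APIs. It mirrors the general technique rather than LongLoRA's exact training configuration, and the modules_to_save entries are an assumption based on LLaMA2's layer names:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                  # keep base weights in 4-bit NF4
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",         # gated; requires accepted Meta license
        quantization_config=bnb_config,
        device_map="auto",
    )

    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        # LongLoRA additionally trains embedding and norm layers;
        # these module names assume LLaMA2 naming conventions
        modules_to_save=["embed_tokens", "norm"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only a small fraction of weights is trainable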

Maintenance & Community

  • Development and model releases were frequent during the project's initial period; activity has since tapered off (see Health Check below).
  • Paper and GitHub repository are the primary sources of information.

Licensing & Compatibility

  • LongLoRA code is licensed under Apache License 2.0.
  • Data and model weights are under the CC BY-NC 4.0 license, which restricts usage to non-commercial research purposes.

Limitations & Caveats

The data and model weights are licensed strictly for research use, and commercial applications are prohibited. Models trained on the LongAlpaca-12k dataset inherit the same research-only restriction.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 10 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Luis Capelo (cofounder of Lightning AI).

LongLM by datamllab

  • Self-Extend: LLM context window extension via self-attention
  • 661 stars
  • Created 1 year ago; updated 1 year ago