PyTorch implementation of Infini-attention for efficient, infinite context Transformers
This repository provides an unofficial PyTorch implementation of Infini-attention, a technique for enabling Transformers to process extremely long contexts efficiently. It targets researchers and engineers working with large language models like Gemma and Llama, offering a way to significantly extend context windows beyond standard limitations.
How It Works
InfiniTransformer implements two versions of Infini-attention. Type I modifies model and trainer configurations for maximum memory efficiency, enabling training with context lengths up to 1 million tokens on high-end hardware. Type II integrates Infini-attention solely within the attention layer, maintaining compatibility with the Hugging Face Trainer and standard configurations while offering moderate memory savings.
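Concretely, each Infini-attention layer keeps a compressive memory alongside ordinary causal attention: the memory is read with a linear-attention lookup, blended with the local attention output through a learned gate, and then updated from the current segment's keys and values. The single-head PyTorch sketch below follows the Infini-attention paper rather than this repository's code; the names (InfiniAttentionHead, segment_len) are illustrative only.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class InfiniAttentionHead(nn.Module):
    # Single-head sketch of Infini-attention: local causal attention per segment plus a
    # compressive memory (matrix mem and normalizer z) carried across segments.
    def __init__(self, dim: int, head_dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, head_dim, bias=False)
        self.k_proj = nn.Linear(dim, head_dim, bias=False)
        self.v_proj = nn.Linear(dim, head_dim, bias=False)
        self.beta = nn.Parameter(torch.zeros(1))  # learned gate between memory and local attention
        self.head_dim = head_dim

    def forward(self, x: torch.Tensor, segment_len: int) -> torch.Tensor:
        b, _, _ = x.shape
        mem = x.new_zeros(b, self.head_dim, self.head_dim)   # compressive memory M
        z = x.new_full((b, self.head_dim, 1), 1e-6)          # normalization term
        outputs = []
        for seg in x.split(segment_len, dim=1):
            q, k, v = self.q_proj(seg), self.k_proj(seg), self.v_proj(seg)
            sq, sk = F.elu(q) + 1, F.elu(k) + 1              # positive feature map sigma(.)
            # 1) Read from the compressive memory (linear attention over all past segments).
            a_mem = (sq @ mem) / (sq @ z)
            # 2) Ordinary causal softmax attention within the current segment.
            s = seg.size(1)
            scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
            mask = torch.triu(torch.ones(s, s, dtype=torch.bool, device=x.device), diagonal=1)
            a_dot = scores.masked_fill(mask, float("-inf")).softmax(dim=-1) @ v
            # 3) Blend the two read-outs with the learned gate.
            g = torch.sigmoid(self.beta)
            outputs.append(g * a_mem + (1 - g) * a_dot)
            # 4) Update the memory with the delta rule, then accumulate the normalizer.
            delta = v - (sk @ mem) / (sk @ z)
            mem = mem + sk.transpose(-2, -1) @ delta
            z = z + sk.sum(dim=1, keepdim=True).transpose(-2, -1)
        return torch.cat(outputs, dim=1)

For example, InfiniAttentionHead(512, 64)(torch.randn(2, 4096, 512), segment_len=1024) processes a 4,096-token input as four 1,024-token segments, so attention cost stays bounded by the segment length while the compressive memory summarizes everything seen so far.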
Quick Start & Requirements
Install the dependencies with pip install -r requirements.txt, then install the pinned Transformers revision with pip install -e git+https://github.com/huggingface/transformers.git@b109257f4f#egg=transformers.
Launch training with one of the provided shell scripts (e.g., ./train.llama.infini.noclm.1Mseq.sh), and run python test_basic.infini.py for a basic sanity check.
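A minimal smoke test in the spirit of test_basic.infini.py might look like the sketch below; it assumes the patched model classes keep the standard Hugging Face interface, and the model name and input length are placeholders rather than values taken from this repository.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical example: load a (patched) Gemma checkpoint and run one forward pass
# over an input longer than the model's usual context window.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", torch_dtype=torch.bfloat16)
model.eval()

prompt = "Long-context smoke test. " * 2000  # well beyond Gemma's usual 8K-token window
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
print(out.logits.shape)  # (1, sequence_length, vocab_size)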
Highlighted Details
Maintenance & Community
The repository was last updated about a year ago and is currently inactive.
Licensing & Compatibility
Limitations & Caveats
Type I is not compatible with the basic Hugging Face Trainer and requires custom training code. The project is an unofficial implementation, and specific compatibility or stability guarantees are not provided.
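As a rough illustration of what that custom code involves, a Type I-style training step can be sketched as a loop over segments with a backward pass per segment; the function below is an assumption-laden outline (segment_len, loss scaling, and how the memory is carried are illustrative, not this repository's trainer).

import torch

def train_step(model, optimizer, input_ids, segment_len=2048):
    # Illustrative segment-wise step: process one segment at a time so activations for
    # the full long sequence never live in memory at once. The compressive memory that
    # links segments is assumed to be carried (and detached) inside the model.
    optimizer.zero_grad()
    segments = input_ids.split(segment_len, dim=1)
    total_loss = 0.0
    for seg in segments:
        out = model(input_ids=seg, labels=seg)   # standard causal-LM loss per segment
        loss = out.loss / len(segments)
        loss.backward()                          # frees this segment's activation graph
        total_loss += loss.item()
    optimizer.step()
    return total_loss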