x-transformers by lucidrains

Transformer library with extensive experimental features

5,479 stars · created 4 years ago · Top 9.3% on sourcepulse

Project Summary

This repository provides a highly modular and configurable implementation of the Transformer architecture, catering to researchers and practitioners seeking to experiment with state-of-the-art variations. It offers a comprehensive suite of attention mechanisms, normalization techniques, and architectural modifications, enabling the construction of diverse Transformer models for NLP and vision tasks.

How It Works

The library implements Transformer models using a flexible wrapper-based design. Users can instantiate core components like Encoder, Decoder, and XTransformer (encoder-decoder), then customize them with numerous parameters that enable features such as Flash Attention, Rotary Positional Embeddings, ALiBi, various normalization schemes (RMSNorm, ScaleNorm, LayerNorm variants), GLU feedforwards, and more. This modularity allows for fine-grained control over architectural choices, facilitating rapid prototyping and empirical study of Transformer variants.
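
A minimal sketch of this wrapper pattern, following the usage examples in the project README (flag names such as attn_flash, rotary_pos_emb, ff_glu, and use_rmsnorm reflect recent releases and may shift between versions):

    import torch
    from x_transformers import TransformerWrapper, Decoder

    # a GPT-style decoder assembled from feature flags; each keyword toggles
    # one of the experimental components described above
    model = TransformerWrapper(
        num_tokens = 20000,          # vocabulary size
        max_seq_len = 1024,
        attn_layers = Decoder(
            dim = 512,
            depth = 6,
            heads = 8,
            attn_flash = True,       # Flash Attention kernels
            rotary_pos_emb = True,   # Rotary Positional Embeddings
            ff_glu = True,           # GLU feedforward
            use_rmsnorm = True       # RMSNorm in place of LayerNorm
        )
    )

    tokens = torch.randint(0, 20000, (1, 1024))
    logits = model(tokens)           # shape: (1, 1024, 20000)

Swapping Decoder for Encoder (or combining both via XTransformer) changes the architecture class without touching the rest of the code, which is what makes side-by-side ablations cheap.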

Quick Start & Requirements

  • Install via pip: pip install x-transformers
  • Requires PyTorch.
  • GPU with CUDA is recommended for performance.
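
To verify an install, a small CPU-only forward pass is enough. A sketch, with return_embeddings following the README's documented usage:

    import torch
    from x_transformers import TransformerWrapper, Encoder

    # tiny BERT-style encoder; no GPU required at this scale
    enc = TransformerWrapper(
        num_tokens = 256,
        max_seq_len = 128,
        attn_layers = Encoder(dim = 128, depth = 2, heads = 4)
    )

    tokens = torch.randint(0, 256, (2, 128))
    embeddings = enc(tokens, return_embeddings = True)
    print(embeddings.shape)  # torch.Size([2, 128, 128])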

Highlighted Details

  • Extensive support for experimental Transformer features from recent research papers.
  • Includes implementations for vision tasks (e.g., SimpleViT, PaLI).
  • Offers specialized wrappers for Transformer-XL recurrence and continuous embeddings (see the sketch after this list).
  • Integrates Flash Attention for significant speed and memory improvements.
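
As an example of the continuous-embedding wrapper noted above, the model can consume raw feature vectors instead of token ids. A sketch based on the README's ContinuousTransformerWrapper usage (parameter names may differ across versions):

    import torch
    from x_transformers import ContinuousTransformerWrapper, Encoder

    # maps sequences of 32-dim feature vectors to 100-dim outputs,
    # bypassing the token-embedding layer entirely
    model = ContinuousTransformerWrapper(
        dim_in = 32,
        dim_out = 100,
        max_seq_len = 1024,
        attn_layers = Encoder(dim = 512, depth = 6, heads = 8)
    )

    x = torch.randn(1, 1024, 32)    # e.g. audio frames or other continuous features
    mask = torch.ones(1, 1024).bool()
    out = model(x, mask = mask)     # shape: (1, 1024, 100)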

Maintenance & Community

The project is actively maintained by lucidrains, with contributions from the broader AI research community. Links to relevant papers and discussions are often included within the code and README.

Licensing & Compatibility

The repository is released under the permissive MIT license, allowing broad use in research and commercial applications.

Limitations & Caveats

The sheer number of configurable options creates a steep learning curve, and some experimental features may be less stable or need careful hyperparameter tuning. The README is dense with code examples and research-paper references, so a solid grasp of Transformer architectures is needed to take full advantage of the library.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 4

Star History

226 stars in the last 90 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

Transformer library for flexible model development
Top 0.3% · 1k stars · created 3 years ago · updated 7 months ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Phil Wang (Prolific Research Paper Implementer), and 4 more.

vit-pytorch by lucidrains

PyTorch library for Vision Transformer variants and related techniques
Top 0.2% · 24k stars · created 4 years ago · updated 6 days ago