rotary-embedding-torch by lucidrains

PyTorch library for rotary embeddings in transformers

created 4 years ago
717 stars

Top 48.9% on sourcepulse

Project Summary

This library provides a standalone PyTorch implementation of Rotary Embeddings (RoPE), a relative positional encoding technique that enhances Transformer models. It is aimed at researchers and engineers who want state-of-the-art positional encoding for sequence modeling: RoPE rotates information along chosen tensor axes and tends to improve results with minimal overhead.

How It Works

Rotary Embeddings apply rotations to query and key vectors in attention mechanisms, encoding relative positional information directly into the attention scores. This approach is advantageous as it injects positional awareness without adding parameters or complexity to the model architecture, unlike absolute positional embeddings. The library supports standard RoPE, axial RoPE for multi-dimensional inputs (e.g., video), and length-extrapolatable variants like XPos and positional interpolation.
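
To make the mechanism concrete, here is a from-scratch sketch (not this library's internals) of the interleaved-pair rotation described in the RoFormer paper: each pair of channels is rotated by an angle proportional to the token's position, so the dot product between a rotated query and key depends only on their relative offset. The function name rope_rotate and the tensor shapes are illustrative.

    import torch

    def rope_rotate(x, base=10000.0):
        # x: (..., seq_len, dim) with even dim
        seq_len, dim = x.shape[-2], x.shape[-1]
        # one frequency per channel pair, geometrically spaced
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        angles = torch.arange(seq_len).float()[:, None] * inv_freq  # (seq_len, dim/2)
        cos, sin = angles.cos(), angles.sin()
        x1, x2 = x[..., 0::2], x[..., 1::2]
        # standard 2D rotation applied to every (x1, x2) channel pair
        out = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
        return out.flatten(-2)

    q = rope_rotate(torch.randn(8, 128, 64))  # (heads, seq_len, head_dim)
    k = rope_rotate(torch.randn(8, 128, 64))
    # q @ k.transpose(-2, -1) now carries relative position information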

Quick Start & Requirements

  • Install: $ pip install rotary-embedding-torch
  • Requirements: PyTorch. No specific CUDA or Python version is mandated, but GPU acceleration is typical for Transformer workloads.
  • Usage: Import RotaryEmbedding and apply its rotate_queries_or_keys or rotate_queries_and_keys methods to query and key tensors before the attention dot product, as in the sketch below. See the README for detailed examples and inference caching.
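
A minimal usage sketch based on the class and method names above; the (batch, heads, seq_len, head_dim) shapes follow the common convention, and the README remains the authoritative example:

    import torch
    from rotary_embedding_torch import RotaryEmbedding

    rotary_emb = RotaryEmbedding(dim = 32)

    # queries and keys after heads have been split out
    q = torch.randn(1, 8, 1024, 64)  # (batch, heads, seq_len, head_dim)
    k = torch.randn(1, 8, 1024, 64)

    # rotate before the attention dot product
    q = rotary_emb.rotate_queries_or_keys(q)
    k = rotary_emb.rotate_queries_or_keys(k)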

Highlighted Details

  • Implements the original RoFormer scheme, XPos (length extrapolation, sketched after this list), and positional interpolation for context extension.
  • Supports axial rotary embeddings for multi-dimensional inputs like images and video.
  • Includes a utility for handling positional offsets during key-value-cached inference.
  • Offers flexibility in choosing embedding dimensions and frequency configurations.
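
For the XPos variant, queries and keys are scaled asymmetrically and must be rotated together, which is why the paired rotate_queries_and_keys method exists. A hedged sketch; the use_xpos keyword is an assumption drawn from the README's XPos section, so verify it against the repo:

    import torch
    from rotary_embedding_torch import RotaryEmbedding

    # use_xpos is assumed from the README's XPos example
    rotary_emb = RotaryEmbedding(dim = 32, use_xpos = True)

    q = torch.randn(1, 8, 1024, 64)
    k = torch.randn(1, 8, 1024, 64)

    # XPos rotates queries and keys jointly (autoregressive attention only)
    q, k = rotary_emb.rotate_queries_and_keys(q, k)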

Maintenance & Community

The library is maintained by lucidrains, a prolific contributor to open-source AI research implementations. The README includes citations to key papers and community discussions, indicating active engagement with the research landscape.

Licensing & Compatibility

The library is released under the MIT license, permitting commercial use and integration into closed-source projects.

Limitations & Caveats

The README notes that the positional interpolation feature has received mixed community feedback regarding its effectiveness, and that the XPos implementation is suitable only for autoregressive transformers.

Health Check

  • Last commit: 6 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 50 stars in the last 90 days

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 7 more.

Explore Similar Projects

ctransformers by marella

Top 0.1% on sourcepulse
2k stars
Python bindings for fast Transformer model inference
created 2 years ago
updated 1 year ago
Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley) and Yang Song (Professor at Caltech; Research Scientist at OpenAI).

vector-quantize-pytorch by lucidrains

Top 0.4% on sourcepulse
3k stars
PyTorch library for vector quantization techniques
created 5 years ago
updated 1 week ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 6 more.

x-transformers by lucidrains

Top 0.2% on sourcepulse
5k stars
Transformer library with extensive experimental features
created 4 years ago
updated 3 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Phil Wang (Prolific Research Paper Implementer), and 4 more.

vit-pytorch by lucidrains

Top 0.2% on sourcepulse
24k stars
PyTorch library for Vision Transformer variants and related techniques
created 4 years ago
updated 6 days ago