PyTorch library for rotary embeddings in transformers
This library provides a standalone PyTorch implementation of Rotary Embeddings (RoPE), a relative positional encoding technique that enhances Transformer models. It's designed for researchers and engineers seeking to improve sequence modeling performance with state-of-the-art positional encoding, offering benefits like efficient rotation of information along tensor axes and improved results with minimal overhead.
How It Works
Rotary Embeddings apply rotations to query and key vectors in attention mechanisms, encoding relative positional information directly into the attention scores. This approach is advantageous as it injects positional awareness without adding parameters or complexity to the model architecture, unlike absolute positional embeddings. The library supports standard RoPE, axial RoPE for multi-dimensional inputs (e.g., video), and length-extrapolatable variants like XPos and positional interpolation.
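As a minimal, self-contained sketch of the underlying operation (not the library's own implementation), the following rotates consecutive feature pairs by position-dependent angles, which is the core of RoPE; the function name and pairing convention are illustrative assumptions:

```python
import torch

def rope_rotate(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate consecutive feature pairs of x (seq_len, dim) by position-dependent angles."""
    seq_len, dim = x.shape
    half = dim // 2
    # frequencies theta_i = base^(-i / half), one per feature pair
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    # angle for position m and pair i is m * theta_i
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]              # the two components of each pair
    rotated = torch.stack((x1 * cos - x2 * sin,  # standard 2D rotation per pair
                           x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)                   # re-interleave to (seq_len, dim)
```

Because queries and keys are rotated with the same position-dependent angles, the dot product between a query at position m and a key at position n depends only on the offset m − n, which is how relative position enters the attention scores.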
Quick Start & Requirements
$ pip install rotary-embedding-torch
Instantiate a RotaryEmbedding module and apply its rotate_queries_or_keys or rotate_queries_and_keys methods to the query and key tensors before the attention dot product. See the README for detailed examples and inference caching.
Maintenance & Community
The library is maintained by lucidrains, a prolific contributor to open-source AI research implementations. The README includes citations to key papers and community discussions, indicating active engagement with the research landscape.
Licensing & Compatibility
The library is released under the MIT license, permitting commercial use and integration into closed-source projects.
Limitations & Caveats
The README notes that the positional interpolation feature has received mixed community feedback regarding its effectiveness. The XPos implementation is noted as being suitable only for autoregressive transformers.
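For the XPos variant flagged above, a hedged sketch: the use_xpos keyword and the joint rotation call follow the README's described interface, but exact argument names may differ across releases, and the variant should only be used with causal/autoregressive attention:

```python
import torch
from rotary_embedding_torch import RotaryEmbedding

# enable the length-extrapolatable XPos scaling (autoregressive attention only)
rotary_emb = RotaryEmbedding(dim = 32, use_xpos = True)

q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)

# XPos scales queries and keys asymmetrically, so both are rotated in a single call
q, k = rotary_emb.rotate_queries_and_keys(q, k)
```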