roformer  by ZhuiyiTechnology

MLM pre-trained language model using rotary position embedding (RoPE)

created 4 years ago
997 stars

Top 38.0% on sourcepulse

Project Summary

This repository provides RoFormer, a Transformer language model pre-trained with a masked language modeling (MLM) objective and using Rotary Position Embedding (RoPE). It targets NLP researchers and practitioners who want relative position encoding in a Transformer. RoPE's main selling points are its principled derivation and, according to the project, its compatibility with linear attention mechanisms.

How It Works

RoFormer integrates Rotary Position Embedding (RoPE) into the Transformer's self-attention. RoPE rotates each query and key vector by an angle proportional to the token's absolute position. Because the dot product of two rotated vectors depends only on the difference between their rotation angles, the attention score between positions m and n depends on position only through the offset m - n. An operation defined on absolute positions therefore yields a relative position encoding. The project also describes RoPE as the only relative position embedding method compatible with linear attention.
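
The rotation described above can be illustrated with a small NumPy sketch. This is not the repository's bert4keras code. The function name rope, the pairing of adjacent features, and the frequency base of 10000 are assumptions chosen for illustration. The sketch rotates a query and a key and compares attention scores at two position pairs that share the same offset.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles.

    x: float array of shape (seq_len, dim), with dim even.
    positions: absolute positions, shape (seq_len,).
    base: frequency base (assumed value, for illustration only).
    """
    positions = np.asarray(positions, dtype=float)
    dim = x.shape[-1]
    # One rotation frequency per feature pair: base^(-2i/dim).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    angles = positions[:, None] * inv_freq[None, :]  # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Standard 2-D rotation applied to each (even, odd) feature pair.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 8))
k = rng.standard_normal((1, 8))

def score(m, n):
    """Dot product of q placed at position m and k placed at position n."""
    return np.dot(rope(q, [m])[0], rope(k, [n])[0])

# Both pairs have offset m - n = 2, so the scores should agree
# up to floating-point error.
print(score(3, 1), score(10, 8))
```

Since rotations preserve vector length, the embedding also leaves the norm of each query and key unchanged. Only the angle between them shifts with the offset.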

Quick Start & Requirements

  • Install: pip install bert4keras==0.10.4
  • Prerequisites: TensorFlow. Pre-trained models are available for download.
  • Links: Paper, EleutherAI Blog

Highlighted Details

  • Implements Rotary Position Embedding (RoPE) for relative positional encoding.
  • RoPE is derived from an explicit relative-position requirement on attention scores and can be combined with linear attention.
  • Offers pre-trained models for Chinese language tasks.
  • Pseudo-code and bert4keras implementation provided.
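
One way to read the linear-attention compatibility noted above is sketched below. It is an illustrative NumPy example and not code from the repository. It assumes the formulation in which the rotation is applied to the feature-mapped queries and keys in the numerator only, while the normalizer uses the unrotated features. The function names and the elu(x) + 1 feature map are assumptions. The example is bidirectional (non-causal) and checks that the linear-time form matches a quadratic reference.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles."""
    positions = np.asarray(positions, dtype=float)
    dim = x.shape[-1]
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    angles = positions[:, None] * inv_freq[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x[:, 0::2] * cos - x[:, 1::2] * sin
    out[:, 1::2] = x[:, 0::2] * sin + x[:, 1::2] * cos
    return out

def feature_map(x):
    """elu(x) + 1: a non-negative feature map (assumed choice)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention_rope(q, k, v):
    """Linear attention with RoPE in the numerator, cost O(n * d * d_v)."""
    pos = np.arange(q.shape[0])
    phi_q, phi_k = feature_map(q), feature_map(k)
    rq, rk = rope(phi_q, pos), rope(phi_k, pos)
    kv = rk.T @ v                     # (dim, dim_v), computed once
    num = rq @ kv                     # (seq_len, dim_v)
    # Normalizer uses unrotated, non-negative features so it stays positive.
    den = phi_q @ phi_k.sum(axis=0)   # (seq_len,)
    return num / den[:, None]

def quadratic_reference(q, k, v):
    """Same quantity computed via the full n-by-n weight matrix."""
    pos = np.arange(q.shape[0])
    phi_q, phi_k = feature_map(q), feature_map(k)
    weights = rope(phi_q, pos) @ rope(phi_k, pos).T   # (seq_len, seq_len)
    den = (phi_q @ phi_k.T).sum(axis=1)
    return (weights @ v) / den[:, None]

rng = np.random.default_rng(0)
q = rng.standard_normal((6, 8))
k = rng.standard_normal((6, 8))
v = rng.standard_normal((6, 4))
out = linear_attention_rope(q, k, v)
```

Because the rotation is a per-position linear map on queries and keys, it factors through the key-value summary in the same way as the unrotated features. Note that the rotated numerator weights can be negative, so they are not a probability distribution in this sketch.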

Maintenance & Community

  • The primary author is Jianlin Su.
  • A PyTorch implementation is available in the x-transformers library.

Licensing & Compatibility

  • The repository itself does not explicitly state a license. The associated paper is available on arXiv.

Limitations & Caveats

The pre-trained models target Chinese, and the reference implementation depends on the bert4keras library, so use with other languages or frameworks requires additional integration work. The README does not state a license for the repository code.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 60 stars in the last 90 days
