Position embeddings research paper
This repository introduces Rectified Rotary Position Embeddings (ReRoPE), a method to extend the context length of Large Language Models (LLMs) without requiring fine-tuning. It is targeted at LLM researchers and practitioners seeking to improve model performance on longer sequences. ReRoPE offers a way to achieve lower loss with increased context length, outperforming standard RoPE and NTK-RoPE in benchmarks.
How It Works
ReRoPE modifies the original Rotary Position Embeddings (RoPE) by introducing a "rectification" mechanism. The approach aims to preserve the benefits of RoPE while avoiding the performance degradation normally seen when the context length is extended, retaining the "longer context, lower loss" property. Implementation details are available in the linked blog posts and the accompanying code modifications. A minimal sketch of the rectification idea follows.
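The sketch below illustrates one way the rectification can be understood: relative positions inside a training window are kept unchanged, while positions beyond it are clipped to the window size, so the model never sees relative distances larger than those encountered during training. This is an illustrative assumption based on the description above, not code taken from this repository; the function name and the window parameter are hypothetical.

import numpy as np

def rerope_relative_positions(seq_len: int, window: int) -> np.ndarray:
    """Illustrative sketch of "rectified" relative positions.

    Relative distances up to `window` are left as-is; anything larger is
    clipped to `window`. (Hypothetical helper for explanation only.)
    """
    q_pos = np.arange(seq_len)[:, None]   # query positions
    k_pos = np.arange(seq_len)[None, :]   # key positions
    rel = q_pos - k_pos                   # standard RoPE relative positions
    return np.clip(rel, a_min=None, a_max=window)  # rectify: cap at the window

# Example: with window=4, distances 0..4 are unchanged and 5+ collapse to 4.
print(rerope_relative_positions(8, window=4))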
Quick Start & Requirements
pip install transformers==4.31.0
Run python test.py for chatting, or python eval_loss.py for loss evaluation.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README does not explicitly state the license, which may impact commercial use. While the method is presented as an alternative to fine-tuning, its effectiveness across all LLM architectures and tasks is not detailed.