Long-Context by abacusai

LLM context expansion via RoPE encoding modifications

created 2 years ago
591 stars

Top 55.8% on sourcepulse

View on GitHub
Project Summary

This repository provides code, tooling, and experimental results for extending the context length of Large Language Models (LLMs), specifically Llama. It targets researchers and practitioners aiming to improve LLM performance on tasks requiring long-range information retrieval and understanding. The primary benefit is enabling LLMs to process and reason over significantly larger input contexts than their original pre-training limits.

How It Works

The project explores several methods for extending LLM context length, all based on modifying Rotary Position Embeddings (RoPE): linear scaling of positions, scaling the RoPE Fourier basis, truncating the Fourier basis, and randomizing position vectors. These techniques are combined with fine-tuning on datasets like RedPajama and instruction-tuning with Vicuna. Linear scaling, particularly when combined with instruction fine-tuning (IFT), emerged as the most robust method, maintaining non-zero retrieval accuracy at context lengths up to 20k.
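
The linear-scaling variant is the simplest to illustrate. The sketch below is a minimal rendering of the idea rather than the repository's actual implementation: positions fed into RoPE are divided by a scale factor so that a longer sequence maps back into the position range seen during pre-training.

```python
import torch

def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies (the 'Fourier basis')."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def rope_angles(seq_len: int, head_dim: int, scale: float = 1.0) -> torch.Tensor:
    """Rotation angles with linear position scaling.

    scale > 1 compresses positions, e.g. scale=4 maps an 8k sequence into the
    0..2k position range a 2k-context model saw during pre-training.
    """
    inv_freq = rope_inverse_frequencies(head_dim)
    positions = torch.arange(seq_len).float() / scale   # linear interpolation of positions
    return torch.outer(positions, inv_freq)              # (seq_len, head_dim // 2)

def apply_rope(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    """Rotate query/key vectors x of shape (seq_len, head_dim) by the given angles."""
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)
```

The other variants listed above would instead rescale or truncate `inv_freq`, or randomize `positions`, while keeping the same rotation machinery.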

Quick Start & Requirements

  • Install: Code is provided for fine-tuning and evaluation; specific commands depend on the chosen experiment (a minimal loading sketch follows this list).
  • Prerequisites: Python, PyTorch, Hugging Face Transformers, and potentially CUDA for GPU acceleration.
  • Resources: Fine-tuning and evaluation on long contexts will require significant GPU memory and compute.
  • Links:
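
The repository's own scripts are not reproduced here. As a rough starting point, the sketch below assumes Hugging Face Transformers 4.31 or later, which exposes linear RoPE scaling on Llama models; the model id and scale factor are placeholders, not values taken from the repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the Llama variant being extended.
model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "linear", "factor": 4.0},  # 4x linear position interpolation
    torch_dtype="auto",
)
```

With a factor of 4.0, a model pre-trained on 2k positions is interpolated to cover roughly 8k tokens; fine-tuning at the longer length, as this repository does, is still needed for good accuracy.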

Highlighted Details

  • Linear scaling with IFT shows robustness for context lengths up to 16k, with potential for 20-24k.
  • Evaluation methodologies significantly impact the ranking of different context extension approaches.
  • Instruction fine-tuning improves retrieval accuracy but does not fundamentally extend the model's inherent context handling limits.
  • Custom datasets (WikiQA FFQA and AltQA) are provided for evaluating long-context retrieval and robustness against memorization (a generic evaluation sketch follows this list).
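
For context, an evaluation of this kind generally amounts to prompting the model with a long document plus a question and checking whether the answer is retrieved, repeated at increasing context lengths. The loop below is an illustrative, generic sketch; the example fields and prompt format are assumptions, not the released datasets' actual schema.

```python
import torch

def retrieval_accuracy(model, tokenizer, examples, max_new_tokens=32):
    """Fraction of examples whose greedy generation contains the gold answer.

    Each example is assumed to look like:
    {"context": <long document>, "question": <str>, "answer": <str>}
    """
    hits = 0
    for ex in examples:
        prompt = f"{ex['context']}\n\nQuestion: {ex['question']}\nAnswer:"
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        completion = tokenizer.decode(
            out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        hits += int(ex["answer"].lower() in completion.lower())
    return hits / len(examples)
```

Running the same loop at several target context lengths (e.g. 2k, 4k, 8k, 16k) yields the accuracy-versus-length comparisons summarized above.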

Maintenance & Community

  • The project is from Abacus.AI.
  • Further details on community engagement or roadmap are not explicitly provided in the README.

Licensing & Compatibility

  • The repository's code is likely under a permissive license (e.g., MIT, Apache 2.0), but specific licensing for shared model weights or datasets should be verified.
  • Compatibility for commercial use depends on the underlying Llama model license and any specific terms for shared weights.

Limitations & Caveats

  • While linear scaling shows promise, it doesn't fully extrapolate to the theoretical maximum context length (e.g., a scale factor of 16 applied to Llama's 2k pre-training context should in principle reach 32k, but does not in practice).
  • Some explored methods, like xPos, showed convergence issues, potentially due to precision limitations or fundamental differences from base RoPE.
  • The effectiveness of different methods can vary significantly based on the evaluation task.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 1 more.

yarn by jquesnelle

1.0% · 2k stars
Context window extension method for LLMs (research paper, models)
created 2 years ago · updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems) and Georgios Konstantopoulos (CTO, General Partner at Paradigm).

LongLoRA by dvlab-research

0.1% · 3k stars
LongLoRA: Efficient fine-tuning for long-context LLMs
created 1 year ago · updated 11 months ago