long-llms-learning by Strivin0311

Literature repository for long-context LLM methodologies

Created 1 year ago
266 stars

Top 96.2% on SourcePulse

View on GitHub
Project Summary

This repository serves as a curated collection of literature and resources for researchers and practitioners focused on enhancing Large Language Models (LLMs) with long-context capabilities. It aims to provide a comprehensive overview of methodologies, benchmarks, and recent advancements in handling extended context windows within Transformer architectures.

How It Works

The project organizes research papers and implementations into categories such as Efficient Attention (e.g., Flash-ReRoPE, Linear Attention), Long-Term Memory, Extrapolative Positional Embeddings, and Context Processing techniques. It highlights novel approaches such as Flash-ReRoPE, which combines ReRoPE's unbounded length extrapolation with the efficiency of FlashAttention, and provides implementations and evaluation scripts for several of these methods.
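To make the Flash-ReRoPE entry concrete: ReRoPE caps the relative distance seen by rotary embeddings at a fixed window, which is what gives it unbounded extrapolation, and Flash-ReRoPE fuses that trick into FlashAttention kernels. The sketch below is a minimal, unfused illustration of the position-clipping idea only; the function and parameter names (rope_rotate, rerope_scores, window) are illustrative and not taken from the repository's code.

```python
import math
import torch

def rope_rotate(x, positions, base=10000.0):
    """Apply rotary position embedding to x (seq, dim) at the given absolute positions."""
    dim = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    ang = positions.float()[:, None] * inv_freq[None, :]  # (seq, dim/2)
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rerope_scores(q, k, window=512):
    """Attention logits with ReRoPE: relative positions beyond `window` are clipped to `window`."""
    seq, dim = q.shape
    pos = torch.arange(seq)
    # Branch 1: ordinary RoPE, i.e. the true relative position i - j.
    s_rope = rope_rotate(q, pos) @ rope_rotate(k, pos).T
    # Branch 2: rotate every query to offset `window` and every key to 0,
    # so each query/key pair sees a constant relative position of exactly `window`.
    s_clip = rope_rotate(q, torch.full((seq,), window)) @ rope_rotate(k, torch.zeros(seq)).T
    # Inside the window keep the true positions; outside it, use the clipped ones.
    rel = pos[:, None] - pos[None, :]
    return torch.where(rel <= window, s_rope, s_clip) / math.sqrt(dim)
```

In the actual Flash-ReRoPE implementations this two-branch trick is pushed inside the attention kernel, so the full score matrix is never materialized.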

Quick Start & Requirements

  • Installation: Clone the repository; individual implementations may need extra dependencies installed separately (e.g., flash-attn).
  • Prerequisites: Python, PyTorch, and, for optimized implementations like FlashAttention, CUDA-capable GPUs (a quick availability check is sketched after this list). Specific papers may have unique requirements detailed within their respective directories.
  • Resources: Running advanced implementations or benchmarks may require significant computational resources, including high-end GPUs and substantial memory.
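As a quick sanity check before running the GPU-optimized code paths, something like the following confirms whether the optional FlashAttention dependency is importable (the package's import name is flash_attn; this snippet is a generic check, not a script shipped with the repository):

```python
# Hedged sketch: check whether the optional flash-attn dependency is available.
try:
    import flash_attn
    print(f"flash-attn {flash_attn.__version__} detected")
except ImportError:
    print("flash-attn not installed; FlashAttention-based implementations will not run")
```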

Highlighted Details

  • Features implementations of cutting-edge techniques like Flash-ReRoPE and Lightning Attention-2.
  • Tracks recent advancements, including links to papers on FlashAttention-3, MInference 1.0, and data engineering for long contexts.
  • Curates important benchmarks such as InfiniteBench and LongBench for evaluating long-context LLMs.
  • Provides a structured overview of methodologies, including various attention mechanisms, memory strategies, and context processing techniques.

Maintenance & Community

The repository is actively updated with new research, as indicated by the "Latest News" section. Contributions are welcomed via pull requests or email. The project is associated with a broader "llms-learning" repository for full-stack LLM technologies.

Licensing & Compatibility

The repository itself appears to be under a permissive license, but individual implementations within it may carry their own licenses. Users should verify the licensing terms for any specific code or models they intend to use, especially for commercial applications.

Limitations & Caveats

This repository is a curated collection of research and implementations, not a single, unified framework. Users will need to integrate and adapt individual components, and the setup complexity can vary significantly depending on the specific methodology being explored. Some implementations may be experimental or require specific hardware configurations.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Luis Capelo (Cofounder of Lightning AI).

LongLM by datamllab

661 stars
Self-Extend: LLM context window extension via self-attention
Created 1 year ago
Updated 1 year ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Pawel Garbacki (Cofounder of Fireworks AI), and 4 more.

LongLoRA by dvlab-research

3k stars
LongLoRA: Efficient fine-tuning for long-context LLMs
Created 2 years ago
Updated 1 year ago