long-llms-learning by Strivin0311

Literature repository for long-context LLM methodologies

created 1 year ago
265 stars

Top 97.2% on sourcepulse

Project Summary

This repository serves as a curated collection of literature and resources for researchers and practitioners focused on enhancing Large Language Models (LLMs) with long-context capabilities. It aims to provide a comprehensive overview of methodologies, benchmarks, and recent advancements in handling extended context windows within Transformer architectures.

How It Works

The project organizes research papers and implementations into categories such as Efficient Attention (Flash-ReRoPE, Linear Attention), Long-Term Memory, Extrapolative Positional Embeddings, and Context Processing techniques. It highlights novel approaches like Flash-ReRoPE, which combines the length extrapolation of ReRoPE with the efficiency of FlashAttention, and provides implementations and evaluation scripts for these methods.
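The core idea behind ReRoPE-style extrapolation is to cap the relative position between query and key tokens at the training window, so that rotations beyond that window saturate instead of extrapolating. The following is a minimal illustrative sketch of that position-capping step only (not the repository's implementation; the function and parameter names are invented):

```python
import numpy as np

def rerope_relative_positions(seq_len: int, window: int) -> np.ndarray:
    # Standard RoPE rotates by the raw relative position (i - j) between
    # query i and key j; the ReRoPE idea caps it at a training window,
    # so all offsets beyond the window share one rotation angle.
    q_pos = np.arange(seq_len)[:, None]
    k_pos = np.arange(seq_len)[None, :]
    rel = q_pos - k_pos
    # Cap long-range offsets at `window` (negative offsets, i.e. future
    # keys, are left alone since a causal mask removes them anyway).
    return np.minimum(rel, window)
```

For example, with `seq_len=5` and `window=2`, the offset between token 4 and token 0 is capped from 4 to 2, keeping far-away tokens inside the rotation range the model saw during training.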

Quick Start & Requirements

  • Installation: Clone the repository and, where needed, install the dependencies of individual implementations (e.g., flash-attn).
  • Prerequisites: Python, PyTorch, and potentially CUDA-enabled GPUs for optimized implementations like FlashAttention. Specific papers may have unique requirements detailed within their respective directories.
  • Resources: Running advanced implementations or benchmarks may require significant computational resources, including high-end GPUs and substantial memory.
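Because requirements vary per implementation, it can help to verify the common prerequisites before running the GPU-optimized code. A small illustrative check (the `check_prereqs` helper is not part of the repository):

```python
import importlib.util

def check_prereqs() -> dict:
    # Report whether the typical prerequisites are available:
    # PyTorch, the optional flash-attn package, and a CUDA GPU.
    status = {
        "torch": importlib.util.find_spec("torch") is not None,
        "flash_attn": importlib.util.find_spec("flash_attn") is not None,
    }
    if status["torch"]:
        import torch
        status["cuda"] = torch.cuda.is_available()
    else:
        status["cuda"] = False
    return status
```

Implementations such as FlashAttention require `cuda` to be True; the pure-literature parts of the repository need none of these.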

Highlighted Details

  • Features implementations of cutting-edge techniques like Flash-ReRoPE and Lightning Attention-2.
  • Tracks recent advancements and includes links to papers on FlashAttention-3, Inference 1.0, and data engineering for large contexts.
  • Curates important benchmarks such as InfiniteBench and LongBench for evaluating long-context LLMs.
  • Provides a structured overview of methodologies, including various attention mechanisms, memory strategies, and context processing techniques.

Maintenance & Community

The repository is actively updated with new research, as indicated by the "Latest News" section. Contributions are welcomed via pull requests or email. The project is associated with a broader "llms-learning" repository for full-stack LLM technologies.

Licensing & Compatibility

The repository itself appears to be under a permissive license, but individual implementations within it may carry their own licenses. Users should verify the licensing terms for any specific code or models they intend to use, especially for commercial applications.

Limitations & Caveats

This repository is a curated collection of research and implementations, not a single, unified framework. Users will need to integrate and adapt individual components, and the setup complexity can vary significantly depending on the specific methodology being explored. Some implementations may be experimental or require specific hardware configurations.

Health Check
Last commit

1 year ago

Responsiveness

1+ week

Pull Requests (30d)
0
Issues (30d)
0
Star History
6 stars in the last 90 days

Explore Similar Projects

Starred by Omar Sanseviero (DevRel at Google DeepMind) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

cookbook by EleutherAI

0.1%
809
Deep learning resource for practical model work
created 1 year ago
updated 4 days ago