Literature repository for long-context LLM methodologies
This repository serves as a curated collection of literature and resources for researchers and practitioners focused on enhancing Large Language Models (LLMs) with long-context capabilities. It aims to provide a comprehensive overview of methodologies, benchmarks, and recent advancements in handling extended context windows within Transformer architectures.
How It Works
The project organizes research papers and implementations into categories such as Efficient Attention (Flash-ReRoPE, Linear Attention), Long-Term Memory, Extrapolative Positional Embeddings, and Context Processing techniques. It highlights novel approaches such as Flash-ReRoPE, which combines ReRoPE's unbounded length extrapolation with the efficiency of FlashAttention, and provides implementations and evaluation scripts for these methods.
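The repository's own implementations are the authoritative reference; purely to illustrate the rectification idea behind ReRoPE, the sketch below (a hypothetical helper, not code from the repository) clamps relative positions beyond a chosen window so that arbitrarily long sequences never produce offsets the model was not trained on.

```python
import torch

def rerope_relative_positions(seq_len: int, window: int) -> torch.Tensor:
    """Illustrative ReRoPE-style rectification of relative positions.

    Distances up to `window` are kept exactly; anything farther back is
    clamped to `window`, so even very long sequences only ever use
    relative offsets seen during training.
    """
    pos = torch.arange(seq_len)
    rel = pos[:, None] - pos[None, :]   # (i - j) for every query/key pair
    rel = rel.clamp(min=0)              # causal attention: future keys are masked elsewhere
    return rel.clamp(max=window)

# Example: with window=4, offsets 0..4 stay intact and 5, 6, 7 all collapse to 4.
print(rerope_relative_positions(8, 4))
```

In a full implementation these rectified offsets drive the rotary phase applied to queries and keys; Flash-ReRoPE's contribution, as described above, is computing this inside a FlashAttention-style fused kernel rather than materializing the full attention matrix.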
Quick Start & Requirements
Setup varies by method: there is no single installation path, and each implementation lists its own dependencies (e.g., flash-attn for the FlashAttention-based code).
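As a quick environment check (a minimal sketch assuming a CUDA GPU and a recent flash-attn release; the tensor layout and causal flag follow flash-attn's documented flash_attn_func interface, not any script from this repository):

```python
# Sanity check that flash-attn is installed and usable (pip install flash-attn).
import torch
from flash_attn import flash_attn_func

# flash-attn expects (batch, seqlen, num_heads, head_dim) tensors in fp16/bf16 on a GPU.
q = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([1, 1024, 8, 64])
```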
Maintenance & Community
The repository is actively updated with new research, as indicated by the "Latest News" section. Contributions are welcomed via pull requests or email. The project is associated with a broader "llms-learning" repository for full-stack LLM technologies.
Licensing & Compatibility
The repository itself appears to be under a permissive license, but individual implementations within it may carry their own licenses. Users should verify the licensing terms for any specific code or models they intend to use, especially for commercial applications.
Limitations & Caveats
This repository is a curated collection of research and implementations, not a single, unified framework. Users will need to integrate and adapt individual components, and the setup complexity can vary significantly depending on the specific methodology being explored. Some implementations may be experimental or require specific hardware configurations.