Awesome-LLM-Long-Context-Modeling by Xnhyacinth

Curated list of papers/blogs on LLM long context modeling

Created 2 years ago
1,718 stars

Top 24.8% on SourcePulse

Project Summary

This repository is a curated collection of papers and blog posts on long context modeling for large language models (LLMs). It aims to be a comprehensive resource for researchers and practitioners interested in techniques that let LLMs process and understand extended sequences of text, covering areas such as efficient attention mechanisms, memory augmentation, and length extrapolation.

How It Works

The repository organizes a vast and rapidly evolving field by categorizing relevant research papers and articles. It covers key sub-topics such as sparse and linear attention, recurrent transformers, state space models, retrieval-augmented generation (RAG), and various methods for compressing context or extending model context windows. The collection is regularly updated with recent publications, providing a dynamic overview of advancements in long context modeling.

Quick Start & Requirements

This is a curated list of research resources, not a software library. No installation or execution is required. The primary requirement is access to academic papers and online articles.

Highlighted Details

  • Extensive categorization of papers across 16 major themes related to long context modeling.
  • Regularly updated "News" section highlighting recently published papers.
  • Includes links to a comprehensive survey paper and its associated repository.
  • Provides a BibTeX citation for the survey paper.

Maintenance & Community

The repository is actively maintained by Xnhyacinth, with contributions welcomed via pull requests. It links to a related GitHub repository for further collaboration.

Licensing & Compatibility

The repository is licensed under the MIT License, allowing for broad use and distribution.

Limitations & Caveats

As a collection of links and summaries, the repository itself does not implement any models or code. The quality and accessibility of the linked papers depend on their original sources.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 57 stars in the last 30 days
