LLM acceleration paper list, focusing on inference and serving
This repository is a curated list of academic papers on accelerating Large Language Model (LLM) inference and serving. It is aimed at researchers, engineers, and practitioners who want to understand and implement efficient LLM deployment strategies, offering a centralized, categorized collection of relevant research that saves significant time in literature review.
How It Works
The project compiles and categorizes research papers based on their contribution to LLM acceleration. It uses a tabular format to present papers alongside their keywords, contributing institutions, publication venues, and links to associated open-source projects or demos. This structured approach allows for quick identification of relevant techniques and implementations.
Quick Start & Requirements
This is a paper list, not a software project. No installation or execution is required. Access is via web browser.
Maintenance & Community
The project is maintained by galeselee and welcomes contributions. It appears to be a community-driven effort to track the rapidly evolving field of LLM acceleration.
Licensing & Compatibility
The repository itself is likely under a permissive license (e.g., MIT, Apache 2.0), as is common for curated lists, but each linked paper and project carries its own license, which should be checked before reuse.
Limitations & Caveats
As a curated list, the content depends on the maintainer's and contributors' efforts to keep it current. It does not provide code or tools directly, only pointers to them, and given the rapid pace of LLM research, very recent papers may not yet be included.