Awesome_LLM_System-PaperList by galeselee

LLM acceleration paper list, focusing on inference and serving

created 1 year ago
264 stars

Top 97.5% on sourcepulse

Project Summary

This repository serves as a curated list of academic papers focused on accelerating Large Language Model (LLM) inference and serving. It targets researchers, engineers, and practitioners in the LLM space seeking to understand and implement efficient LLM deployment strategies. The primary benefit is a centralized, categorized collection of relevant research, saving significant time in literature review.

How It Works

The project compiles and categorizes research papers based on their contribution to LLM acceleration. It uses a tabular format to present papers alongside their keywords, contributing institutions, publication venues, and links to associated open-source projects or demos. This structured approach allows for quick identification of relevant techniques and implementations.
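For illustration, a row in such a table might look like the sketch below; the column names and placeholder entries are assumptions based on the description above, not copied from the repository:

```markdown
| Paper | Keywords | Institute | Publication | Code/Demo |
| ----- | -------- | --------- | ----------- | --------- |
| <paper title> | <e.g., KV cache, quantization> | <institution> | <venue> | <link to project or demo> |
```

Scanning the keyword column is typically the quickest way to locate papers on a specific technique.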

Quick Start & Requirements

This is a curated paper list, not a software project, so there is nothing to install or run; the list is browsed directly in a web browser on GitHub.

Highlighted Details

  • Comprehensive coverage of inference acceleration techniques including quantization, pruning, speculative decoding, KV cache optimization, and system-level optimizations.
  • Links to numerous associated open-source projects (e.g., DeepSpeed, vLLM, TensorRT-LLM, MLC LLM, TGI) for practical implementation.
  • Categorization by institution and publication venue, aiding in tracking research trends and key contributors.
  • Inclusion of papers on related areas like RLHF training, multi-modal LLMs, and energy efficiency.

Maintenance & Community

The project is maintained by galeselee and welcomes contributions. It appears to be a community-driven effort to track the rapidly evolving field of LLM acceleration.

Licensing & Compatibility

The repository itself is likely under a permissive license (e.g., MIT, Apache 2.0), as is common for curated lists, but the linked papers and projects carry their own respective licenses.

Limitations & Caveats

As a curated list, its usefulness depends on the maintainer and contributors keeping it up to date. It provides pointers to code and tools rather than the code itself, and given the rapid pace of LLM research, very recent papers may not yet be included.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 16 stars in the last 90 days
