Curated list for efficient LLMs
This repository is a curated list of papers and projects focused on making Large Language Models (LLMs) more efficient. It serves researchers and practitioners looking to reduce computational costs, memory footprint, and latency in LLM deployment and training. The list covers a wide range of techniques, including pruning, quantization, knowledge distillation, and architectural modifications.
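To give a flavor of one of these techniques, below is a minimal sketch of post-training symmetric int8 weight quantization, assuming PyTorch is available; the function names and tensor shapes are illustrative and not taken from any project in the list.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor int8 quantization: w is approximated as scale * q."""
    scale = w.abs().max() / 127.0                      # map the largest magnitude to 127
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float tensor from the int8 codes."""
    return q.float() * scale

# Hypothetical weight matrix, standing in for a real LLM layer.
w = torch.randn(1024, 1024)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(f"max rounding error: {(w - w_hat).abs().max():.4f}")
```

Real quantization schemes covered in the list (per-channel scales, group-wise quantization, activation-aware methods) are considerably more involved; this only shows the core idea of trading precision for a 4x reduction in weight storage.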
How It Works
The repository organizes research papers into distinct categories such as Network Pruning/Sparsity, Knowledge Distillation, Quantization, Inference Acceleration, Efficient MoE, Efficient Architecture, KV Cache Compression, Text Compression, Low-Rank Decomposition, Hardware/System/Serving, Efficient Fine-tuning, and Efficient Training. Each entry typically includes a title, authors, a brief introduction, and links to the paper or code. The list is updated regularly, with recent papers highlighted on the main page.
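As a concrete example of the kind of technique catalogued under Network Pruning/Sparsity, here is a minimal sketch of unstructured magnitude pruning, again assuming PyTorch; it is not drawn from any specific paper in the list.

```python
import torch

def magnitude_prune(w: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude weights (unstructured pruning)."""
    k = int(w.numel() * sparsity)                      # number of weights to remove
    if k == 0:
        return w.clone()
    # k-th smallest absolute value serves as the pruning threshold;
    # ties at the threshold may zero slightly more than k weights.
    threshold = w.abs().flatten().kthvalue(k).values
    return torch.where(w.abs() > threshold, w, torch.zeros_like(w))

w = torch.randn(1024, 1024)
w_sparse = magnitude_prune(w, sparsity=0.5)
print(f"achieved sparsity: {(w_sparse == 0).float().mean():.2f}")
```

The papers in this category go well beyond this baseline, covering structured sparsity patterns, one-shot pruning of pretrained LLMs, and sparsity-aware kernels that turn zeroed weights into actual speedups.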
Quick Start & Requirements
This is a curated list of research papers and projects, not a runnable software package. No installation or specific requirements are needed to browse the content.
Maintenance & Community
The project is community-driven; contributions are welcomed via pull requests or email, and the README reflects active, ongoing maintenance.
Licensing & Compatibility
The repository itself is a collection of links and information, not software with a specific license. Individual papers and projects linked within the repository will have their own licenses.
Limitations & Caveats
As a curated list, it does not provide direct implementations or benchmarks. Users must refer to the linked papers and projects for practical application and performance details.