Efficient-LLMs-Survey by AIoT-MLSys-Lab

Survey paper on efficient large language models

created 2 years ago
1,197 stars

Top 33.4% on sourcepulse

Project Summary

This repository provides a comprehensive survey of research on efficient Large Language Models (LLMs), targeting researchers and practitioners. It systematically organizes and reviews techniques for reducing the substantial resource demands of LLMs across model-centric, data-centric, and framework-centric perspectives, aiming to foster understanding and innovation in the field.

How It Works

The survey categorizes efficiency techniques into three main pillars: model-centric (e.g., quantization, pruning, knowledge distillation, parameter-efficient fine-tuning), data-centric (e.g., data selection, prompt engineering), and framework-centric (e.g., system-level optimizations, hardware co-design). This structured approach allows for a holistic view of the LLM efficiency landscape, highlighting interdependencies and diverse strategies for optimization.
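To make the model-centric pillar concrete, here is a minimal, illustrative sketch of round-to-nearest int8 weight quantization. Real methods surveyed in the repository (GPTQ, AWQ, SmoothQuant) are considerably more sophisticated; this only demonstrates the basic quantize/dequantize round trip, with all names and shapes chosen for illustration.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float weight from the int8 codes."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for an LLM layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

The round-trip error is bounded by half the quantization step (`scale / 2`), which is the trade-off quantization methods try to shrink while keeping weights in low-bit storage.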

Quick Start & Requirements

This repository is a survey and does not require installation or execution. It serves as a curated list of research papers, code repositories, and relevant resources.

Highlighted Details

  • The survey is a camera-ready version accepted by Transactions on Machine Learning Research (TMLR) in May 2024.
  • It covers a vast array of techniques, including quantization (e.g., GPTQ, AWQ, SmoothQuant), pruning (e.g., SparseGPT, LLM-Pruner), parameter-efficient fine-tuning (e.g., LoRA, Adapters), and efficient architectures (e.g., Mamba, Longformer).
  • It also details data-centric approaches like prompt engineering and data selection, as well as system-level optimizations and LLM frameworks (e.g., DeepSpeed, vLLM, TensorRT-LLM).
  • Includes links to papers and code for hundreds of relevant research contributions.
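As a small illustration of the parameter-efficient fine-tuning family mentioned above, the sketch below shows the core LoRA idea: keep the pretrained weight matrix frozen and learn only a low-rank update, so the effective weight is W + B @ A. Dimensions, initializations, and names here are assumptions for demonstration, not the reference implementation.

```python
import numpy as np

d_out, d_in, r = 8, 8, 2  # rank r is much smaller than the weight dimensions
rng = np.random.default_rng(1)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: no change at start

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)  # LoRA forward pass: base output plus low-rank update

# Only A and B are trained: r * (d_in + d_out) parameters
# versus d_in * d_out for full fine-tuning.
print("trainable params:", A.size + B.size, "vs full fine-tuning:", W.size)
```

Because B starts at zero, the adapted model initially matches the base model exactly, and training only touches the small A and B matrices, which is what makes the approach parameter-efficient.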

Maintenance & Community

The repository is actively maintained by researchers from The Ohio State University, University of Michigan, and other institutions. Feedback and contributions for new research are welcomed via email or pull requests.

Licensing & Compatibility

The repository itself does not specify a license, but it links to numerous research papers and code repositories, each with its own licensing terms. Users should consult the individual licenses of linked projects before reuse.

Limitations & Caveats

As a survey, this repository reflects the research landscape at the time of its last update and may not include the most recent advancements, since the rapid pace of LLM research means new techniques are constantly emerging.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History

  • 53 stars in the last 90 days

