Efficient-LLMs-Survey by AIoT-MLSys-Lab

Survey paper on efficient large language models

created 2 years ago
1,197 stars

Top 33.4% on sourcepulse

Project Summary

This repository provides a comprehensive survey of research on efficient Large Language Models (LLMs), targeting researchers and practitioners. It systematically organizes and reviews techniques for reducing the substantial resource demands of LLMs across model-centric, data-centric, and framework-centric perspectives, aiming to foster understanding and innovation in the field.

How It Works

The survey categorizes efficiency techniques into three main pillars: model-centric (e.g., quantization, pruning, knowledge distillation, parameter-efficient fine-tuning), data-centric (e.g., data selection, prompt engineering), and framework-centric (e.g., system-level optimizations, hardware co-design). This structured approach allows for a holistic view of the LLM efficiency landscape, highlighting interdependencies and diverse strategies for optimization.
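To make the model-centric pillar concrete, here is a minimal, illustrative sketch of round-to-nearest int8 weight quantization. Real methods surveyed in the repository (GPTQ, AWQ, SmoothQuant) are considerably more sophisticated; this only demonstrates the basic quantize/dequantize round trip, with all names and shapes chosen for illustration.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float weight from the int8 codes."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for an LLM layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

The round-trip error is bounded by half the quantization step (`scale / 2`), which is the trade-off quantization methods try to shrink while keeping weights in low-bit storage.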

Quick Start & Requirements

This repository is a survey and does not require installation or execution. It serves as a curated list of research papers, code repositories, and relevant resources.

Highlighted Details

  • The survey is a camera-ready version accepted by Transactions on Machine Learning Research (TMLR) in May 2024.
  • It covers a vast array of techniques, including quantization (e.g., GPTQ, AWQ, SmoothQuant), pruning (e.g., SparseGPT, LLM-Pruner), parameter-efficient fine-tuning (e.g., LoRA, Adapters), and efficient architectures (e.g., Mamba, Longformer).
  • It also details data-centric approaches like prompt engineering and data selection, as well as system-level optimizations and LLM frameworks (e.g., DeepSpeed, vLLM, TensorRT-LLM).
  • Includes links to papers and code for hundreds of relevant research contributions.
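As a small illustration of the parameter-efficient fine-tuning family mentioned above, the sketch below shows the core LoRA idea: keep the pretrained weight matrix frozen and learn only a low-rank update, so the effective weight is W + B @ A. Dimensions, initializations, and names here are assumptions for demonstration, not the reference implementation.

```python
import numpy as np

d_out, d_in, r = 8, 8, 2  # rank r is much smaller than the weight dimensions
rng = np.random.default_rng(1)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: no change at start

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)  # LoRA forward pass: base output plus low-rank update

# Only A and B are trained: r * (d_in + d_out) parameters
# versus d_in * d_out for full fine-tuning.
print("trainable params:", A.size + B.size, "vs full fine-tuning:", W.size)
```

Because B starts at zero, the adapted model initially matches the base model exactly, and training only touches the small A and B matrices, which is what makes the approach parameter-efficient.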

Maintenance & Community

The repository is actively maintained by researchers from The Ohio State University, University of Michigan, and other institutions. Feedback and contributions for new research are welcomed via email or pull requests.

Licensing & Compatibility

The repository itself does not specify a license, but it links to numerous research papers and code repositories, each with its own licensing terms. Users should consult the individual licenses of linked projects before reuse.

Limitations & Caveats

As a survey, this repository reflects the research landscape at the time of its last update and may not include the most recent advancements, since the rapid pace of LLM research means new techniques are constantly emerging.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History

  • 53 stars in the last 90 days

