Awesome-Efficient-Arch  by weigao266

Survey of efficient LLM architectures

Created 2 months ago
328 stars

Project Summary

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

This repository provides a comprehensive survey of efficient architectures for Large Language Models (LLMs), addressing the growing need for faster and more resource-friendly models. It targets researchers and practitioners in NLP and AI who are looking to understand and leverage advancements in LLM efficiency. The primary benefit is a curated and categorized collection of papers, enabling quick identification of relevant techniques and trends in efficient LLM design.

How It Works

The survey categorizes efficient LLM architectures into several key areas: Linear Sequence Modeling, Sparse Sequence Modeling, Efficient Full Attention, Sparse Mixture-of-Experts, Hybrid Architectures, and Diffusion Large Language Models. It also explores applications across Vision, Audio, and Multimodality. This structured approach allows for a systematic review of different efficiency strategies, from attention mechanisms and recurrence to novel architectural paradigms.

Quick Start & Requirements

This is a survey paper and repository, not a software package. No installation or execution is required.

Highlighted Details

  • Includes 449 papers, covering a wide spectrum of efficient LLM research.
  • Categorizes techniques such as Linear Attention, State Space Models (SSM), Sparse Attention, Mixture-of-Experts (MoE), and Hybrid Architectures.
  • Details applications of these efficient architectures in Vision, Audio, and Multimodal tasks.
  • Features hardware-efficient implementations and optimizations for various approaches.
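To make the "Linear Attention" category concrete: the common idea across that family of papers is replacing the softmax with a kernel feature map so attention can be computed in O(n) rather than O(n²) time in sequence length. The sketch below is a minimal, illustrative NumPy version of that idea (using a simple ReLU-based feature map `phi`), not the implementation of any specific surveyed paper.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized (linear) attention sketch.

    Associativity lets us compute phi(Q) @ (phi(K).T @ V):
    the (d, d_v) summary phi(K).T @ V is independent of sequence
    length n, so the overall cost is O(n) instead of O(n^2).
    """
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # (d, d_v) summary of keys and values
    z = Qp @ Kp.sum(axis=0)       # per-query normalizer, shape (n,)
    return (Qp @ kv) / z[:, None]

# Toy usage with random queries, keys, and values.
n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

The surveyed papers differ mainly in the choice of feature map, normalization, and recurrent/chunked formulations that make this hardware-efficient; see the individual entries for details.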

Maintenance & Community

The repository is actively maintained, with a call for contributions to include new and relevant work. The last commit was recent, indicating ongoing development.

Licensing & Compatibility

The repository, being a collection of links and paper references, does not declare a software license. The license of the survey paper itself is not specified in the README.

Limitations & Caveats

As a survey, this repository does not provide code implementations for the discussed architectures; readers must refer to the individual papers for implementation details and usage. Given the rapid pace of LLM research, the survey may also lag the very latest efficient-architecture work at any given moment.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 218 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (founder of Axolotl AI) and Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems").

airllm by lyogavin

  • Inference optimization for LLMs on low-resource hardware
  • 6k stars
  • Created 2 years ago; updated 2 weeks ago