Awesome-Efficient-Arch  by weigao266

Survey of efficient LLM architectures

Created 2 months ago
328 stars

Project Summary

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

This repository provides a comprehensive survey of efficient architectures for Large Language Models (LLMs), addressing the growing need for faster and more resource-friendly models. It targets researchers and practitioners in NLP and AI who are looking to understand and leverage advancements in LLM efficiency. The primary benefit is a curated and categorized collection of papers, enabling quick identification of relevant techniques and trends in efficient LLM design.

How It Works

The survey categorizes efficient LLM architectures into several key areas: Linear Sequence Modeling, Sparse Sequence Modeling, Efficient Full Attention, Sparse Mixture-of-Experts, Hybrid Architectures, and Diffusion Large Language Models. It also explores applications across Vision, Audio, and Multimodality. This structured approach allows for a systematic review of different efficiency strategies, from attention mechanisms and recurrence to novel architectural paradigms.

Quick Start & Requirements

This is a survey paper and repository, not a software package. No installation or execution is required.

Highlighted Details

  • Includes 449 papers, covering a wide spectrum of efficient LLM research.
  • Categorizes techniques such as Linear Attention, State Space Models (SSM), Sparse Attention, Mixture-of-Experts (MoE), and Hybrid Architectures.
  • Details applications of these efficient architectures in Vision, Audio, and Multimodal tasks.
  • Features hardware-efficient implementations and optimizations for various approaches.
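To make the "Linear Attention" category concrete: the common idea across that family of papers is replacing the softmax with a kernel feature map so attention can be computed in O(n) rather than O(n²) time in sequence length. The sketch below is a minimal, illustrative NumPy version of that idea (using a simple ReLU-based feature map `phi`), not the implementation of any specific surveyed paper.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized (linear) attention sketch.

    Associativity lets us compute phi(Q) @ (phi(K).T @ V):
    the (d, d_v) summary phi(K).T @ V is independent of sequence
    length n, so the overall cost is O(n) instead of O(n^2).
    """
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # (d, d_v) summary of keys and values
    z = Qp @ Kp.sum(axis=0)       # per-query normalizer, shape (n,)
    return (Qp @ kv) / z[:, None]

# Toy usage with random queries, keys, and values.
n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```

The surveyed papers differ mainly in the choice of feature map, normalization, and recurrent/chunked formulations that make this hardware-efficient; see the individual entries for details.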

Maintenance & Community

The repository is actively maintained, with a call for contributions to include new and relevant work. The last commit was recent, indicating ongoing development.

Licensing & Compatibility

The repository, being a collection of links and paper references, does not declare a software license. The license of the survey paper itself is not specified in the README.

Limitations & Caveats

As a survey, this repository does not provide code implementations for the discussed architectures; readers must refer to the individual papers for implementation details and usage. Given the rapid pace of LLM research, the survey may also lag the very latest efficient-architecture work at any given moment.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 218 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (founder of Axolotl AI) and Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems").

airllm by lyogavin

  • Inference optimization for LLMs on low-resource hardware
  • 6k stars
  • Created 2 years ago; updated 2 weeks ago