Survey of efficient LLM architectures
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
This repository provides a comprehensive survey of efficient architectures for Large Language Models (LLMs), addressing the growing need for faster and more resource-friendly models. It targets researchers and practitioners in NLP and AI who are looking to understand and leverage advancements in LLM efficiency. The primary benefit is a curated and categorized collection of papers, enabling quick identification of relevant techniques and trends in efficient LLM design.
How It Works
The survey categorizes efficient LLM architectures into several key areas: Linear Sequence Modeling, Sparse Sequence Modeling, Efficient Full Attention, Sparse Mixture-of-Experts, Hybrid Architectures, and Diffusion Large Language Models. It also explores applications across Vision, Audio, and Multimodality. This structured approach allows for a systematic review of different efficiency strategies, from attention mechanisms and recurrence to novel architectural paradigms.
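To make the categories above concrete, here is a minimal sketch of the recurrence behind linear sequence modeling, one of the surveyed families. Causal linear attention replaces the softmax attention matrix with a kernel feature map, so the output can be computed with running sums in O(n) time rather than O(n²). The feature map phi(x) = elu(x) + 1 is one common choice from the literature; the function name and shapes here are illustrative, not from the survey itself.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Causal linear attention sketch: softmax(QK^T)V is replaced by a
    running-sum recurrence over phi(K)V and phi(K), giving O(n) cost
    in sequence length n. phi(x) = elu(x) + 1 (one common choice)."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Q, K = phi(Q), phi(K)
    n, d = Q.shape
    dv = V.shape[1]
    S = np.zeros((d, dv))     # running sum of outer(k_t, v_t)
    z = np.zeros(d)           # running sum of k_t (normalizer)
    out = np.zeros((n, dv))
    for t in range(n):        # strictly causal: state only sees t' <= t
        S += np.outer(K[t], V[t])
        z += K[t]
        out[t] = (Q[t] @ S) / (Q[t] @ z + 1e-6)
    return out

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 4))
K = rng.standard_normal((8, 4))
V = rng.standard_normal((8, 4))
Y = linear_attention(Q, K, V)
print(Y.shape)  # (8, 4)
```

Because the state (S, z) has fixed size regardless of sequence length, the same recurrence also yields constant-memory autoregressive decoding, which is the main practical appeal of this family.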
Quick Start & Requirements
This is a survey paper and repository, not a software package. No installation or execution is required.
Maintenance & Community
The repository is maintained as a living document, with an open call for contributions to add new and relevant work.
Licensing & Compatibility
The repository itself, being a collection of links and summaries rather than software, does not carry a software license, and the README does not specify a license for the survey paper.
Limitations & Caveats
As a survey, this repository does not provide code implementations for the discussed architectures. Readers must refer to the individual papers for implementation details and potential usage. The rapid pace of LLM research means that new efficient architectures are constantly emerging, and the survey may not be exhaustive of the very latest developments at any given moment.