Survey of on-device LLMs, architectures, and optimization
This repository is a comprehensive survey and hub for Large Language Models (LLMs) designed for on-device deployment. It offers researchers, developers, and learners a curated collection of state-of-the-art models and frameworks, covering the evolution, architectures, optimization techniques, and practical applications of LLMs that run locally on devices.
How It Works
The hub systematically covers the landscape of on-device LLMs, tracing the shift from cloud-hosted inference, with its latency, cost, and privacy constraints, to the advantages of running models locally. It delves into efficient architectures, model compression techniques (quantization, pruning, knowledge distillation), and hardware acceleration strategies. The project highlights key models and frameworks, links to their papers and code, and organizes the material in a structured survey format.
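To make the compression idea concrete, the sketch below shows symmetric per-tensor int8 post-training quantization, one of the simplest forms of the quantization techniques the survey catalogs. It is illustrative only, not code from any linked project; the function names and the NumPy implementation are assumptions for this example.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization (hypothetical helper)."""
    # Map the largest absolute weight to the int8 limit of 127.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Example: quantize one random layer, then check error and memory savings.
w = np.random.randn(512, 512).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())
print("memory: fp32 =", w.nbytes, "bytes; int8 =", q.nbytes, "bytes")
```

Real deployments typically refine this baseline with per-channel scales, asymmetric zero-points, or quantization-aware training; the frameworks linked in the survey implement such variants.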
Quick Start & Requirements
This repository is a curated list of resources and does not require installation or direct execution. It links to various external projects and papers, each with its own setup requirements.
Maintenance & Community
The project is maintained by NexaAI, which runs a Discord community for discussion and welcomes contributions via pull requests.
Licensing & Compatibility
This project is released under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
As a survey and hub, this repository does not provide a unified inference engine or framework. Users must refer to individual linked projects for specific model compatibility, performance, and deployment requirements.