Survey of on-device LLMs, architectures, and optimization
This repository is a comprehensive survey and hub for Large Language Models (LLMs) designed for on-device deployment. It offers researchers, developers, and learners a curated collection of state-of-the-art models and frameworks, covering the evolution, architectures, optimization techniques, and practical applications of LLMs that run locally on devices.
How It Works
The hub systematically covers the landscape of on-device LLMs, tracing the shift from cloud-hosted inference, with its latency, cost, and privacy constraints, to the advantages of running models locally. It delves into efficient architectures, model compression techniques (quantization, pruning, knowledge distillation), and hardware acceleration strategies. The project highlights key models and frameworks, links to their papers and code, and organizes the material in a structured survey format.
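To make the compression idea concrete, the sketch below shows symmetric per-tensor int8 post-training quantization, one of the simplest forms of the quantization techniques the survey catalogs. It is illustrative only, not code from any linked project; the function names and the NumPy implementation are assumptions for this example.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization (hypothetical helper)."""
    # Map the largest absolute weight to the int8 limit of 127.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Example: quantize one random layer, then check error and memory savings.
w = np.random.randn(512, 512).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())
print("memory: fp32 =", w.nbytes, "bytes; int8 =", q.nbytes, "bytes")
```

Real deployments typically refine this baseline with per-channel scales, asymmetric zero-points, or quantization-aware training; the frameworks linked in the survey implement such variants.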
Quick Start & Requirements
This repository is a curated list of resources and does not require installation or direct execution. It links to various external projects and papers, each with its own setup requirements.
Maintenance & Community
The project is maintained by NexaAI, which runs a Discord community for discussion and welcomes contributions via pull requests.
Licensing & Compatibility
This project is released under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
As a survey and hub, this repository does not provide a unified inference engine or framework. Users must refer to individual linked projects for specific model compatibility, performance, and deployment requirements.