mnn-llm by wangzhaode

LLM deploy project based on MNN

created 2 years ago
1,599 stars

Top 26.8% on sourcepulse

Project Summary

This project provides a framework for deploying Large Language Models (LLMs) using the MNN inference engine. It targets developers and researchers looking to run LLMs efficiently on various platforms, including mobile and desktop, with support for CPU and GPU acceleration.

How It Works

The project leverages the MNN inference engine for optimized LLM execution. It includes pre-built examples for command-line interfaces (CLI), web UIs, Android applications, and Python bindings. Model conversion and export tools are also provided, enabling users to convert ONNX or other formats to MNN. The architecture supports various hardware backends through MNN compilation flags, including CUDA and Metal.

Quick Start & Requirements

  • Installation: Clone the repository with submodules: git clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git. Build using platform-specific scripts (e.g., ./script/build.sh for Linux/macOS, ./script/build.ps1 for Windows, ./script/android_build.sh for Android).
  • Prerequisites: GPU support (CUDA, OpenCL, Metal) is enabled through MNN compilation flags. The Python bindings are installed with pip install mnnllm.
  • Resources: Model files and dependencies will vary based on the LLM used.
  • Documentation: MNN LLM
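The installation steps above can be sketched as a shell session. The clone URL and script names come from the README; the comments are illustrative, and the exact build output depends on your platform and toolchain:

```shell
# Clone the repository together with its submodules
# (MNN is vendored as a submodule, so --recurse-submodules is required).
git clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git
cd mnn-llm

# Build for the host platform (Linux/macOS).
# On Windows use ./script/build.ps1; for Android use ./script/android_build.sh.
./script/build.sh

# Optional: install the Python bindings.
pip install mnnllm
```

Model files are not included in the repository; they must be obtained or converted separately, and their size and format depend on the LLM chosen.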

Highlighted Details

  • Supports deployment across Linux, macOS, Windows, Android, and iOS.
  • Includes examples for CLI, Web UI, and native Android applications.
  • Offers Python bindings (mnnllm) for easier integration.
  • Provides text embedding capabilities and model export tools.

Maintenance & Community

The project states that it has been merged into the main MNN repository. Community channels and the maintenance status of this standalone repository are not detailed.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README. Compatibility for commercial use or linking with closed-source applications would require clarification on the underlying MNN license and any specific licenses for the LLM models used.

Limitations & Caveats

The project is marked as merged into MNN, so this standalone repository may be deprecated or superseded. The README also notes that the example code was 100% generated by ChatGPT, which may raise quality or robustness concerns.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 33 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • Top 0.4%; 84k stars
  • C/C++ library for local LLM inference
  • Created 2 years ago; updated 21 hours ago