MNN by alibaba

Lightweight deep learning framework for on-device inference and training

created 6 years ago
12,665 stars

Top 4.0% on sourcepulse

Project Summary

MNN is a high-performance, lightweight deep learning inference and training framework designed for on-device deployment across mobile, embedded, and PC platforms. It targets developers and researchers needing efficient execution of diverse AI models, including LLMs and diffusion models, with a focus on minimizing resource footprint and maximizing speed.

How It Works

MNN uses hand-optimized assembly code for CPU execution and Metal, OpenCL, Vulkan, and CUDA for GPU acceleration. It supports techniques such as Winograd convolution and FP16/Int8 quantization to improve performance and reduce model size. The framework includes a converter for popular model formats (TensorFlow, ONNX, Caffe, TorchScript) and supports complex model structures with dynamic inputs and control flow.
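
To make the two techniques named above concrete, here is a minimal pure-Python sketch of a 1D Winograd F(2,3) convolution and a symmetric per-tensor Int8 quantization round trip. It is an illustration only and not MNN's implementation, which relies on optimized native kernels; all function names (`direct_conv_f23`, `winograd_f23`, `quantize_int8`, `dequantize_int8`) are hypothetical, and the symmetric per-tensor scheme is an assumption chosen for simplicity.

```python
def direct_conv_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs, using 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3), using 4 data multiplications."""
    # Filter transform: depends only on g, so it can be computed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Input transform followed by the 4 element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [0.5, -1.0, 2.0]
    print(direct_conv_f23(d, g), winograd_f23(d, g))

    weights = [0.02, -0.5, 0.31, 1.27]
    q, scale = quantize_int8(weights)
    print(q, scale, dequantize_int8(q, scale))
```

In this sketch the Winograd path trades two multiplications for a handful of additions, and the filter transform is amortized across every tile that reuses the same filter. The Int8 round trip stores each weight in one byte instead of four, at the cost of a rounding error of at most half the scale per value.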

Quick Start & Requirements

  • Install: Typically built from source or used via pre-built libraries.
  • Prerequisites: C++11 compiler, Python (for converter/tools), specific build flags for hardware acceleration (e.g., CUDA for NVIDIA GPUs).
  • Resources: Core library is ~800KB on Android (armv7a), ~12MB static library on iOS.
  • Docs: MNN Homepage, Read the Docs

Highlighted Details

  • Battle-tested in over 30 Alibaba apps (Taobao, Tmall, Youku) across 70+ use cases.
  • Supports LLM and Stable Diffusion model deployment on mobile via MNN-LLM and MNN-Diffusion.
  • The project claims industry-leading on-device inference and training performance.
  • Includes MNN-CV (lightweight OpenCV alternative) and MNN-Express for general computation.

Maintenance & Community

  • Developed by Alibaba Group employees across multiple departments.
  • Community discussions primarily in Chinese via DingTalk groups.
  • Described in papers at OSDI '22 (the Walle system) and MLSys 2020.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license allows commercial use and integration with closed-source applications. Supports iOS 8.0+, Android 4.3+.

Limitations & Caveats

MNN's support matrix rates some architecture and precision combinations (e.g., BF16 on CPU, NPU acceleration) as 'B' (supported, but buggy or not optimized) or 'C' (not supported), so these configurations should be evaluated carefully before deployment. Community support is predominantly in Chinese.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull requests (30d): 34
  • Issues (30d): 95
  • Star history: 2,104 stars in the last 90 days
