ncnn by Tencent

Mobile-first inference framework for neural networks

created 8 years ago
21,846 stars

Top 1.9% on sourcepulse

Project Summary

ncnn is a high-performance neural network inference framework optimized for mobile platforms. It targets developers deploying deep learning models in AI-powered mobile applications, and claims significant speed advantages over other open-source frameworks on mobile CPUs.

How It Works

ncnn is a pure C++ implementation with no third-party dependencies, prioritizing minimal footprint and maximum performance. It achieves this through ARM NEON assembly-level optimizations, sophisticated memory management, and multi-core parallel processing. The framework supports GPU acceleration via Vulkan and offers extensibility for custom layers and model quantization.

Quick Start & Requirements

  • Installation: Build from source or download pre-built binaries for various platforms. Detailed build instructions are available for Linux, Windows, macOS, Android, iOS, WebAssembly, and embedded systems.
  • Prerequisites: a C++ compiler and CMake. Specific builds may additionally require the Vulkan SDK, Xcode, or the Android NDK.
  • Resources: Minimal memory footprint. Build times vary by platform.
  • Links: How to build ncnn, Releases
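For orientation, a from-source build on Linux follows the standard CMake flow. Exact options vary by version (the `NCNN_VULKAN` flag shown here enables the Vulkan GPU backend), so treat this as a sketch and defer to the "How to build ncnn" docs linked above.

```shell
# Clone the repo and initialize submodules (needed for the Vulkan backend).
git clone https://github.com/Tencent/ncnn.git
cd ncnn
git submodule update --init

# Configure and build in Release mode. NCNN_VULKAN=ON requires the
# Vulkan SDK to be installed; omit it for a CPU-only build.
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DNCNN_VULKAN=ON ..
make -j"$(nproc)"
```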

Highlighted Details

  • Claims to be faster than all known open-source frameworks on mobile CPUs.
  • Supports a wide range of CNN architectures including VGG, ResNet, MobileNet, YOLO (v2-v8), and more.
  • Offers GPU acceleration via Vulkan API.
  • Supports importing models from Caffe, PyTorch, ONNX, Darknet, Keras, and TensorFlow (via MLIR).
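As one illustration of the import path, an ONNX model is typically converted offline into ncnn's param/bin format using a converter tool built alongside ncnn. Tool names and flags can differ by version, so this is a hedged sketch rather than a canonical recipe.

```shell
# Simplify the ONNX graph first (optional, but commonly recommended
# so the converter sees fewer exotic ops), then convert to ncnn's
# text graph (.param) plus weights (.bin).
python -m onnxsim model.onnx model-sim.onnx
./onnx2ncnn model-sim.onnx model.param model.bin
```

For PyTorch models, the newer pnnx converter is generally the recommended route; see the project docs for the current tooling.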

Maintenance & Community

  • Actively used in Tencent applications (QQ, WeChat, etc.).
  • Community channels include QQ groups and a Discord channel.

Licensing & Compatibility

  • License: BSD 3-Clause.
  • Permissive license allows for commercial use and integration into closed-source applications.

Limitations & Caveats

The platform support matrix indicates that while many platforms are supported, performance ("speed") may not be optimal for all configurations, particularly for certain GPU types on macOS and Windows. Some ARM-specific platforms are marked as "shall work, not confirmed."

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 55
  • Issues (30d): 32

Star History

  • 488 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 2 more.

gpu.cpp by AnswerDotAI

Top 0.2% on sourcepulse
4k stars
C++ library for portable GPU computation using WebGPU
created 1 year ago
updated 2 weeks ago
Starred by Bojan Tunguz (AI Scientist; Formerly at NVIDIA), Mckay Wrigley (Founder of Takeoff AI), and 8 more.

ggml by ggml-org

Top 0.3% on sourcepulse
13k stars
Tensor library for machine learning
created 2 years ago
updated 3 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Nat Friedman (Former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

Top 0.4% on sourcepulse
84k stars
C/C++ library for local LLM inference
created 2 years ago
updated 17 hours ago