nncase by kendryte

AI compiler stack for AI accelerators

created 7 years ago
803 stars

Top 44.8% on sourcepulse

Project Summary

nncase is an open-source deep learning compiler stack designed for Kendryte AI accelerators, targeting developers working with embedded AI applications. It enables efficient deployment of neural networks on specialized hardware by compiling models from frameworks like TFLite, Caffe, and ONNX.

How It Works

nncase acts as a bridge between standard deep learning frameworks and Kendryte's AI hardware. It parses a model, applies operator fusion and other optimizations, and generates optimized code for the target accelerator. The compiler supports static memory allocation and both float and uint8 inference. It can also quantize a float model after training, using a calibration dataset to determine value ranges.
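To make the calibration-based quantization step more concrete, here is a minimal sketch of uint8 affine quantization in plain Python. It is not nncase's implementation or API. The function names (`calibrate_range`, `quant_params`, `quantize`, `dequantize`) and the simple min/max calibration strategy are assumptions chosen for illustration, and a real compiler may use a different range-estimation method.

```python
# Illustrative sketch only: min/max calibration and uint8 affine quantization.
# All names here are hypothetical and are not part of the nncase API.

def calibrate_range(samples):
    """Return the (min, max) seen across calibration samples, widened to include 0.0."""
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    return min(lo, 0.0), max(hi, 0.0)

def quant_params(lo, hi, qmin=0, qmax=255):
    """Derive the scale and zero point that map [lo, hi] onto [qmin, qmax]."""
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range: every calibration value was 0.0
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map float values to integers in [qmin, qmax], clamping out-of-range inputs."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]

# Two small "calibration batches" stand in for a real calibration dataset.
calibration = [[-1.0, 0.5, 2.0], [0.0, 1.5, 3.0]]
lo, hi = calibrate_range(calibration)
scale, zp = quant_params(lo, hi)

inputs = [-1.0, 0.0, 3.0]
q = quantize(inputs, scale, zp)
restored = dequantize(q, scale, zp)
```

The round trip through `quantize` and `dequantize` loses at most about one quantization step (`scale`) per value, which is why a calibration set that reflects real input ranges matters.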

Quick Start & Requirements

  • Linux: pip install nncase
  • Windows: pip install nncase, then pip install nncase_kpu-2.x.x-py2.py3-none-win_amd64.whl (the wheel is downloadable from the releases page).
  • Prerequisites: Python. On Windows, the KPU runtime wheel must be installed separately.
  • Build from Source: Requires Git, CMake, and Ninja or Make. The chip-specific source code for K510 and K230 is not open-source, so a source build cannot directly produce the compiler for those accelerators.
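After installing, a quick way to confirm that the package is visible to the current interpreter is a small check like the one below. This is a sketch using only the Python standard library. It assumes the import name matches the pip package name `nncase`, and the helper names `is_installed` and `check_environment` are hypothetical.

```python
# Illustrative sketch only: verify that top-level modules can be found
# without importing them. Helper names are hypothetical.
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located by the import system."""
    return importlib.util.find_spec(module_name) is not None

def check_environment(modules=("nncase",)):
    """Map each module name to whether it is installed."""
    return {name: is_installed(name) for name in modules}

if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Pass additional top-level module names to `check_environment` to check other packages in the same environment.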

Highlighted Details

  • Supports operators from TFLite, Caffe, and ONNX.
  • Provides benchmark results for various models (Image Classification, Object Detection, Segmentation, Pose Estimation) comparing nncase performance against TFLite/ONNX.
  • Offers features like multi-input/output support, multi-branch structures, and zero-copy loading.
  • Includes demos for eye gaze, space resize, and face pose.

Maintenance & Community

  • Community support is available via Telegram and a QQ group (790699378).
  • Resources, documentation, and examples for K210 and K230 chips are available through the Canaan developer community.

Licensing & Compatibility

  • The repository appears to be under a permissive license, but specific details are not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require verification of the exact license terms.

Limitations & Caveats

The chip-specific source code for K510 and K230 is not open-source, so nncase cannot be compiled directly from source for these accelerators. Users who need an operator that is not yet supported must request it from the maintainers or contribute a pull request.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 24
  • Issues (30d): 2
  • Star History: 18 stars in the last 90 days
