nncf by openvinotoolkit

Neural network compression for optimized inference

created 5 years ago
1,066 stars

Top 36.0% on sourcepulse

Project Summary

NNCF (Neural Network Compression Framework) provides algorithms for optimizing neural network inference, primarily targeting the OpenVINO™ toolkit. It supports post-training and training-time compression techniques like quantization and sparsity for PyTorch, TensorFlow, and ONNX models, aiming to reduce model size and improve inference speed with minimal accuracy loss.

How It Works

NNCF uses a unified architecture that lets new compression algorithms be added across deep learning backends, with automatic, configurable model graph transformations. For post-training quantization, it runs inference on a calibration dataset to gather activation statistics. Training-time compression integrates directly into the training loop, fine-tuning model weights and compression parameters simultaneously for potentially higher accuracy.
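To make the calibration step concrete, here is a minimal sketch of the underlying idea: run a calibration set through the model, record min/max statistics, and derive an affine int8 scale and zero-point from them. This is a conceptual illustration in plain Python, not NNCF's actual implementation; all function names are hypothetical.

```python
# Conceptual sketch of post-training quantization calibration.
# Not NNCF's real code -- names and structure are illustrative only.

def calibrate(batches):
    """Gather min/max statistics over a calibration dataset."""
    lo = min(min(b) for b in batches)
    hi = max(max(b) for b in batches)
    return lo, hi

def make_quantizer(lo, hi, bits=8):
    """Derive an affine quantizer (scale, zero-point) from the stats."""
    qmin, qmax = 0, 2 ** bits - 1
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)

    def quantize(x):
        # Map a float value to the clamped integer grid.
        q = round(x / scale) + zero_point
        return max(qmin, min(qmax, q))

    return quantize, scale, zero_point

# Hypothetical calibration batches of activation values.
batches = [[0.0, 1.5, 3.0], [0.5, 2.0, 2.55]]
lo, hi = calibrate(batches)
quantize, scale, zero_point = make_quantizer(lo, hi)
```

In the real framework, equivalent statistics are collected per tensor during calibration inference, and the resulting quantization parameters are baked into inserted quantizer nodes in the model graph.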

Quick Start & Requirements

  • Install: pip install nncf or conda install -c conda-forge nncf
  • Requirements: Python 3.9+, backend-specific dependencies (PyTorch, TensorFlow, ONNX).
  • Links: Documentation, Model Zoo

Highlighted Details

  • Supports Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
  • Offers Weight Compression, Sparsity, and Filter Pruning algorithms.
  • Integrates with HuggingFace Optimum Intel and OpenVINO Training Extensions.
  • Provides extensive Jupyter notebooks and sample scripts for various models and domains.
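Among the algorithms listed above, unstructured sparsity amounts to zeroing out the lowest-magnitude weights. The sketch below illustrates that idea in plain Python; it is not NNCF's implementation, and magnitude_prune is a hypothetical helper.

```python
# Conceptual sketch of unstructured magnitude-based sparsity.
# Illustrative only -- not how NNCF implements its sparsity algorithms.

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Ties at the threshold may prune slightly more than requested.
    """
    n_prune = int(len(weights) * sparsity)
    ranked = sorted(weights, key=abs)
    threshold = abs(ranked[n_prune - 1]) if n_prune else 0.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Prune half of a hypothetical weight vector.
pruned = magnitude_prune([0.1, -0.5, 0.02, 0.9, -0.03, 0.4], sparsity=0.5)
```

Training-time sparsity schemes typically apply such a mask gradually during fine-tuning, letting the remaining weights adapt and recover accuracy.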

Maintenance & Community

  • Actively maintained by the OpenVINO™ toolkit team.
  • Community support channels (e.g., Discord/Slack) are implied by common Intel open-source practice but not explicitly documented.
  • Contributing Guide available.

Licensing & Compatibility

  • License: Apache License 2.0.
  • Compatibility: Compatible with commercial use and closed-source linking.

Limitations & Caveats

  • TensorFlow support is limited, primarily to models built with the Sequential or Keras Functional API.
  • TorchFX integration is experimental.
  • Activation Sparsity and Movement Pruning are experimental for some backends.
  • PyTorch users must import nncf before other torch imports to avoid incomplete compression.
Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 42
  • Issues (30d): 3
  • Star history: 66 stars in the last 90 days

Explore Similar Projects

SwissArmyTransformer by THUDM
Transformer library for flexible model development
  • 1k stars · created 3 years ago · updated 7 months ago
  • Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

sparseml by neuralmagic
Sparsification toolkit for optimized neural networks
  • 2k stars · created 4 years ago · updated 2 months ago
  • Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Wei-Lin Chiang (Cofounder of LMArena), and 3 more.