openvinotoolkit
Neural network compression for optimized inference
Top 34.8% on SourcePulse
NNCF (Neural Network Compression Framework) provides algorithms for optimizing neural network inference, primarily targeting the OpenVINO™ toolkit. It supports post-training and training-time compression techniques like quantization and sparsity for PyTorch, TensorFlow, and ONNX models, aiming to reduce model size and improve inference speed with minimal accuracy loss.
How It Works
NNCF employs a unified framework architecture allowing for the addition of various compression algorithms across different deep learning backends. It facilitates automatic, configurable model graph transformations. For post-training quantization, it uses a calibration dataset to gather statistics. Training-time compression integrates directly into the training loop, enabling fine-tuning of model weights and compression parameters simultaneously for potentially higher accuracy.
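The calibration step described above can be illustrated with a minimal, framework-free sketch. It is not NNCF's implementation: it assumes simple min-max statistics and asymmetric 8-bit quantization, and every function name here (`collect_min_max`, `quant_params`, `quantize_value`, `dequantize_value`) is hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Tuple


def collect_min_max(calibration_batches: Iterable[List[float]]) -> Tuple[float, float]:
    """Gather running min/max statistics over a calibration dataset."""
    lo, hi = float("inf"), float("-inf")
    for batch in calibration_batches:
        for value in batch:
            lo = min(lo, value)
            hi = max(hi, value)
    return lo, hi


def quant_params(lo: float, hi: float, num_bits: int = 8) -> Tuple[float, int]:
    """Derive scale and zero point for asymmetric unsigned quantization."""
    # Extend the range to include zero so that 0.0 is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize_value(x: float, scale: float, zero_point: int, num_bits: int = 8) -> int:
    """Map a float to an integer level, clamping to the representable range."""
    qmax = 2 ** num_bits - 1
    q = round(x / scale) + zero_point
    return max(0, min(qmax, q))


def dequantize_value(q: int, scale: float, zero_point: int) -> float:
    """Map an integer level back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical calibration data: two small batches of activations.
batches = [[-1.0, 0.5], [2.0, 0.0]]
lo, hi = collect_min_max(batches)
scale, zero_point = quant_params(lo, hi)
restored = dequantize_value(quantize_value(0.5, scale, zero_point), scale, zero_point)
```

Values outside the observed calibration range are clamped, which is one reason a representative calibration dataset matters for accuracy.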
Quick Start & Requirements
pip install nncf or conda install -c conda-forge nncf
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Import nncf before other torch imports to avoid incomplete compression.