AI model optimization toolkit for ONNX Runtime
Olive is an AI model optimization toolkit that simplifies the finetuning, conversion, quantization, and optimization of models for efficient inference across hardware targets (CPUs, GPUs, NPUs). It is aimed at ML engineers and researchers who want to cut the manual effort of model optimization: it automates the selection and application of more than 40 built-in optimization techniques and integrates with Hugging Face and Azure AI.
How It Works
Olive works by composing a pipeline of optimization passes based on user-defined targets and constraints such as accuracy and latency. It uses ONNX Runtime as the core inference engine and supports a wide array of optimization components, including model compression, quantization (e.g., AWQ, RTN), and compilation. This automates the complex trial-and-error process of finding an optimal model configuration for specific hardware.
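To make the pipeline idea concrete, here is a minimal sketch of driving such a workflow from Python. It assumes Olive's documented olive.workflows.run entry point; the model, pass names, and config keys shown are illustrative and may vary between Olive versions.

    from olive.workflows import run as olive_run

    # Illustrative workflow config: convert a Hugging Face model to ONNX,
    # then quantize it. Pass type names follow Olive's built-in passes,
    # but exact keys can differ across Olive versions.
    config = {
        "input_model": {
            "type": "HfModel",  # assumed model-type name for HF checkpoints
            "model_path": "HuggingFaceTB/SmolLM2-135M-Instruct",
        },
        "passes": {
            "conversion": {"type": "OnnxConversion"},
            "quantization": {"type": "OnnxQuantization"},
        },
        "output_dir": "models/smolm2-optimized",
    }

    # Olive runs the passes in order and writes the optimized model,
    # along with run artifacts, to output_dir.
    olive_run(config)

The same configuration can also be stored as a JSON file and executed through the olive CLI.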
Quick Start & Requirements
Install the toolkit with the auto-opt extra, plus the quickstart dependencies:

pip install olive-ai[auto-opt]
pip install transformers==4.44.2 onnxruntime-genai

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Olive is released under the MIT license.
Limitations & Caveats
The README does not detail performance benchmarks or hardware compatibility beyond general CPU/GPU/NPU support. While many popular model architectures are supported out of the box, optimizing a custom architecture may require providing an explicit input/output configuration, as sketched below.
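As a hedged illustration, an explicit input/output specification in an Olive config might look like the following. The key names follow Olive's io_config convention for PyTorch models, while the model path, shapes, and tensor names are placeholder assumptions.

    # Illustrative input_model entry for a custom PyTorch architecture.
    # io_config tells Olive how to export the model to ONNX when the
    # signature cannot be inferred automatically; all values below are
    # placeholders for a hypothetical model.
    input_model = {
        "type": "PyTorchModel",
        "model_path": "path/to/custom_model.pt",
        "io_config": {
            "input_names": ["input_ids", "attention_mask"],
            "input_shapes": [[1, 128], [1, 128]],
            "input_types": ["int64", "int64"],
            "output_names": ["logits"],
            # Mark batch and sequence dimensions as dynamic so the exported
            # ONNX model accepts variable-sized inputs.
            "dynamic_axes": {
                "input_ids": {"0": "batch_size", "1": "sequence_length"},
                "attention_mask": {"0": "batch_size", "1": "sequence_length"},
                "logits": {"0": "batch_size", "1": "sequence_length"},
            },
        },
    }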