openvino by openvinotoolkit

Open source toolkit for optimizing and deploying AI inference

Created 7 years ago

9,491 stars

Top 5.4% on SourcePulse

2 Experts Love This Project

Yaowei Zheng: Author of LLaMA-Factory
Morgan Funtowicz: Head of ML Optimizations at Hugging Face

Project Summary

OpenVINO™ is an open-source toolkit designed to optimize and deploy deep learning models across various hardware, including CPUs, GPUs, and NPUs. It targets developers and researchers seeking to boost inference performance for computer vision, NLP, and generative AI tasks, offering broad framework compatibility and flexible deployment options from edge to cloud.

How It Works

OpenVINO employs a two-stage process: model conversion and inference optimization. It first converts models from frameworks and formats such as PyTorch, TensorFlow, and ONNX into an intermediate representation (IR). The IR is then optimized for the target hardware using techniques such as quantization and layer fusion, enabling efficient execution on Intel hardware and on other supported targets such as ARM CPUs.
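The two stages described above can be sketched with the OpenVINO Python API. This is a minimal illustration, not the project's official quickstart: it assumes a recent OpenVINO release that exposes `ov.convert_model`, `ov.save_model`, and `ov.Core().compile_model`. The helper names `ir_files` and `convert_and_compile`, the file name `model.xml`, and the `"CPU"` device default are hypothetical choices made for the example.

```python
from pathlib import Path


def ir_files(xml_path):
    """Return the two files that make up a saved IR model.

    An IR model is stored as an .xml file (topology) plus a .bin file
    (weights) with the same stem.
    """
    xml = Path(xml_path)
    return xml, xml.with_suffix(".bin")


def convert_and_compile(source_model, ir_xml="model.xml", device="CPU"):
    """Hypothetical helper: convert a model to IR, save it, and compile it.

    `source_model` is assumed to be something ov.convert_model accepts,
    for example a path to an ONNX file.
    """
    # Imported lazily so this module still loads when OpenVINO is absent.
    import openvino as ov

    # Stage 1: convert the source model to an in-memory OpenVINO model.
    ov_model = ov.convert_model(source_model)

    # Persist the IR to disk (writes the .xml and the matching .bin).
    ov.save_model(ov_model, str(ir_xml))

    # Stage 2: compile for the chosen device; hardware-specific
    # optimizations are applied at this point.
    core = ov.Core()
    return core.compile_model(ov_model, device)
```

Calling `convert_and_compile("model.onnx")` would, under these assumptions, return a compiled model ready for inference on the CPU; the ONNX path here is a placeholder.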

Quick Start & Requirements

Install via pip: pip install -U openvino
Supports CPU (x86, ARM), Intel Integrated & Discrete GPUs, and Intel NPUs.
Official Quickstart: https://docs.openvino.ai/latest/openvino_docs_OV_UG_Integrate_OV_IR.html
Notebooks: https://github.com/openvinotoolkit/openvino_notebooks
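After installing, one quick way to check which of the hardware targets listed above are visible to the runtime is to query the available devices. The sketch below assumes the `openvino` Python package from the pip command above; `list_openvino_devices` is a hypothetical helper name, and it returns an empty list rather than failing when the package is not installed.

```python
import importlib.util


def list_openvino_devices():
    """Return the device names OpenVINO reports, or [] if it is not installed."""
    if importlib.util.find_spec("openvino") is None:
        return []

    import openvino as ov

    # Core.available_devices lists device names such as "CPU" or "GPU",
    # depending on the hardware and drivers present on the machine.
    return list(ov.Core().available_devices)


if __name__ == "__main__":
    print(list_openvino_devices())
```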

Highlighted Details

Supports models from PyTorch, TensorFlow, ONNX, Keras, PaddlePaddle, and JAX/Flax.
Integrates with Hugging Face Optimum, Torch.compile, vLLM, ONNX Runtime, LlamaIndex, LangChain, and Keras 3.
Offers a dedicated GenAI API and repository for generative AI applications.
Includes tools like NNCF for advanced optimization and OVMS for model serving.
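As an illustration of the GenAI API mentioned above, the following sketch generates text with an LLM pipeline. It assumes the separate `openvino-genai` package is installed and that `model_dir` points to a model already exported to OpenVINO format; the helper name `generate_text`, its defaults, and the directory name in the usage note are hypothetical.

```python
def generate_text(model_dir, prompt, device="CPU", max_new_tokens=100):
    """Hypothetical helper: run one text-generation request.

    `model_dir` is assumed to contain a model already converted to
    OpenVINO format, including its tokenizer files.
    """
    # Imported lazily so this module still loads without openvino-genai.
    import openvino_genai as ov_genai

    # Build the pipeline for the chosen device, then generate a completion.
    pipe = ov_genai.LLMPipeline(str(model_dir), device)
    return str(pipe.generate(prompt, max_new_tokens=max_new_tokens))
```

A call such as `generate_text("my_exported_model", "What is OpenVINO?")` would return the generated text, where `my_exported_model` is a placeholder directory.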

Maintenance & Community

Active community with contributions welcomed via GitHub Issues.
Support available on the Intel DevHub Discord server.
Resources include a blog, cheat sheet, and performance benchmarks.

Licensing & Compatibility

Licensed under Apache License Version 2.0.
Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

Although OpenVINO supports a wide range of hardware, optimal performance is typically achieved on Intel architectures. The toolkit collects telemetry data by default; users can opt out.

Health Check

Last Commit: 2 days ago
Responsiveness: 1 day
Pull Requests (30d): 328
Issues (30d): 173

Star History

182 stars in the last 30 days