Olive by microsoft

AI model optimization toolkit for ONNX Runtime

Created 6 years ago
2,107 stars

Top 21.3% on SourcePulse

View on GitHub
Project Summary

Olive is an AI model optimization toolkit designed to simplify the finetuning, conversion, quantization, and optimization of models for efficient inferencing across various hardware targets (CPUs, GPUs, NPUs). It targets ML engineers and researchers seeking to reduce manual effort in model optimization by automating the selection and application of over 40 built-in optimization techniques, integrating seamlessly with Hugging Face and Azure AI.

How It Works

Olive operates by composing a pipeline of optimization techniques based on user-defined targets and constraints like accuracy and latency. It leverages ONNX Runtime as the core inference engine and supports a wide array of optimization components, including model compression, quantization (e.g., AWQ, RTN), and compilation. This approach automates the complex, trial-and-error process of finding optimal model configurations for specific hardware.
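
To make the pipeline idea concrete, the sketch below drives an Olive workflow from Python. It is a minimal sketch, assuming the olive-ai package exposes olive.workflows.run and accepts a dict-based workflow configuration; the pass names, model ID, and options are illustrative rather than an exact reproduction of Olive's schema.

    # Minimal sketch, assuming the olive-ai package and a dict-based workflow config.
    # Pass names and options are illustrative and may differ from Olive's actual schema.
    from olive.workflows import run as olive_run

    workflow = {
        "input_model": {
            "type": "HfModel",  # start from a Hugging Face model
            "model_path": "microsoft/Phi-3-mini-4k-instruct",  # illustrative model ID
        },
        "systems": {
            "local_system": {
                "type": "LocalSystem",
                "accelerators": [{"device": "cpu", "execution_providers": ["CPUExecutionProvider"]}],
            },
        },
        "passes": {
            "conversion": {"type": "OnnxConversion"},      # export the model to ONNX
            "quantization": {"type": "OnnxQuantization"},  # weight quantization (e.g. RTN-style)
        },
        "output_dir": "olive_output",
    }

    olive_run(workflow)  # compose and run the passes, writing an optimized ONNX model

In practice the same configuration is typically written as a JSON file and handed to the CLI, with Olive searching over pass parameters to meet the stated accuracy and latency constraints.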

Quick Start & Requirements
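
As a rough guide (assuming current packaging), Olive is published on PyPI as olive-ai and installed with pip, and ONNX Runtime is required to execute the optimized models it produces. The snippet below is a minimal sketch of loading an Olive-produced ONNX model with ONNX Runtime; the model path is a placeholder for whatever Olive writes to its output directory.

    # Minimal sketch: loading an Olive-optimized ONNX model with ONNX Runtime.
    # "olive_output/model.onnx" is a placeholder for Olive's actual output path.
    import onnxruntime as ort

    session = ort.InferenceSession(
        "olive_output/model.onnx",
        providers=["CPUExecutionProvider"],  # choose the provider matching your target hardware
    )
    print([i.name for i in session.get_inputs()])   # inspect the expected input names
    print([o.name for o in session.get_outputs()])  # inspect the produced output names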

Highlighted Details

  • Supports automatic optimization of popular SLMs like Llama, Phi, Qwen, and Gemma.
  • Offers a CLI for common optimization tasks and workflows for orchestrating transformations (a hedged invocation sketch follows this list).
  • Enables compiling LoRA adapters for MultiLoRA serving.
  • Includes a caching mechanism for improved productivity.
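
As a hedged illustration of the CLI item above, the sketch below shells out to an auto-optimization command from Python; the subcommand name and flags are assumptions based on the feature description here, not a verified interface, so consult olive --help before relying on them.

    # Hedged sketch of invoking Olive's CLI from Python via subprocess.
    # The "auto-opt" subcommand and flag names are assumptions based on the
    # project's description of its optimization CLI; verify with `olive --help`.
    import subprocess

    subprocess.run(
        [
            "olive", "auto-opt",
            "--model_name_or_path", "microsoft/Phi-3-mini-4k-instruct",  # illustrative model ID
            "--device", "cpu",
            "--output_path", "olive_output",
        ],
        check=True,  # raise if the optimization run fails
    )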

Maintenance & Community

  • Developed and maintained by Microsoft.
  • Contributions welcome via GitHub Issues and Discussions.
  • Roadmap and community channels are available via GitHub.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

The README does not explicitly detail performance benchmarks or specific hardware compatibility beyond general CPU/GPU/NPU mentions. While it supports many popular models, optimizing custom architectures may require providing explicit input/output configurations.
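
For such custom architectures, the input/output description would typically be supplied alongside the model definition. The fragment below is a hedged sketch of that kind of io_config block; the field names follow Olive's configuration style as an assumption, and the tensor names, shapes, and types are purely illustrative.

    # Hedged sketch of declaring explicit input/output information for a custom
    # model. The "io_config" structure and field names are assumptions, and the
    # tensor names, shapes, and types below are purely illustrative.
    custom_model = {
        "type": "PyTorchModel",
        "model_path": "models/custom_model.pt",  # hypothetical checkpoint path
        "io_config": {
            "input_names": ["input_ids", "attention_mask"],
            "input_shapes": [[1, 128], [1, 128]],
            "input_types": ["int64", "int64"],
            "output_names": ["logits"],
        },
    }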

Health Check

  • Last Commit: 18 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 59
  • Issues (30d): 5

Star History

55 stars in the last 30 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

xTuring by stochasticai

0.0% · 3k stars
SDK for fine-tuning and customizing open-source LLMs
Created 2 years ago
Updated 1 day ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 36 more.

unsloth by unslothai

0.6% · 46k stars
Finetuning tool for LLMs, targeting speed and memory efficiency
Created 1 year ago
Updated 14 hours ago