openvino.genai by openvinotoolkit

OpenVINO GenAI is a library for running generative AI models

Created 2 years ago
440 stars

Top 67.9% on SourcePulse

View on GitHub
Project Summary

This library provides a unified C++/Python API for running popular Generative AI models, including LLMs, diffusion models, and speech recognition models, optimized for local execution on CPUs and GPUs. It targets developers and researchers seeking efficient, low-resource inference for tasks like text generation, image creation, and speech-to-text.

How It Works

The library leverages OpenVINO Runtime for high-performance inference across various hardware. It integrates state-of-the-art optimizations such as speculative decoding and KVCache token eviction for LLMs, and supports features like LoRA adapter loading and continuous batching for serving. Models are converted and optimized using optimum-cli, with support for FP16, INT4, and INT8 weight formats.
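The conversion step can be sketched with optimum-cli; the model ID and output directory below are placeholders, not part of the project's documentation:

```shell
# Export a Hugging Face model to OpenVINO IR with INT4 weight compression.
# Model ID and output directory are illustrative placeholders.
optimum-cli export openvino \
  --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --weight-format int4 \
  TinyLlama-1.1B-int4-ov
```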

Quick Start & Requirements

  • Install via pip: pip install openvino-genai and pip install optimum-intel@git+https://github.com/huggingface/optimum-intel.git.
  • Model conversion requires optimum-cli.
  • C++ usage requires a compatible C++ package installation.
  • See Generative AI workflow and OpenVINO Notebooks for samples.
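After installation and conversion, text generation takes only a few lines; this is a minimal sketch assuming a model directory already produced by optimum-cli (the path is a placeholder):

```python
import openvino_genai

# Load a model previously exported with optimum-cli (placeholder path),
# targeting the CPU device; "GPU" can be used where available.
pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-int4-ov", "CPU")

# Generate up to 100 new tokens for the prompt.
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```

The C++ API mirrors this pattern, so the same pipeline can be embedded in native applications.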

Highlighted Details

  • Supports text generation (LLMs), image generation (Stable Diffusion), visual language models (LLaVA), and speech recognition (Whisper).
  • Integrates advanced LLM optimizations: speculative decoding, KVCache eviction, prefix caching.
  • Enables LoRA adapter loading and mixing for text and image generation.
  • Offers continuous batching for LLM serving via OpenVINO Model Server.
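Speculative decoding, mentioned above, can be illustrated with a toy sketch: a cheap draft model proposes a block of tokens ahead, and the expensive target model verifies them, accepting the longest matching prefix. This is not the library's actual implementation; both models here are hypothetical stand-in functions over integer token IDs.

```python
def draft_model(prefix, k):
    # Hypothetical fast model: guesses the next k tokens.
    return [(prefix[-1] + i + 1) % 10 for i in range(k)]

def target_model(prefix):
    # Hypothetical accurate model: emits one next token.
    return (prefix[-1] + 1) % 10

def speculative_decode(prompt, num_tokens, k=4):
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        proposal = draft_model(tokens, k)
        # Verify the proposal token by token against the target model.
        for tok in proposal:
            if target_model(tokens) == tok:
                tokens.append(tok)  # accepted: draft token matches the target
            else:
                tokens.append(target_model(tokens))  # rejected: keep target's token
                break
    return tokens[len(prompt):][:num_tokens]

print(speculative_decode([3], 5))  # → [4, 5, 6, 7, 8]
```

When the draft model is accurate, several tokens are accepted per verification round, which is where the speedup over one-token-at-a-time decoding comes from.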

Maintenance & Community

  • Developed by the OpenVINO Toolkit team.
  • Samples and workflows are available via OpenVINO Notebooks.

Licensing & Compatibility

  • Licensed under Apache License Version 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The README marks ModelScope support as "TBD", and C++ usage requires installation steps beyond the basic pip install (see the linked C++ installation details). Some models may run but have not been officially validated.

Health Check
Last Commit

1 day ago

Responsiveness

1 day

Pull Requests (30d)
170
Issues (30d)
21
Star History
18 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI) and Yaowei Zheng (Author of LLaMA-Factory).

ZhiLight by zhihu

0%
906
LLM inference engine for Llama and variants, optimized for PCIe GPUs
Created 1 year ago
Updated 1 day ago
Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Elvis Saravia (Founder of DAIR.AI), and 2 more.

vllm-omni by vllm-project

1.6%
3k
Omni-modality model inference and serving framework
Created 5 months ago
Updated 22 hours ago