intel-npu-acceleration-library by intel

Python library for Intel NPU acceleration (now end-of-life)

created 1 year ago
680 stars

Top 50.8% on sourcepulse

View on GitHub
Project Summary

This Python library aimed to accelerate AI computations on Intel Neural Processing Units (NPUs), targeting developers working with Intel Core Ultra processors. It provided low-level access to NPU hardware for high-speed matrix operations and model inference, with the goal of boosting application efficiency.

How It Works

The library leverages Intel's NPU architecture, which includes dedicated Neural Compute Engines for AI operations like matrix multiplication and convolution, and Streaming Hybrid Architecture Vector Engines for general computing. It utilizes compiler technology to optimize AI workloads by tiling compute and data flow, maximizing utilization of on-chip SRAM and minimizing DRAM transfers for performance and power efficiency.
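The tiling idea can be illustrated with a plain NumPy sketch (a conceptual analogy only, not the library's compiler): the matrix product is computed block by block, so each sub-problem touches a small working set that could stay resident in fast on-chip memory instead of streaming whole matrices from DRAM.

```python
import numpy as np

def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 32) -> np.ndarray:
    """Block-tiled matrix multiply. Each (tile x tile) sub-problem works on a
    small slice of the operands, mimicking how an NPU compiler keeps tiles in
    on-chip SRAM and minimizes DRAM traffic."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n), dtype=np.result_type(a, b))
    for i in range(0, m, tile):          # rows of the output tile
        for j in range(0, n, tile):      # columns of the output tile
            for p in range(0, k, tile):  # reduction dimension, tile by tile
                out[i:i+tile, j:j+tile] += (
                    a[i:i+tile, p:p+tile] @ b[p:p+tile, j:j+tile]
                )
    return out

# Tiled result matches the untiled product (slicing handles ragged edges).
rng = np.random.default_rng(0)
a, b = rng.standard_normal((96, 80)), rng.standard_normal((80, 64))
assert np.allclose(tiled_matmul(a, b), a @ b)
```

The hardware compiler additionally schedules data movement and fuses operations, but the locality principle is the same.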

Quick Start & Requirements

  • Install via pip: pip install intel-npu-acceleration-library
  • Requires an available NPU (check system compatibility).
  • Supported OS: Ubuntu (Linux), Windows. macOS is not supported.
  • Recommended: Latest NPU drivers.
  • Documentation: Intel® NPU Acceleration Library Documentation

Highlighted Details

  • Supports 8-bit quantization, 4-bit quantization and GPTQ, NPU-native mixed precision, and Float16.
  • Integrates with torch.compile for NPU optimization (Windows torch.compile not supported; use explicit intel_npu_acceleration_library.compile).
  • Includes examples for running MatMul operations and Hugging Face models (e.g., TinyLlama) on the NPU.
  • Feature roadmap included key enhancements like BFloat16 support and NPU/GPU hetero compute (some features marked as not implemented).
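As a rough illustration of what the 8-bit quantization feature entails (a plain NumPy sketch of symmetric per-tensor int8 quantization, not the library's actual implementation): weights are scaled so the largest magnitude maps to 127, stored as int8, and dequantized with the same scale at compute time.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor 8-bit quantization: pick a scale so the largest
    magnitude maps to 127, then round to the nearest int8 value."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error per element is bounded by half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```

This halves or quarters weight memory versus Float16/Float32 at the cost of bounded rounding error; schemes like GPTQ refine the rounding to minimize accuracy loss.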

Maintenance & Community

This project is no longer under active management by Intel and has been archived. Intel has ceased development, maintenance, bug fixes, and contributions. The project is available for reference, and users are encouraged to fork it for independent development. Intel recommends adopting OpenVINO™ and OpenVINO™ GenAI for NPU acceleration.

Licensing & Compatibility

The license is not explicitly stated in the provided README text. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The project is officially End-of-Life and will receive no further updates or maintenance from Intel. macOS is not supported, and torch.compile is not supported on Windows. Users are directed to OpenVINO™ and OpenVINO™ GenAI for current NPU acceleration solutions.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 21 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jaret Burkett (Founder of Ostris), and 1 more.

nunchaku by nunchaku-tech

2.3%
3k
High-performance 4-bit diffusion model inference engine
created 9 months ago
updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 16 more.

flash-attention by Dao-AILab

0.6%
19k
Fast, memory-efficient attention implementation
created 3 years ago
updated 2 days ago