intel-npu-acceleration-library by intel

Python library for Intel NPU acceleration (now end-of-life)

Created 1 year ago
690 stars

Top 49.3% on SourcePulse

View on GitHub
Project Summary

This Python library aimed to accelerate AI computations on Intel Neural Processing Units (NPUs), targeting developers working with Intel Core Ultra processors. It provided low-level access to NPU hardware for high-speed matrix operations and model inference, with the goal of boosting application efficiency.

How It Works

The library leverages Intel's NPU architecture, which includes dedicated Neural Compute Engines for AI operations like matrix multiplication and convolution, and Streaming Hybrid Architecture Vector Engines for general computing. It utilizes compiler technology to optimize AI workloads by tiling compute and data flow, maximizing utilization of on-chip SRAM and minimizing DRAM transfers for performance and power efficiency.
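The tiling strategy described above can be illustrated with a minimal sketch. This is not the library's actual compiler code, only a hypothetical pure-Python model of the idea: a matrix multiplication is split into fixed-size tiles so that each block of compute touches a small working set (the analogue of on-chip SRAM), with partial products accumulated per tile.

```python
# Illustrative sketch only: models the tiling of a matmul into small
# blocks, as a compiler targeting on-chip SRAM would. The function name
# and tile size are hypothetical, not part of the library's API.

def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (m x k) and b (k x n) tile by tile."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                # Each (i0, j0, p0) block is one tile of compute whose
                # operands fit in a small working set, standing in for
                # SRAM-resident data that avoids repeated DRAM transfers.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        for p in range(p0, min(p0 + tile, k)):
                            c[i][j] += a[i][p] * b[p][j]
    return c
```

The result is identical to a plain matmul; only the loop order changes, which is what lets real hardware keep each tile's operands in fast local memory.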

Quick Start & Requirements

  • Install via pip: pip install intel-npu-acceleration-library
  • Requires an available NPU (check system compatibility).
  • Supported OS: Ubuntu (Linux), Windows. macOS is not supported.
  • Recommended: Latest NPU drivers.
  • Documentation: Intel® NPU Acceleration Library Documentation

Highlighted Details

  • Supports 8-bit quantization, 4-bit quantization with GPTQ, NPU-native mixed precision, and Float16.
  • Integrates with torch.compile for NPU optimization (torch.compile is not supported on Windows; use the explicit intel_npu_acceleration_library.compile instead).
  • Includes examples for running MatMul operations and Hugging Face models (e.g., TinyLlama) on the NPU.
  • Feature roadmap included key enhancements like BFloat16 support and NPU/GPU hetero compute (some features marked as not implemented).
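The quantization modes listed above follow the standard weight-quantization scheme. As a minimal sketch (not the library's implementation; function names are hypothetical), symmetric per-tensor 8-bit quantization maps floats to int8 via a single scale factor:

```python
# Hypothetical sketch of symmetric per-tensor 8-bit quantization,
# the general scheme behind the library's 8-bit mode. Not library code.

def quantize_8bit(weights):
    """Map floats to the int8 range [-128, 127] with one shared scale."""
    # Scale so the largest magnitude maps to 127; fall back to 1.0
    # for an all-zero tensor to avoid division by zero.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from int8 values and the scale."""
    return [v * scale for v in q]
```

Each weight then costs one byte instead of four, at the price of a rounding error bounded by half the scale per value.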

Maintenance & Community

This project is no longer under active management by Intel and has been archived. Intel has ceased development, maintenance, bug fixes, and contributions. The project is available for reference, and users are encouraged to fork it for independent development. Intel recommends adopting OpenVINO™ and OpenVINO™ GenAI for NPU acceleration.

Licensing & Compatibility

The license is not explicitly stated in the provided README text. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The project is officially End-of-Life and will receive no further updates, bug fixes, or maintenance from Intel. macOS is not supported, and torch.compile is not supported on Windows. Users are directed to OpenVINO™ and OpenVINO™ GenAI for current NPU acceleration solutions.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 30 days

Starred by Nat Friedman (former CEO of GitHub), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 15 more.

Explore Similar Projects

FasterTransformer by NVIDIA — optimized transformer library for inference. Top 0.1% on SourcePulse, 6k stars. Created 4 years ago; updated 1 year ago.