intel-extension-for-pytorch by intel

PyTorch extension for performance boost on Intel platforms

Created 5 years ago
1,986 stars

Top 22.2% on SourcePulse

View on GitHub
Project Summary

This package extends PyTorch to optimize performance on Intel hardware, targeting developers and researchers working with AI models, particularly Large Language Models (LLMs). It leverages Intel's specialized hardware instructions, such as AVX-512 VNNI and AMX on CPUs and XMX on discrete GPUs, to accelerate computations, and offers an xpu device for Intel discrete GPU acceleration.

How It Works

The extension integrates with PyTorch to automatically apply optimizations for Intel architectures. It specifically targets LLMs with techniques such as an indirect-access KV cache, fused RoPE (rotary position embedding), and customized linear kernels. These optimizations aim to deliver significant performance gains over standard PyTorch on compatible Intel hardware, enabling faster training and inference for demanding AI workloads.
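As a rough illustration, a minimal sketch of applying the extension's LLM path, assuming the package and Hugging Face Transformers are installed (the model name is only an example, and the exact ipex.llm.optimize arguments may differ across releases; see the LLM Example link below):

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM

# Load a supported LLM in BF16 (model name chosen for illustration).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)
model.eval()

# Apply the LLM-specific optimizations described above (indirect-access
# KV cache, fused RoPE, customized linear kernels) for inference.
model = ipex.llm.optimize(model, dtype=torch.bfloat16, inplace=True)
```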

Quick Start & Requirements

  • Installation: Typically installed via pip (a minimal sketch follows this list).
  • Prerequisites: Requires PyTorch. Optimized performance relies on Intel hardware with AVX-512 VNNI, AMX, or XMX capabilities. GPU acceleration requires an Intel discrete GPU.
  • Resources: Performance benefits are hardware-dependent.
  • Links: CPU Quick Start, GPU Quick Start, Documentation, LLM Example
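A hedged sketch of a typical CPU install-and-run flow (torchvision is used here only to get a demo model; exact pip arguments for GPU builds differ, so consult the quick-start links above):

```python
# Install (CPU build): python -m pip install intel-extension-for-pytorch
import torch
import intel_extension_for_pytorch as ipex
import torchvision.models as models

model = models.resnet50(weights=None).eval()

# ipex.optimize applies Intel-specific operator fusions and memory-layout
# changes; BF16 benefits from AVX-512/AMX-capable CPUs.
model = ipex.optimize(model, dtype=torch.bfloat16)

with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```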

Highlighted Details

  • Extensive LLM support, including Llama, Qwen, Phi, Mistral, and others, with optimizations for FP32, BF16, INT8, and INT4 quantization.
  • Provides module-level optimization APIs (prototype) for custom LLM acceleration.
  • Supports the PyTorch xpu device for Intel discrete GPU acceleration (a short sketch follows this list).
  • Optimizations include an indirect-access KV cache, fused RoPE, and customized linear kernels.
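To illustrate the xpu device, a minimal sketch assuming a GPU build of the extension and a supported Intel discrete GPU:

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the xpu backend

if torch.xpu.is_available():
    # Tensors and modules move to the Intel GPU exactly like "cuda".
    x = torch.randn(4096, 4096, device="xpu")
    w = torch.randn(4096, 4096, device="xpu")
    y = x @ w                   # matmul dispatched to XMX-capable hardware
    torch.xpu.synchronize()     # wait for the asynchronous kernel to finish
    print(y.device)             # xpu:0
else:
    print("No Intel GPU / xpu backend available")
```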

Maintenance & Community

  • Managed via GitHub issues for bug tracking and feature requests.
  • GitHub Issues

Licensing & Compatibility

  • License: Apache License, Version 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

Performance gains are tied exclusively to Intel hardware. The module-level optimization APIs are marked as a prototype feature, so they may change or be unstable across releases.

Health Check

  • Last Commit: 10 hours ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 8
  • Star History: 16 stars in the last 30 days

Explore Similar Projects

Starred by Wing Lian (Founder of Axolotl AI) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

airllm by lyogavin

0.7% · 6k stars
Inference optimization for LLMs on low-resource hardware
Created 2 years ago
Updated 2 months ago
Starred by Lianmin Zheng (Coauthor of SGLang, vLLM), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB

0.1% · 8k stars
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 3 weeks ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), and 40 more.

unsloth by unslothai

0.6% · 48k stars
Finetuning tool for LLMs, targeting speed and memory efficiency
Created 1 year ago
Updated 5 hours ago