antares by microsoft

Compiler solution for PyTorch operator optimization on diverse accelerators

Created 5 years ago

466 stars

Top 65.2% on SourcePulse

4 Experts Love This Project

Paras Jain

Cofounder of Genmo

Cody Yu

Coauthor of vLLM; MTS at OpenAI

Lianmin Zheng

Coauthor of SGLang, vLLM

Zhiqiang Xie

Coauthor of SGLang

Project Summary

Antares (AutoRT) is a compiler solution that lets PyTorch users create, benchmark, and optimize custom operators for a range of hardware accelerators. It targets researchers and developers who need to push performance limits or connect PyTorch to custom hardware backends. It can both accelerate standard PyTorch applications and generate custom or fused operators.

How It Works

Antares uses an intermediate representation (IR) to define operations, which it then compiles and optimizes for a specific backend. Because operators are defined abstractly, the same definition can be compiled for different hardware targets such as DirectX 12, CUDA, ROCm, and SYCL. Operators can be generated either programmatically through an API or from the command line, and the system includes an integrated tuning mechanism.
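The flow described above (abstract operator definition, backend-specific code generation, then tuning) can be illustrated with a small toy model. This is a sketch only and does not use the Antares or AutoRT API: the names `BACKENDS`, `compile_op`, and `tune`, the expression string, and the generated "kernel" text are all hypothetical, and the cost function stands in for timing a real kernel on a device.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical code generators: each turns an abstract expression plus one
# tuning parameter into backend-flavored source text. Not Antares output.
Codegen = Callable[[str, int], str]


def cuda_codegen(expr: str, block_size: int) -> str:
    return f"// cuda, block={block_size}\n__global__ void kernel() {{ {expr}; }}"


def sycl_codegen(expr: str, block_size: int) -> str:
    return f"// sycl, work_group={block_size}\nvoid kernel() {{ {expr}; }}"


# Registry mapping a backend name to its generator (illustrative only).
BACKENDS: Dict[str, Codegen] = {"cuda": cuda_codegen, "sycl": sycl_codegen}


def compile_op(expr: str, backend: str, block_size: int) -> str:
    """Lower one abstract operator definition for the chosen backend."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return BACKENDS[backend](expr, block_size)


def tune(
    expr: str,
    backend: str,
    candidates: List[int],
    cost: Callable[[int], float],
) -> Tuple[int, str]:
    """Pick the candidate with the lowest cost and compile with it.

    In a real tuner, `cost` would run the generated kernel and time it.
    """
    best = min(candidates, key=cost)
    return best, compile_op(expr, backend, best)


if __name__ == "__main__":
    expr = "out[i] = a[i] + b[i]"  # placeholder syntax, not Antares IR
    best, source = tune(expr, "cuda", [32, 128, 512], lambda b: abs(b - 128))
    print(best)
    print(source)
```

The point of the sketch is the separation of concerns: the expression never mentions a backend, and the tuner only sees an opaque cost per candidate configuration.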

Quick Start & Requirements

Install via pip: pip install autort
Requires Python 3.x.
Experimental support for Windows DirectX 12 and Linux CUDA.
Official documentation and tutorials are available.
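After installing, a quick environment check can confirm the two requirements listed above. The sketch below uses only the Python standard library. It assumes that the importable module name matches the pip package name `autort`, which the text above does not confirm, and `check_environment` is a hypothetical helper name.

```python
import importlib.util
import sys


def check_environment(package: str = "autort") -> dict:
    """Report whether Python 3 is running and whether `package` is importable.

    The default module name "autort" is an assumption based on the pip
    package name; adjust it if the import name differs.
    """
    return {
        "python_ok": sys.version_info.major == 3,
        "package_installed": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

`importlib.util.find_spec` only looks the module up and does not import it, so the check has no side effects on accelerator drivers.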

Highlighted Details

Supports multi-platform kernel generation and optimization (CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL, OpenCL, Android).
Enables creation of custom-defined or fused operators beyond PyTorch's built-in functions.
Can serve as a benchmark utility for device performance testing and profiling.
Demonstrates integration with PyTorch 2.0 for applications like sorting, MNIST, and Llama models.
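To make the ideas of a fused operator and a benchmark utility concrete, here is a minimal pure-Python sketch. It is not Antares code and does not measure any accelerator: the function names are hypothetical, and plain lists stand in for tensors. The unfused version makes two passes and builds an intermediate list, while the fused version computes the same result in one pass.

```python
import math
import time
from typing import Callable, List


def scale_then_sigmoid_unfused(xs: List[float], scale: float) -> List[float]:
    # Pass 1 materializes an intermediate list; pass 2 applies the sigmoid.
    scaled = [x * scale for x in xs]
    return [1.0 / (1.0 + math.exp(-v)) for v in scaled]


def scale_then_sigmoid_fused(xs: List[float], scale: float) -> List[float]:
    # A single pass computes scale and sigmoid together, with no intermediate.
    return [1.0 / (1.0 + math.exp(-(x * scale))) for x in xs]


def benchmark(fn: Callable[..., object], *args: object, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    data = [i / 1000.0 for i in range(100_000)]
    print("unfused:", benchmark(scale_then_sigmoid_unfused, data, 2.0))
    print("fused:  ", benchmark(scale_then_sigmoid_fused, data, 2.0))
```

On real accelerators the benefit of fusion usually comes from avoiding extra memory traffic between kernels. Timings from this sketch say nothing about that and will vary by machine.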

Maintenance & Community

Developed by Microsoft.
Encourages community contributions via issues and stars.

Licensing & Compatibility

License details are not explicitly stated in the provided README snippet, but Microsoft's open-source projects typically use permissive licenses like MIT. Further clarification on licensing is recommended for commercial use.

Limitations & Caveats

Support for platforms like ROCm, OpenCL, SYCL, and Apple Metal is listed as experimental or requested for future releases.
The README indicates experimental versions for Windows DirectX 12 and Linux CUDA, suggesting potential instability or incomplete features.

Health Check

Last Commit

8 months ago

Responsiveness

1 day

Star History

0 stars in the last 30 days