tt-forge by tenstorrent

Tenstorrent's MLIR compiler stack for AI hardware

Created 1 year ago

274 stars

Top 94.3% on SourcePulse

Project Summary

AI developers can leverage Tenstorrent's TT-Forge to run and train AI workloads on Tenstorrent hardware through an open-source, MLIR-based compiler stack. It aims to provide a general and performant solution, simplifying the deployment of complex models from frameworks like PyTorch, JAX, and ONNX across various Tenstorrent hardware configurations.

How It Works

TT-Forge integrates several components. Frontends (TT-XLA for PyTorch/JAX, TT-Forge-ONNX for ONNX/TF/Paddle) convert models into MLIR dialects (StableHLO, TTIR). The core TT-MLIR compiler optimizes these graphs and lowers them to the TTNN and TTKernel dialects, which the TT-Metalium runtime then executes on Tenstorrent hardware. TT-Lang offers a Python DSL for developing custom, high-performance kernels that abstracts low-level hardware complexity.
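The flow described above can be written down as a small lookup table. This is only an illustrative sketch in plain Python: the names `FRONTENDS`, `PIPELINE`, `frontend_for`, and `describe_path` are hypothetical and are not part of any TT-Forge API. It does nothing more than restate which frontend handles which framework and the order of the stages.

```python
# Illustrative only: a plain-Python restatement of the stack described above.
# None of these names come from TT-Forge; they are hypothetical helpers.

# Frontend -> source frameworks it accepts, as listed in the summary.
FRONTENDS = {
    "TT-XLA": ("PyTorch", "JAX"),
    "TT-Forge-ONNX": ("ONNX", "TF", "Paddle"),
}

# Ordered stages a model passes through, with the role of each stage.
PIPELINE = (
    ("frontend", "converts the model into MLIR dialects (StableHLO, TTIR)"),
    ("TT-MLIR", "optimizes the graph and lowers it to TTNN and TTKernel"),
    ("TT-Metalium", "executes the lowered program on Tenstorrent hardware"),
)


def frontend_for(framework: str) -> str:
    """Return the frontend listed for a framework (case-insensitive)."""
    wanted = framework.lower()
    for frontend, frameworks in FRONTENDS.items():
        if wanted in (name.lower() for name in frameworks):
            return frontend
    raise ValueError(f"no frontend listed for {framework!r}")


def describe_path(framework: str) -> list[str]:
    """List the stages a model from `framework` would pass through."""
    frontend = frontend_for(framework)
    steps = [f"{framework} model"]
    for stage, role in PIPELINE:
        name = frontend if stage == "frontend" else stage
        steps.append(f"{name}: {role}")
    return steps


if __name__ == "__main__":
    for line in describe_path("JAX"):
        print(line)
```

Keeping the framework-to-frontend mapping as data makes it easy to see that every supported framework converges on the same TT-MLIR and TT-Metalium stages after the frontend.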

Quick Start & Requirements

Install from Tenstorrent's custom PyPI index:

pip install tt-forge --extra-index-url https://pypi.eng.aws.tenstorrent.com/

The setup guide specifies Ubuntu 24.04 and Python 3.12. Specific examples may need additional dependencies such as torchvision. Official documentation and hardware details are available.
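As a rough pre-flight check, the requirements above (Ubuntu 24.04, Python 3.12, the extra index URL) can be verified from Python before installing. This is a minimal sketch using only the standard library. The helper names `parse_os_release`, `check_environment`, and `pip_command` are hypothetical, and treating any other OS or Python version as a "problem" is an assumption, since the summary only says what the setup guide specifies, not what else might work.

```python
import shlex
import sys

# Index URL taken from the install command in this summary.
INDEX_URL = "https://pypi.eng.aws.tenstorrent.com/"


def parse_os_release(text: str) -> dict[str, str]:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def check_environment(python_version, os_release: dict[str, str]) -> list[str]:
    """Return a list of mismatches against the documented setup."""
    problems = []
    if tuple(python_version[:2]) != (3, 12):
        problems.append("setup guide specifies Python 3.12")
    if os_release.get("ID") != "ubuntu" or os_release.get("VERSION_ID") != "24.04":
        problems.append("setup guide specifies Ubuntu 24.04")
    return problems


def pip_command(package: str = "tt-forge", index_url: str = INDEX_URL) -> list[str]:
    """Build the install command for the current interpreter (does not run it)."""
    return [sys.executable, "-m", "pip", "install", package,
            "--extra-index-url", index_url]


if __name__ == "__main__":
    try:
        with open("/etc/os-release", encoding="utf-8") as handle:
            release = parse_os_release(handle.read())
    except OSError:
        release = {}  # not a Linux system, or the file is missing
    for problem in check_environment(sys.version_info, release):
        print("warning:", problem)
    print("install with:", shlex.join(pip_command()))
```

The script only prints the command rather than running it, so the decision to install into the current environment stays with the user.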

Highlighted Details

Supports over 800 model variants tested in CI, including large models like Llama 3 70B and Stable Diffusion XL.
Encompasses both inference and training capabilities.
Offers multi-chip support for specific hardware configurations (e.g., N300+).
TT-Lang provides a Python-based approach for custom kernel development.

Maintenance & Community

Community support is available via Discord. Tenstorrent also runs a bounty program for contributions, with details available in the issues tab.

Licensing & Compatibility

The repository's README does not explicitly state a software license. Confirm the license terms before adoption, particularly for commercial use or derivative works.

Limitations & Caveats

The TT-Lang Python DSL for custom kernel development is currently in an "early preview" state. Installation relies on a custom PyPI index, which may indicate a less mature or less widely available distribution channel.
