tt-metal  by tenstorrent

Neural network operator library and low-level kernel programming model

created 2 years ago
1,081 stars

Top 35.0% on SourcePulse

Project Summary

This repository provides TT-NN, a Python and C++ neural network operator library, and TT-Metalium, a low-level kernel programming model for Tenstorrent hardware. It targets AI researchers and developers looking to optimize and deploy models on Tenstorrent's specialized AI accelerators, offering high performance for LLMs, CNNs, and other AI workloads.

How It Works

TT-NN is built on TT-Metalium, a C++ framework for programming Tenstorrent hardware directly, including its matrix and vector engines. This low-level control allows fine-grained optimization of data movement, computation, and parallelism (Tensor Parallelism and Data Parallelism) across the chip's architecture, which can yield significant performance gains over approaches that lack direct hardware control.
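The two parallelism strategies named above can be illustrated with a small conceptual sketch. It uses plain Python lists in place of tensors and simulates "devices" with list slices; the function names (`tensor_parallel_matmul`, `data_parallel_matmul`, and the helpers) are hypothetical and are not part of the TT-NN or TT-Metalium APIs. It only shows how the work is divided, not how Tenstorrent hardware schedules it.

```python
# Conceptual sketch only: plain Python lists stand in for tensors, and
# "devices" are simulated by slices. None of this is TT-NN API.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split_columns(w, parts):
    """Split weight matrix w into `parts` column shards of equal width."""
    n = len(w[0])
    assert n % parts == 0, "column count must divide evenly across devices"
    width = n // parts
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(parts)]


def tensor_parallel_matmul(x, w, num_devices):
    """Tensor Parallelism: each device holds one column shard of w and
    receives the full input x; the partial outputs are concatenated."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_devices)]
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


def data_parallel_matmul(x, w, num_devices):
    """Data Parallelism: each device holds all of w and processes one
    slice of the batch; the per-device outputs are stacked."""
    assert len(x) % num_devices == 0, "batch must divide evenly across devices"
    step = len(x) // num_devices
    out = []
    for d in range(num_devices):
        out.extend(matmul(x[d * step:(d + 1) * step], w))
    return out


x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 inputs, 2 features each
w = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4 weight matrix
reference = matmul(x, w)

# Both strategies reproduce the single-device result.
assert tensor_parallel_matmul(x, w, 2) == reference
assert data_parallel_matmul(x, w, 2) == reference
```

In this sketch, Tensor Parallelism divides the model's weights while every device sees the whole batch, whereas Data Parallelism replicates the weights and divides the batch. A real deployment would also have to account for the cost of moving shards and results between devices.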

Quick Start & Requirements

  • Install: Refer to INSTALLING.md.
  • Prerequisites: Tenstorrent hardware (e.g., Grayskull, Wormhole). Required Python versions are not explicitly stated for the core library, though specific demos or integrations may have their own requirements. CUDA is not needed, since the library targets Tenstorrent accelerators rather than NVIDIA GPUs.
  • Resources: The README provides performance benchmarks for LLMs, CNNs, and NLP models across several Tenstorrent hardware configurations (n150, n300, QuietBox, Galaxy).
  • Links: API Reference, Model Demos, TT-Metalium API Reference.

Highlighted Details

  • Achieves high performance for LLMs like Llama 3.1 70B (1443.2 t/s) and CNNs like ResNet-50 (96,800 fps) on Tenstorrent hardware.
  • Supports advanced parallelism techniques including Tensor Parallelism (TP) and Data Parallelism (DP).
  • Includes a comprehensive set of programming guides and tech reports for kernel development and model optimization.
  • Features a bounty program to incentivize community contributions.
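To put the throughput figures quoted above in perspective, the short calculation below converts them into the average time per token and per frame. It assumes both numbers are aggregate throughput; the result is an average implied by that throughput and not a per-request latency, because batching and pipelining let many items be in flight at once. The function name `avg_seconds_per_item` is illustrative.

```python
# Figures quoted in the list above, treated here as aggregate throughput.
LLAMA_TOKENS_PER_S = 1443.2    # Llama 3.1 70B, tokens per second
RESNET_FRAMES_PER_S = 96_800   # ResNet-50, frames per second


def avg_seconds_per_item(throughput_per_s):
    """Return the average time per item implied by a throughput figure."""
    if throughput_per_s <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / throughput_per_s


ms_per_token = avg_seconds_per_item(LLAMA_TOKENS_PER_S) * 1e3
us_per_frame = avg_seconds_per_item(RESNET_FRAMES_PER_S) * 1e6

print(f"{ms_per_token:.3f} ms per token on average")
print(f"{us_per_frame:.2f} us per frame on average")
```

This works out to roughly 0.69 ms per token and 10.3 µs per frame on average, for whichever hardware configuration each benchmark was measured on.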

Maintenance & Community

  • Active development with frequent releases (e.g., v0.57.0-rc71).
  • Community engagement via Discord.
  • Opportunities for contribution and employment are highlighted.

Licensing & Compatibility

  • The repository participates in Tenstorrent's bounty program, which indicates that open-source contributions are welcome. The license (e.g., MIT, Apache) is not explicitly stated in the README, so suitability for commercial use should be confirmed against the repository's license before adoption.

Limitations & Caveats

  • The primary limitation is the requirement for Tenstorrent-specific hardware, making it inaccessible without purchasing their accelerators. Performance figures are tied to specific hardware configurations and model versions.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1,076
  • Issues (30d): 1,064

Star History

  • 89 stars in the last 30 days
