triton-transformer by lucidrains

Transformer implementation in Triton

created 3 years ago
273 stars

Top 95.3% on sourcepulse

View on GitHub
Project Summary

This repository provides a PyTorch implementation of the Transformer architecture, with a focus on leveraging Triton for optimized performance. It aims to offer a faster and more efficient training experience for researchers and engineers working with large language models.

How It Works

The core innovation lies in rewriting key Transformer components, such as layernorm and softmax, using Triton kernels. This approach allows for fine-grained control over GPU memory access and computation, potentially leading to significant speedups and reduced memory footprint compared to standard PyTorch implementations. The project is actively developing backward pass kernels and exploring fused operations for further optimization.
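To illustrate the general technique (this is a generic sketch, not the repository's actual kernel; the function names and the contiguous-2D-input assumption are mine), a row-wise softmax written as a Triton kernel looks roughly like this:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # each program instance handles one row of a contiguous 2D tensor
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(in_ptr + row * n_cols + cols, mask=mask, other=-float('inf'))
    x = x - tl.max(x, axis=0)            # subtract the row max for numerical stability
    num = tl.exp(x)
    denom = tl.sum(num, axis=0)
    tl.store(out_ptr + row * n_cols + cols, num / denom, mask=mask)

def softmax(x):
    # x: contiguous 2D CUDA tensor; launch one program per row
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

Because the whole row is loaded once, reduced, and written back inside a single kernel, the intermediate values never round-trip through global memory, which is where the speedups and memory savings come from.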

Quick Start & Requirements

  • Install: pip install triton-transformer
  • Requirements: PyTorch, Triton, CUDA-enabled GPU.
  • Usage: Import Transformer from triton_transformer and instantiate it with the desired parameters; the README provides example usage for GPT- and BERT-style models (see the sketch after this list).
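A minimal usage sketch, assuming a constructor along the lines described in the README (the exact parameter names such as num_tokens, depth, causal, and use_triton are assumptions; consult the README for the real signature):

```python
import torch
from triton_transformer import Transformer

# GPT-style (causal) model -- parameter names here are illustrative assumptions
model = Transformer(
    num_tokens = 256,    # vocabulary size
    max_seq_len = 1024,
    dim = 512,
    depth = 6,
    heads = 8,
    causal = True        # False would give a BERT-style, bidirectional model
).cuda()

tokens = torch.randint(0, 256, (1, 1024)).cuda()
# use_triton toggles the Triton kernels; its placement here is an assumption
logits = model(tokens, use_triton = True)   # (1, 1024, 256)
```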

Highlighted Details

  • Implements Transformer layers entirely in Triton.
  • Includes Triton-optimized forward and backward passes for layernorm (an illustrative forward-pass kernel is sketched after this list).
  • Features Triton-based softmax and cross-entropy loss.
  • Supports both causal (GPT-like) and non-causal (BERT-like) attention mechanisms.
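As a rough sketch of what a fused layernorm forward pass looks like in Triton (a generic illustration, not the repository's kernel; the kernel name and the contiguity assumption are mine):

```python
import triton
import triton.language as tl

@triton.jit
def layernorm_fwd_kernel(x_ptr, w_ptr, b_ptr, out_ptr, n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # one program normalizes one row of a contiguous 2D input
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.).to(tl.float32)
    mean = tl.sum(x, axis=0) / n_cols
    diff = tl.where(mask, x - mean, 0.)
    var = tl.sum(diff * diff, axis=0) / n_cols
    rstd = 1. / tl.sqrt(var + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.)
    b = tl.load(b_ptr + cols, mask=mask, other=0.)
    # normalize, scale, and shift in a single pass over the row
    tl.store(out_ptr + row * n_cols + cols, (x - mean) * rstd * w + b, mask=mask)
```

It would be launched with one program per row, e.g. layernorm_fwd_kernel[(n_rows,)](x, weight, bias, out, n_cols, 1e-5, BLOCK_SIZE=triton.next_power_of_2(n_cols)). The backward pass needs its own kernel, which is part of what the repository implements.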

Maintenance & Community

The project was created by "lucidrains" and appears to be a personal learning project, as indicated by its "wip" status and the author's self-description as new to low-level neural net code. No explicit community links or notable contributors are mentioned.

Licensing & Compatibility

The README does not explicitly state a license. The project cites papers related to Triton, Transformers, and efficient model architectures.

Limitations & Caveats

This project is marked as "wip" (work in progress) and is described as a learning experience. Several key components, including backward passes for matrix multiplication and fused attention, are still under development. Performance benchmarks and optimizations are also pending.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

9 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Sualeh Asif (Cofounder of Cursor), and 1 more.

attorch by BobMcDear

0.3%
564 stars
PyTorch nn module subset, implemented in Python using Triton
created 2 years ago
updated 2 days ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 2 more.

matmulfreellm by ridgerchu

0.1%
3k stars
MatMul-free language models
created 1 year ago
updated 1 week ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 6 more.

x-transformers by lucidrains

0.2%
5k stars
Transformer library with extensive experimental features
created 4 years ago
updated 3 days ago