triton-transformer by lucidrains

Transformer implementation in Triton

Created 4 years ago
274 stars

Top 94.3% on SourcePulse

1 Expert Loves This Project
Project Summary

This repository provides a PyTorch implementation of the Transformer architecture, with a focus on leveraging Triton for optimized performance. It aims to offer a faster and more efficient training experience for researchers and engineers working with large language models.

How It Works

The core innovation lies in rewriting key Transformer components, such as layernorm and softmax, using Triton kernels. This approach allows for fine-grained control over GPU memory access and computation, potentially leading to significant speedups and reduced memory footprint compared to standard PyTorch implementations. The project is actively developing backward pass kernels and exploring fused operations for further optimization.
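To make this concrete, below is a minimal, generic Triton softmax kernel of the kind described; it is an illustrative sketch rather than code from this repository, and the names softmax_kernel and triton_softmax are placeholders. Each program instance handles one row of a contiguous 2D tensor, subtracting the row maximum before exponentiating for numerical stability.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # One program instance normalizes one row of a contiguous (rows, n_cols) tensor.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols

    # Out-of-bounds lanes load -inf so they contribute nothing to the max or the sum.
    x = tl.load(in_ptr + row * n_cols + cols, mask=mask, other=-float('inf'))

    # Numerically stable softmax: shift by the row maximum before exponentiating.
    x = x - tl.max(x, axis=0)
    num = tl.exp(x)
    y = num / tl.sum(num, axis=0)

    tl.store(out_ptr + row * n_cols + cols, y, mask=mask)

def triton_softmax(x):
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    # BLOCK_SIZE must be a power of two that covers the row length.
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

Fusing the max, exponential, and normalization into a single kernel launch, with explicit control over loads and stores, is where the potential memory-footprint and speed gains over a chain of standard PyTorch ops come from.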

Quick Start & Requirements

  • Install: pip install triton-transformer
  • Requirements: PyTorch, Triton, CUDA-enabled GPU.
  • Usage: Import Transformer from triton_transformer and instantiate it with the desired parameters. Example usage for GPT- and BERT-style models is provided in the README; a hedged sketch follows below.
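This summary does not reproduce the README's exact example, so the snippet below is a hedged sketch of the described usage; the constructor arguments (num_tokens, max_seq_len, dim, depth, heads, causal, use_triton) are assumed names for illustration, and the repository's README is authoritative.

```python
import torch
from triton_transformer import Transformer

# Hyperparameter names below are illustrative assumptions, not the exact API --
# consult the repository README for the real constructor signature.
model = Transformer(
    num_tokens = 256,      # vocabulary size
    max_seq_len = 1024,    # maximum sequence length
    dim = 512,             # model width
    depth = 6,             # number of Transformer blocks
    heads = 8,             # attention heads
    causal = True,         # GPT-style; set to False for BERT-style attention
    use_triton = True      # assumed flag that toggles the Triton kernels
).cuda()

tokens = torch.randint(0, 256, (1, 1024)).cuda()
logits = model(tokens)     # expected shape: (1, 1024, 256)
```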

Highlighted Details

  • Implements Transformer layers entirely in Triton.
  • Includes Triton-optimized forward and backward passes for layernorm (a generic kernel sketch follows this list).
  • Features Triton-based softmax and cross-entropy loss.
  • Supports both causal (GPT-like) and non-causal (BERT-like) attention mechanisms.
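As background for the layernorm bullet above, the following is a generic forward-pass layernorm kernel in Triton, written as an illustration rather than the repository's implementation; it assumes a contiguous (rows, n_cols) input and uses placeholder names (layernorm_fwd_kernel, triton_layernorm).

```python
import torch
import triton
import triton.language as tl

@triton.jit
def layernorm_fwd_kernel(x_ptr, weight_ptr, bias_ptr, out_ptr,
                         n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # One program instance normalizes one row over its feature dimension.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols

    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)

    # Per-row mean and variance, computed in float32 for stability.
    mean = tl.sum(x, axis=0) / n_cols
    diff = tl.where(mask, x - mean, 0.0)
    var = tl.sum(diff * diff, axis=0) / n_cols
    inv_std = 1.0 / tl.sqrt(var + eps)

    # Normalize, then apply the learned scale and shift.
    w = tl.load(weight_ptr + cols, mask=mask, other=1.0)
    b = tl.load(bias_ptr + cols, mask=mask, other=0.0)
    y = (x - mean) * inv_std * w + b

    tl.store(out_ptr + row * n_cols + cols, y, mask=mask)

def triton_layernorm(x, weight, bias, eps=1e-5):
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    layernorm_fwd_kernel[(n_rows,)](x, weight, bias, out, n_cols, eps, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

A matching backward pass, as highlighted above, additionally needs the per-row mean and inverse standard deviation saved from the forward pass in order to produce gradients for the input, scale, and shift.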

Maintenance & Community

The project was created by "lucidrains" and appears to be a personal learning project, as indicated by its "wip" status and the author's self-description as being new to low-level neural network code. No explicit community links or notable contributors are mentioned.

Licensing & Compatibility

The README does not explicitly state a license. The project cites papers related to Triton, Transformers, and efficient model architectures.

Limitations & Caveats

This project is marked as "wip" (work in progress) and is described as a learning experience. Several key components, including backward passes for matrix multiplication and fused attention, are still under development. Performance benchmarks and optimizations are also pending.

Health Check

  • Last Commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 5 more.

attorch by BobMcDear

0.2%
576 stars
PyTorch nn module subset, implemented in Python using Triton
Created 2 years ago
Updated 1 month ago
Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Lewis Tunstall (Research Engineer at Hugging Face), and 4 more.

fastformers by microsoft

0%
707 stars
NLU optimization recipes for transformer models
Created 5 years ago
Updated 6 months ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 5 more.

matmulfreellm by ridgerchu

0.0%
3k stars
MatMul-free language models
Created 1 year ago
Updated 1 month ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 15 more.

FasterTransformer by NVIDIA

0.1%
6k stars
Optimized transformer library for inference
Created 4 years ago
Updated 1 year ago