Transformer implementation in Triton
This repository provides a PyTorch implementation of the Transformer architecture that routes performance-critical operations through Triton kernels. The goal is faster, more memory-efficient training for researchers and engineers working with large language models.
How It Works
The core innovation lies in rewriting key Transformer components, such as layernorm and softmax, using Triton kernels. This approach allows for fine-grained control over GPU memory access and computation, potentially leading to significant speedups and reduced memory footprint compared to standard PyTorch implementations. The project is actively developing backward pass kernels and exploring fused operations for further optimization.
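To illustrate the technique (this is a minimal sketch, not the repository's actual kernel; `softmax_kernel` and `triton_softmax` are hypothetical names), a numerically stable row-wise softmax written in Triton looks roughly like this:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # one program instance per row; the whole row is processed in one block
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(in_ptr + row * n_cols + cols, mask=mask, other=-float('inf'))
    x = x - tl.max(x, axis=0)  # subtract the row max for numerical stability
    num = tl.exp(x)
    tl.store(out_ptr + row * n_cols + cols, num / tl.sum(num, axis=0), mask=mask)

def triton_softmax(x: torch.Tensor) -> torch.Tensor:
    # assumes a 2D, contiguous CUDA tensor
    x = x.contiguous()
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)  # tl.arange requires a power of 2
    softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK_SIZE=BLOCK_SIZE)
    return out
```

Because each row is loaded once and reduced in place, a fused kernel like this avoids the extra global-memory round trips of an unfused max/exp/sum sequence in standard PyTorch, which is where the speedup and memory savings come from.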
Quick Start & Requirements
Install the package with pip:

pip install triton-transformer

Then import the `Transformer` class from `triton_transformer` and instantiate it with the desired parameters. Example usage for GPT- and BERT-style models is provided in the README.
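A minimal usage sketch for a GPT-style model follows; the exact constructor arguments shown here (`num_tokens`, `max_seq_len`, `dim`, `depth`, `heads`, `causal`, `use_triton`) are assumptions based on the README's description and may differ from the current release:

```python
import torch
from triton_transformer import Transformer

# hypothetical parameter values for a small GPT-style model;
# argument names follow the README's example but are not guaranteed to match
model = Transformer(
    num_tokens = 256,    # vocabulary size
    max_seq_len = 1024,  # maximum sequence length
    dim = 512,           # model width
    depth = 6,           # number of transformer blocks
    heads = 8,           # attention heads
    causal = True,       # autoregressive (GPT-style) masking
    use_triton = True    # route layernorm/softmax through Triton kernels
).cuda()

tokens = torch.randint(0, 256, (1, 1024)).cuda()
logits = model(tokens)  # (1, 1024, 256)
```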
Maintenance & Community
The project was created by "lucidrains" and appears to be a personal learning project, as indicated by its "wip" status and the author's self-description as being new to low-level neural network code. No explicit community links or notable contributors are mentioned.
Licensing & Compatibility
The README does not explicitly state a license. The project cites papers related to Triton, Transformers, and efficient model architectures.
Limitations & Caveats
This project is marked as "wip" (work in progress) and is described as a learning experience. Several key components, including backward passes for matrix multiplication and fused attention, are still under development. Performance benchmarks and optimizations are also pending.