tensorli by joennlae

Minimalist GPT-like transformer implementation for educational purposes

Created 2 years ago · 253 stars · Top 99.4% on SourcePulse

Project Summary

This project provides an absolutely minimal implementation of a GPT-like transformer using only NumPy, targeting developers and researchers who want to understand the core mechanics of transformer models without the complexity of large frameworks. It serves as a learning tool for building and training a transformer architecture from scratch.

How It Works

Tensorli implements a GPT-like transformer using a custom Tensorli object that mimics PyTorch's tensor functionality, built entirely on NumPy. It includes automatic differentiation and essential neural network components like Linearli, Embeddingli, MultiheadAttentionli, and LayerNorm, along with the Adamli optimizer. This approach prioritizes clarity and educational value over performance or scalability.
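
The repository's internals are not reproduced here, but the core idea behind a Tensorli-style object can be sketched in a few dozen lines: a NumPy array wrapper that records each operation's inputs and a closure for its local gradient, then walks the graph in reverse. The class name comes from the project; everything else below (method names, graph bookkeeping) is an illustrative assumption, not the repository's actual API:

```python
import numpy as np

class Tensorli:
    """Illustrative NumPy-backed tensor with reverse-mode autodiff
    (a sketch of the idea, not the project's actual implementation)."""

    def __init__(self, data, _children=()):
        self.data = np.asarray(data, dtype=np.float32)
        self.grad = np.zeros_like(self.data)
        self._backward = lambda: None   # closure filled in by each op
        self._prev = set(_children)     # inputs that produced this tensor

    def __add__(self, other):
        out = Tensorli(self.data + other.data, (self, other))
        def _backward():  # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __matmul__(self, other):
        out = Tensorli(self.data @ other.data, (self, other))
        def _backward():  # for C = A @ B: dA = dC @ B.T, dB = A.T @ dC
            self.grad += out.grad @ other.data.T
            other.grad += self.data.T @ out.grad
        out._backward = _backward
        return out

    def relu(self):
        out = Tensorli(np.maximum(self.data, 0.0), (self,))
        def _backward():  # gradient passes only where the input was positive
            self.grad += (out.data > 0) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(t):
            if t not in visited:
                visited.add(t)
                for child in t._prev:
                    build(child)
                topo.append(t)
        build(self)
        self.grad = np.ones_like(self.data)  # seed: gradient of sum(output)
        for t in reversed(topo):
            t._backward()

# Tiny check: gradients of sum(relu(x @ w + b)) with respect to w
x = Tensorli(np.random.randn(2, 3))
w = Tensorli(np.random.randn(3, 4))
b = Tensorli(np.zeros((2, 4)))
y = (x @ w + b).relu()
y.backward()
print(w.grad.shape)  # (3, 4)
```

A real implementation additionally needs broadcasting-aware gradients, softmax, and the attention and normalization layers named above, but the graph-plus-closures pattern stays the same.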

Quick Start & Requirements

  • Install via Conda: conda env create -f environment.yml or mamba env create -f environment.yml.
  • Activate environment: conda activate tensorli.
  • Set Python path: export PYTHONPATH=$PWD.
  • Run tests: pytest.
  • Requires Python and NumPy.
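
Collected into a single shell session, the steps above are (conda shown; mamba works as a drop-in replacement for the first command):

```bash
# Create and activate the environment
conda env create -f environment.yml
conda activate tensorli

# Make the package importable from the repo root, then run the test suite
export PYTHONPATH=$PWD
pytest
```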

Highlighted Details

  • Implements automatic differentiation.
  • Includes core NN layers and the Adam optimizer (see the sketch after this list).
  • Demonstrates a functional GPT-like transformer architecture.
  • Inspired by minGPT and tinygrad.
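
The Adam update itself is standard; below is a minimal NumPy sketch of the step an Adamli-style optimizer performs. The function name, hyperparameter defaults, and stand-in loss are illustrative, not the project's code:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a NumPy parameter array (t is the 1-based step)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) EMA
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (variance) EMA
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Usage: keep (m, v) state per parameter and increment t each step.
w = np.random.randn(3, 4)
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 4):
    g = 2 * w  # gradient of ||w||^2, as a stand-in loss
    w, m, v = adam_step(w, g, m, v, t)
```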

Maintenance & Community

This appears to be a personal learning project; the README makes no mention of contributors, sponsorships, or community channels.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial or closed-source use is not addressed.

Limitations & Caveats

The library is not optimized and is not intended for production use or large-scale applications; it is purely a learning tool. Dropout and additional experimental architectures are planned but not yet implemented.
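
For reference, the standard inverted-dropout forward pass that such a planned layer would typically compute looks like this in NumPy (an illustrative sketch, not the project's code):

```python
import numpy as np

def dropout(x, p=0.1, training=True, rng=None):
    """Inverted dropout: zero each activation with probability p during
    training and scale survivors by 1/(1-p), so eval needs no rescaling."""
    if not training or p == 0.0:
        return x
    rng = rng if rng is not None else np.random.default_rng()
    mask = rng.random(x.shape) >= p   # keep each entry with probability 1 - p
    return x * mask / (1.0 - p)

h = np.random.randn(4, 8)
print(dropout(h, p=0.1).shape)  # (4, 8); roughly 10% of entries zeroed
```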

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

DiT by facebookresearch

PyTorch implementation for diffusion models with transformers (DiT)

  • Top 0.3% on SourcePulse · 8k stars
  • Created 2 years ago · Updated 1 year ago
  • Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 9 more.

FasterTransformer by NVIDIA

Optimized transformer library for inference

  • Top 0.1% on SourcePulse · 6k stars
  • Created 4 years ago · Updated 1 year ago
  • Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 15 more.