Minimal PyTorch re-implementation of GPT training and inference
minGPT provides a minimal, educational PyTorch implementation of OpenAI's GPT architecture, suitable for researchers and developers seeking to understand or build upon transformer-based language models. It offers a clean, ~300-line core model implementation for both training and inference, simplifying the complex details often found in larger frameworks.
How It Works
minGPT implements a decoder-only Transformer architecture. It processes sequences of token indices through self-attention and feed-forward layers, outputting probability distributions for the next token. The implementation emphasizes efficient batching over sequence length and examples, a key complexity in optimizing Transformer performance. It includes a Byte Pair Encoding (BPE) tokenizer matching OpenAI's GPT implementation.
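As a rough sketch of what using the core model looks like (the mingpt.model.GPT configuration interface shown here follows the upstream README; treat it as illustrative rather than authoritative):

```python
from mingpt.model import GPT

# Build a GPT-2-sized decoder-only Transformer; vocab_size and block_size
# (the maximum context length) are the two key shape parameters.
model_config = GPT.get_default_config()
model_config.model_type = 'gpt2'
model_config.vocab_size = 50257  # OpenAI GPT-2 BPE vocabulary size
model_config.block_size = 1024   # maximum sequence length (context window)
model = GPT(model_config)
```

From there the model maps a batch of token-index sequences to next-token logits, as described above.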
Quick Start & Requirements
Clone and install with git clone https://github.com/karpathy/minGPT.git followed by pip install -e .
The two key model hyperparameters to set are vocab_size (tokenizer vocabulary size) and block_size (maximum context length). The demo.ipynb and generate.ipynb notebooks provide worked examples.
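A hedged sketch of the training flow, following the Trainer interface from the upstream README; model is the GPT instance from the sketch above, and train_dataset is a placeholder for any torch Dataset yielding (input, target) token-index tensors:

```python
from mingpt.trainer import Trainer

# train_dataset: placeholder for a torch.utils.data.Dataset that yields
# (x, y) pairs of token-index tensors of length block_size.
train_config = Trainer.get_default_config()
train_config.learning_rate = 5e-4
train_config.max_iters = 2000
train_config.batch_size = 32
trainer = Trainer(train_config, model, train_dataset)
trainer.run()
```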
Highlighted Details
Example projects include addition (projects/adder) and character-level language modeling (projects/chargpt); see the sketch below for how character-level data can be framed for the model.
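For orientation, here is a minimal, illustrative character-level dataset in the spirit of projects/chargpt; it is not the project's code, just a sketch of turning raw text into (input, target) index sequences:

```python
import torch
from torch.utils.data import Dataset

class CharDataset(Dataset):
    """Maps each character to an integer and serves shifted (x, y) windows."""
    def __init__(self, text, block_size):
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.block_size = block_size
        self.data = [self.stoi[c] for c in text]

    def __len__(self):
        return len(self.data) - self.block_size

    def __getitem__(self, idx):
        chunk = self.data[idx: idx + self.block_size + 1]
        x = torch.tensor(chunk[:-1], dtype=torch.long)  # input indices
        y = torch.tensor(chunk[1:], dtype=torch.long)   # next-token targets
        return x, y
```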
Maintenance & Community
The project is in a semi-archived state as of Jan 2023, with the author recommending nanoGPT
for more recent developments. Contributions are still accepted, but major changes are unlikely.
Licensing & Compatibility
MIT License. Permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
The project is noted as being in a semi-archived state, with the author recommending nanoGPT
for more active development and features. Unit test coverage is not comprehensive. The README mentions a lack of a requirements.txt
file.