picoGPT by jaymody

Minimal GPT-2 implementation in NumPy for demonstration

created 2 years ago
3,395 stars

Top 14.6% on sourcepulse

Project Summary

picoGPT is an extremely minimal implementation of the GPT-2 architecture, written entirely in NumPy. It is designed for educational use: the highly condensed codebase lets readers work through the core mechanics of a large language model line by line. The project is not intended for training or for performance-critical applications.

How It Works

The project uses plain NumPy for all computation, including the entire forward pass of the GPT-2 model. This prioritizes brevity and readability over speed, exposing the fundamental operations of a transformer-based language model in an accessible way. The codebase is split into modules for tokenization, model loading, and the core GPT-2 implementation, with gpt2_pico.py offering an even more condensed version.
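
To make that concrete, here is an illustrative sketch of the NumPy primitives such a forward pass is built from. It is a simplified rendition, not the repo's verbatim code, and the function names and shapes are assumptions:

    import numpy as np

    def gelu(x):
        # GPT-2's tanh approximation of the GELU nonlinearity
        return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

    def softmax(x):
        # subtract the row max for numerical stability
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def layer_norm(x, g, b, eps=1e-5):
        # normalize each token's activations, then scale by g and shift by b
        mean = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return g * (x - mean) / np.sqrt(var + eps) + b

    def attention(q, k, v, mask):
        # scaled dot-product attention; the additive causal mask hides future tokens
        return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v

    # causal mask for a length-n sequence: 0 on/below the diagonal, -1e10 above
    n_seq = 4
    causal_mask = (1 - np.tri(n_seq)) * -1e10

Stacked into multi-head attention and feed-forward blocks, primitives like these are essentially the whole forward pass.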

Quick Start & Requirements

  • Primary install / run command: pip install -r requirements.txt followed by python gpt2.py "Your prompt"
  • Prerequisites: Python 3.9.10 or higher.
  • Usage: The project supports specifying the number of tokens to generate, the model size (124M, 355M, 774M, or 1558M), and a directory for model weights; see the example below.
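
A full invocation might look like the following. The flag names here are assumptions inferred from the project's command-line interface and should be checked against gpt2.py:

    pip install -r requirements.txt
    python gpt2.py "Your prompt" \
        --n_tokens_to_generate 40 \
        --model_size "124M" \
        --models_dir "models"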

Highlighted Details

  • Entire forward pass implemented in approximately 40 lines of NumPy code.
  • Supports greedy sampling for text generation (sketched after this list).
  • Includes OpenAI's BPE tokenizer implementation.
  • Allows loading of various GPT-2 model sizes.
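
As a sketch of what greedy sampling means in practice (illustrative code: gpt2_forward and encoder are hypothetical stand-ins for the repo's model forward pass and BPE tokenizer):

    import numpy as np

    def generate(input_ids, n_tokens_to_generate, gpt2_forward):
        # greedy decoding: at each step, append the single most likely next token
        for _ in range(n_tokens_to_generate):
            logits = gpt2_forward(input_ids)      # shape [n_seq, n_vocab]
            next_id = int(np.argmax(logits[-1]))  # argmax over the last position
            input_ids.append(next_id)
        return input_ids[-n_tokens_to_generate:]  # return only the new tokens

    # hypothetical usage:
    # ids = encoder.encode("Your prompt")   # BPE: text -> token ids
    # out = generate(ids, 40, gpt2_forward)
    # print(encoder.decode(out))            # token ids -> text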

Maintenance & Community

The project appears to be a personal educational endeavor with no explicit mention of active maintenance, community channels, or a roadmap.

Licensing & Compatibility

The repository does not explicitly state a license. Given its nature and origin, it may inherit licensing terms from OpenAI's GPT-2 repository, or it may be intended for educational, non-commercial use only. Compatibility with commercial or closed-source projects is unspecified and should be assumed unsupported in the absence of explicit licensing.

Limitations & Caveats

picoGPT is intentionally slow, lacks training capabilities, and does not support advanced sampling methods such as top-p or top-k. It handles a single prompt at a time, with no batched inference, and is not suitable for performance-sensitive tasks.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 48 stars in the last 90 days

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n) and Georgios Konstantopoulos (CTO and General Partner at Paradigm).

Explore Similar Projects

mlx-gpt2 by pranavjad

Minimal GPT-2 implementation for educational purposes

  • Top 0.5% on sourcepulse
  • 393 stars
  • created 1 year ago; updated 1 year ago