gpt-go by zakirullin

Tiny GPT from scratch in pure Go

Created 9 months ago

558 stars

Top 57.3% on SourcePulse

Project Summary

This repository provides a minimalist implementation of the GPT architecture in pure Go, designed for educational purposes. It is aimed at developers and researchers who want to understand the core mechanics of transformer models without the complexity of large-scale frameworks. For demonstration, the model is trained on Jules Verne novels.

How It Works

The implementation eschews batching for simplicity, using 2D matrices instead of 3D tensors to facilitate intuition. It builds the model incrementally, allowing users to explore stages from basic neurons to self-attention mechanisms via git tags. The custom matrix multiplication implementation prioritizes readability over maximum performance, contributing to the project's educational focus.
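To make the "readability over performance" trade-off concrete, here is a minimal sketch of a textbook triple-loop matrix multiplication on plain 2D slices. The `Matrix` type and `MatMul` function are hypothetical names chosen for illustration; the repository's own code may be organized differently.

```go
package main

import "fmt"

// Matrix is a plain 2D slice (rows x cols).
// Hypothetical type for illustration, not taken from the repository.
type Matrix [][]float64

// MatMul returns the product a*b using the textbook triple loop.
// It panics if the inner dimensions do not match.
func MatMul(a, b Matrix) Matrix {
	if len(a) == 0 || len(b) == 0 || len(a[0]) != len(b) {
		panic("matmul: incompatible shapes")
	}
	rows, inner, cols := len(a), len(b), len(b[0])
	out := make(Matrix, rows)
	for i := 0; i < rows; i++ {
		out[i] = make([]float64, cols)
		for j := 0; j < cols; j++ {
			// Dot product of row i of a with column j of b.
			var sum float64
			for k := 0; k < inner; k++ {
				sum += a[i][k] * b[k][j]
			}
			out[i][j] = sum
		}
	}
	return out
}

func main() {
	a := Matrix{{1, 2}, {3, 4}}
	b := Matrix{{5, 6}, {7, 8}}
	fmt.Println(MatMul(a, b)) // prints [[19 22] [43 50]]
}
```

A loop like this is easy to trace by hand, which suits a teaching project, but it makes no attempt at cache blocking or parallelism.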

Quick Start & Requirements

Primary install / run command: go run .
Prerequisites: Go toolchain. Training requires approximately 40 minutes on a MacBook Air M3.
Documentation: Git tags (naive, bigram, multihead, block, residual, full) mark successive stages of the model's evolution.

Highlighted Details

Pure Go implementation for radical simplicity.
No batching, using 2D matrices for easier understanding.
Incremental development via git tags, mirroring a learning progression.
Custom, readable matrix multiplication code.

Maintenance & Community

The project is a personal endeavor. It credits Andrej Karpathy's "Neural Networks: Zero to Hero" course and @itsubaki's autograd package. No community channels or roadmap are explicitly mentioned.

Licensing & Compatibility

The repository does not explicitly state a license.

Limitations & Caveats

This implementation is not optimized for performance or large-scale deployment, prioritizing educational clarity over efficiency. The lack of explicit licensing may pose compatibility concerns for commercial or closed-source projects.

