gpt2-webgl by nathan-barry

Browser-based demo of GPT-2 inference

Created 4 months ago
335 stars

Top 82.0% on SourcePulse

View on GitHub
Project Summary

This project provides a browser-based, WebGL2 implementation of the GPT-2 small (117M) model, enabling inference directly within the user's web browser. It targets developers and researchers interested in on-device AI inference and interactive visualization of transformer models. The key benefit is running a significant portion of the GPT-2 forward pass on the GPU via WebGL2 shaders, with BPE tokenization handled client-side using js-tiktoken.
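
For example, the client-side BPE step amounts to a few calls into js-tiktoken. A minimal sketch of typical usage (the repo's exact wiring may differ):

```
import { getEncoding } from "js-tiktoken";

// "gpt2" selects the original GPT-2 BPE vocabulary; js-tiktoken is pure JS,
// so no WASM bundle or server round-trip is needed.
const enc = getEncoding("gpt2");

const ids = enc.encode("Hello, WebGL2!");
console.log(ids);              // token ids to feed the model's embedding lookup
console.log(enc.decode(ids));  // "Hello, WebGL2!" -- round-trips losslessly
```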

How It Works

The core of the implementation leverages WebGL2 shaders to execute the GPT-2 forward pass, including the transformer blocks and attention mechanisms, directly on the GPU. This approach offloads computation from the CPU, potentially offering faster inference and enabling visualizations of internal model states like attention matrices. Tokenization is handled client-side using js-tiktoken, avoiding the need for WASM or server-side processing.
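
The workhorse of a shader-based forward pass is a matrix multiply expressed as a render pass: weights and activations live in float textures, and each fragment computes one element of the output. The GLSL below illustrates that pattern; it is a sketch of the general technique, not the repo's actual shader:

```
// Each fragment computes one element of C = A * B, where A is M x K and
// B is K x N, both stored one float per texel in R32F textures.
// Illustrative GLSL ES 3.00, embedded as a TypeScript string.
const matmulFS = `#version 300 es
precision highp float;
uniform sampler2D A;   // M x K input matrix
uniform sampler2D B;   // K x N input matrix
uniform int K;         // shared inner dimension
out vec4 outColor;

void main() {
  ivec2 pos = ivec2(gl_FragCoord.xy);  // (column, row) of this output element
  float acc = 0.0;
  for (int k = 0; k < K; ++k) {
    acc += texelFetch(A, ivec2(k, pos.y), 0).r *
           texelFetch(B, ivec2(pos.x, k), 0).r;
  }
  outColor = vec4(acc, 0.0, 0.0, 1.0);  // lands in an R32F framebuffer texture
}`;
// Drawing a full-screen quad into an M x N float framebuffer runs this shader
// once per output element -- the whole matmul in a single draw call.
```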

Quick Start & Requirements

  • Install Python dependencies: pip install torch numpy transformers
  • Download weights: python download_weights.py
  • Install JS dependencies: npm install
  • Start dev server: npm run dev
  • Prerequisites: Node.js ≥ 16.x, npm, Python ≥ 3.8, modern browser with WebGL2 support (a capability-check sketch follows this list).
  • Local demo: http://localhost:5173 (served by the dev server, not a hosted docs site).
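
Because WebGL2 is a hard requirement, and GPU float math generally also needs the EXT_color_buffer_float extension, it is worth failing fast before downloading weights. checkWebGL2 below is a hypothetical helper, not part of the repo:

```
// Returns true if the browser can run a WebGL2 float-texture pipeline.
function checkWebGL2(): boolean {
  const gl = document.createElement("canvas").getContext("webgl2");
  if (!gl) {
    console.error("WebGL2 is not supported in this browser.");
    return false;
  }
  if (!gl.getExtension("EXT_color_buffer_float")) {
    // Without this extension, rendering into float textures is unavailable.
    console.warn("EXT_color_buffer_float missing; float render targets may fail.");
  }
  return true;
}
```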

Highlighted Details

  • Full GPT-2 small (117M) forward pass on the GPU via WebGL2 shaders.
  • BPE tokenization using js-tiktoken in the browser.
  • Simple Python script to download pretrained weights from HuggingFace.
  • Includes visualization of transformer blocks and attention matrices (one possible rendering approach is sketched below).
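
One simple way to render such an attention matrix is as a grayscale heatmap on a 2D canvas. The sketch below assumes the seqLen × seqLen weights have already been read back from the GPU (e.g. via gl.readPixels) into a row-major Float32Array; drawAttention is a hypothetical helper, not the repo's code:

```
// Maps attention weights in [0, 1] to grayscale pixels and blits them.
function drawAttention(canvas: HTMLCanvasElement, attn: Float32Array, seqLen: number): void {
  canvas.width = seqLen;
  canvas.height = seqLen;
  const ctx = canvas.getContext("2d")!;
  const img = ctx.createImageData(seqLen, seqLen);
  for (let i = 0; i < seqLen * seqLen; i++) {
    const v = Math.round(Math.min(1, Math.max(0, attn[i])) * 255);
    img.data[i * 4 + 0] = v;    // R
    img.data[i * 4 + 1] = v;    // G
    img.data[i * 4 + 2] = v;    // B
    img.data[i * 4 + 3] = 255;  // fully opaque
  }
  ctx.putImageData(img, 0, 0);
}
```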

Maintenance & Community

No specific information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

  • License: MIT
  • Compatibility: The MIT license permits commercial use and linking with closed-source projects.

Limitations & Caveats

The implementation is specifically for GPT-2 small (117M) and may not be directly applicable to larger models without significant modifications. The README does not detail performance benchmarks or specific browser compatibility nuances beyond requiring WebGL2 support.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

4 stars in the last 30 days

Explore Similar Projects

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 11 more.

ctransformers by marella

Top 0.1% · 2k stars
Python bindings for fast Transformer model inference
Created 2 years ago · Updated 1 year ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Casper Hansen (Author of AutoAWQ), and 1 more.

GPT2 by ConnorJL

Top 0% · 1k stars
GPT2 training implementation, supporting TPUs and GPUs
Created 6 years ago · Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 20 more.

open-r1 by huggingface

Top 0.2% · 25k stars
SDK for reproducing DeepSeek-R1
Created 7 months ago · Updated 1 week ago