Browser-based demo of GPT-2 inference
This project provides a browser-based, WebGL2 implementation of the GPT-2 small (117M) model, enabling inference directly in the user's web browser. It targets developers and researchers interested in on-device AI inference and interactive visualization of transformer models. The key benefit is running a significant portion of the GPT-2 forward pass on the GPU via WebGL2 shaders, with BPE tokenization handled client-side using js-tiktoken.
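For context, here is a hypothetical sketch of the client-side tokenization step using js-tiktoken's GPT-2 encoding; the exact wiring in this repo may differ, but the library API shown is real:

```ts
import { getEncoding } from "js-tiktoken";

// GPT-2 uses the "gpt2" BPE vocabulary; js-tiktoken bundles the ranks
// in pure JavaScript, so no WASM module or server round-trip is needed.
const enc = getEncoding("gpt2");

const ids = enc.encode("Hello, world!");
console.log(ids);             // [15496, 11, 995, 0]
console.log(enc.decode(ids)); // "Hello, world!"
```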
How It Works
The core of the implementation leverages WebGL2 shaders to execute the GPT-2 forward pass, including the transformer blocks and attention mechanisms, directly on the GPU. This approach offloads computation from the CPU, potentially offering faster inference and enabling visualization of internal model states such as attention matrices. Tokenization is handled client-side using js-tiktoken, avoiding the need for WASM or server-side processing.
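To make the shader approach concrete, below is a minimal sketch of the kind of GEMM fragment shader such a pipeline typically relies on: each output pixel computes one element of C = A × B by looping over the shared inner dimension. The texture layout (one float per texel in the red channel) and all names are illustrative assumptions, not this repo's actual shader code.

```ts
// Sketch only: GLSL ES 3.00 fragment shader computing C = A * B, where
// A (M x K) and B (K x N) are stored one float per texel in the red
// channel of two input textures.
const matmulFrag = `#version 300 es
precision highp float;

uniform sampler2D A;   // M x K matrix
uniform sampler2D B;   // K x N matrix
uniform int K;         // shared inner dimension

out vec4 outColor;

void main() {
  // gl_FragCoord maps each output pixel to one element (col, row) of C.
  ivec2 pos = ivec2(gl_FragCoord.xy);
  float acc = 0.0;
  for (int k = 0; k < K; ++k) {
    acc += texelFetch(A, ivec2(k, pos.y), 0).r *
           texelFetch(B, ivec2(pos.x, k), 0).r;
  }
  outColor = vec4(acc, 0.0, 0.0, 1.0);
}`;
```

Drawing a full-viewport quad into an M × N floating-point framebuffer then evaluates every element of C in parallel on the GPU; a full forward pass chains many such passes (matmuls, softmax, layer norm) through intermediate textures.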
Quick Start & Requirements
```bash
# Python side: fetch the pretrained GPT-2 weights
pip install torch numpy transformers
python download_weights.py

# JS side: install dependencies and start the dev server
npm install
npm run dev
```
Highlighted Details
BPE tokenization runs entirely client-side via js-tiktoken in the browser.
Maintenance & Community
No specific information on maintainers, community channels, or roadmap is provided in the README.
Licensing & Compatibility
No license information is provided in the README. Compatibility hinges on browser WebGL2 support (see Limitations & Caveats below).
Limitations & Caveats
The implementation is specifically for GPT-2 small (117M) and may not be directly applicable to larger models without significant modifications. The README does not detail performance benchmarks or specific browser compatibility nuances beyond requiring WebGL2 support.