gpt2-webgl by nathan-barry

Browser-based demo of GPT-2 inference

created 3 months ago
329 stars

Top 84.2% on sourcepulse

View on GitHub
Project Summary

This project provides a WebGL2 implementation of the GPT-2 small (117M) model that runs inference directly in the browser. It targets developers and researchers interested in on-device AI inference and interactive visualization of transformer models. The key benefit is that a significant portion of the GPT-2 forward pass runs on the GPU via WebGL2 shaders, with BPE tokenization handled client-side by js-tiktoken.

How It Works

The implementation uses WebGL2 shaders to execute the GPT-2 forward pass, including the transformer blocks and attention mechanism, directly on the GPU. Offloading this computation from the CPU can speed up inference and exposes internal model state, such as attention matrices, for visualization. Tokenization runs client-side via js-tiktoken, avoiding the need for WASM or server-side processing.
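
As a concrete illustration of the technique, the sketch below multiplies two small matrices with a WebGL2 fragment shader: each output pixel computes one element of C = A × B by reading the inputs from float textures. This is a minimal, self-contained example of the GPGPU pattern only, not the project's actual shader code; all names, the texture layout, and the use of EXT_color_buffer_float are assumptions for illustration.

    // Minimal GPGPU sketch: C = A * B with a WebGL2 fragment shader.
    // Illustrative only; the project's real shaders and layouts differ.
    const gl = document.createElement("canvas").getContext("webgl2");
    if (!gl) throw new Error("WebGL2 required");
    gl.getExtension("EXT_color_buffer_float"); // enables rendering to float textures

    const vs = `#version 300 es
    in vec2 pos;
    void main() { gl_Position = vec4(pos, 0.0, 1.0); }`;

    // One fragment per output element: pixel (x, y) computes C[y][x].
    const fs = `#version 300 es
    precision highp float;
    uniform sampler2D A; // M x K matrix, one value per texel (red channel)
    uniform sampler2D B; // K x N matrix
    uniform int K;
    out vec4 o;
    void main() {
      ivec2 rc = ivec2(gl_FragCoord.xy); // x = column, y = row
      float acc = 0.0;
      for (int k = 0; k < K; k++)
        acc += texelFetch(A, ivec2(k, rc.y), 0).r * texelFetch(B, ivec2(rc.x, k), 0).r;
      o = vec4(acc, 0.0, 0.0, 1.0);
    }`;

    function compile(type: number, src: string): WebGLShader {
      const s = gl.createShader(type)!;
      gl.shaderSource(s, src);
      gl.compileShader(s);
      if (!gl.getShaderParameter(s, gl.COMPILE_STATUS)) throw new Error(gl.getShaderInfoLog(s) ?? "");
      return s;
    }
    const prog = gl.createProgram()!;
    gl.attachShader(prog, compile(gl.VERTEX_SHADER, vs));
    gl.attachShader(prog, compile(gl.FRAGMENT_SHADER, fs));
    gl.linkProgram(prog);
    gl.useProgram(prog);

    // Full-screen quad so the fragment shader runs once per output pixel.
    gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
    gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]), gl.STATIC_DRAW);
    const pos = gl.getAttribLocation(prog, "pos");
    gl.enableVertexAttribArray(pos);
    gl.vertexAttribPointer(pos, 2, gl.FLOAT, false, 0, 0);

    // Upload a row-major matrix as a single-channel float texture.
    function upload(unit: number, data: Float32Array, w: number, h: number): void {
      gl.activeTexture(gl.TEXTURE0 + unit);
      gl.bindTexture(gl.TEXTURE_2D, gl.createTexture());
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.R32F, w, h, 0, gl.RED, gl.FLOAT, data);
    }
    const M = 2, K = 3, N = 2;
    upload(0, new Float32Array([1, 2, 3, 4, 5, 6]), K, M); // A is 2x3
    upload(1, new Float32Array([1, 0, 0, 1, 1, 1]), N, K); // B is 3x2
    gl.uniform1i(gl.getUniformLocation(prog, "A"), 0);
    gl.uniform1i(gl.getUniformLocation(prog, "B"), 1);
    gl.uniform1i(gl.getUniformLocation(prog, "K"), K);

    // Render into an off-screen N x M float texture, then read the result back.
    const out = gl.createTexture();
    gl.activeTexture(gl.TEXTURE2);
    gl.bindTexture(gl.TEXTURE_2D, out);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA32F, N, M, 0, gl.RGBA, gl.FLOAT, null);
    gl.bindFramebuffer(gl.FRAMEBUFFER, gl.createFramebuffer());
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, out, 0);
    gl.viewport(0, 0, N, M);
    gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);

    const c = new Float32Array(N * M * 4);
    gl.readPixels(0, 0, N, M, gl.RGBA, gl.FLOAT, c);
    // c[(row * N + col) * 4] holds C[row][col]; here C = [[4, 5], [10, 11]].

A full transformer layer chains many such passes (matmuls, softmax, layer norm), using the output texture of one pass as an input texture of the next.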

Quick Start & Requirements

  • Install Python dependencies: pip install torch numpy transformers
  • Download weights: python download_weights.py
  • Install JS dependencies: npm install
  • Start dev server: npm run dev
  • Prerequisites: Node.js ≥ 16.x, npm, Python ≥ 3.8, and a modern browser with WebGL2 support (a quick feature check appears after this list).
  • Once the dev server is running, the demo is served locally at http://localhost:5173.
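
Before starting the dev server, browser support can be verified with a check like the one below. This is an illustrative snippet, not code from the repository; the EXT_color_buffer_float probe reflects what GPGPU-style inference typically needs, which is an assumption rather than a documented requirement of this project:

    // Run in the browser console. WebGL2 is the documented requirement;
    // float render targets (EXT_color_buffer_float) are commonly needed
    // for GPGPU work like this, but that part is an assumption here.
    const gl = document.createElement("canvas").getContext("webgl2");
    if (!gl) {
      console.error("WebGL2 is not available in this browser.");
    } else {
      console.log("WebGL2: ok");
      console.log("float render targets:", gl.getExtension("EXT_color_buffer_float") !== null);
    }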

Highlighted Details

  • Full GPT-2 small (117M) forward pass on the GPU via WebGL2 shaders.
  • BPE tokenization in the browser using js-tiktoken (see the sketch after this list).
  • Simple Python script to download pretrained weights from Hugging Face.
  • Visualization of transformer blocks and attention matrices.
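
To make the tokenization detail concrete, here is a minimal js-tiktoken usage sketch; the project's actual integration code may differ:

    import { getEncoding } from "js-tiktoken";

    const enc = getEncoding("gpt2");          // GPT-2's BPE vocabulary
    const ids = enc.encode("Hello, world!");  // token ids fed to the model
    console.log(ids);                         // [15496, 11, 995, 0]
    console.log(enc.decode(ids));             // "Hello, world!" (lossless round trip)

Because the encoder is pure JavaScript, tokenization happens entirely in the page, with no WASM bundle or server round trip.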

Maintenance & Community

No specific information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

  • License: MIT
  • Compatibility: The MIT license permits commercial use and linking with closed-source projects.

Limitations & Caveats

The implementation targets GPT-2 small (117M) specifically and would need significant modification to support larger models. The README does not report performance benchmarks or browser compatibility details beyond the WebGL2 requirement.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 92 stars in the last 90 days

Explore Similar Projects

ctransformers by marella
Python bindings for fast Transformer model inference
2k stars · Top 0.1% on sourcepulse · created 2 years ago · updated 1 year ago
Starred by Tobi Lutke (cofounder of Shopify), Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), and 7 more.