WebGPT by 0hq

WebGPU inference of GPT models in the browser

created 2 years ago
3,715 stars

Top 13.3% on sourcepulse

Project Summary

WebGPT enables running transformer-based language models directly in the browser using WebGPU, offering a portable and accessible platform for AI experimentation. It targets developers and researchers interested in on-device AI inference, providing a vanilla JavaScript implementation for educational purposes and proof-of-concept applications.

How It Works

WebGPT leverages WebGPU's compute shader capabilities to perform GPT inference directly within the browser. It implements transformer models in pure JavaScript, aiming for efficiency and broad compatibility. Key optimizations include GPU-based embeddings, kernel fusion, and buffer reuse, allowing it to handle models up to 500M parameters with reasonable performance.
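
To make the compute-shader approach concrete, here is a minimal sketch of a single matrix multiply dispatched through WebGPU, alongside a CPU reference for checking results. It is not WebGPT's actual kernel: the names `matmulWGSL`, `matmulCPU`, and `matmulGPU` are hypothetical, the shader is deliberately naive (no fusion, no buffer reuse), and `matmulGPU` assumes a `GPUDevice` has already been obtained in a WebGPU-capable browser.

```javascript
// WGSL source for a naive C = A (M x K) * B (K x N), row-major f32.
// Illustrative only; not taken from the WebGPT repository.
const matmulWGSL = `
struct Dims { M: u32, K: u32, N: u32 }

@group(0) @binding(0) var<uniform> dims: Dims;
@group(0) @binding(1) var<storage, read> a: array<f32>;
@group(0) @binding(2) var<storage, read> b: array<f32>;
@group(0) @binding(3) var<storage, read_write> c: array<f32>;

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  let row = id.y;
  let col = id.x;
  if (row >= dims.M || col >= dims.N) { return; }
  var sum = 0.0;
  for (var k = 0u; k < dims.K; k = k + 1u) {
    sum = sum + a[row * dims.K + k] * b[k * dims.N + col];
  }
  c[row * dims.N + col] = sum;
}
`;

// CPU reference with the same memory layout, useful for validating GPU output.
function matmulCPU(A, B, M, K, N) {
  const C = new Float32Array(M * N);
  for (let row = 0; row < M; row++) {
    for (let col = 0; col < N; col++) {
      let sum = 0;
      for (let k = 0; k < K; k++) sum += A[row * K + k] * B[k * N + col];
      C[row * N + col] = sum;
    }
  }
  return C;
}

// Runs the shader on the GPU. A and B are Float32Arrays; browser-only.
async function matmulGPU(device, A, B, M, K, N) {
  const module = device.createShaderModule({ code: matmulWGSL });
  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: { module, entryPoint: "main" },
  });

  // Upload helper: allocate a buffer and copy typed-array data into it.
  const upload = (data, usage) => {
    const buffer = device.createBuffer({
      size: data.byteLength,
      usage: usage | GPUBufferUsage.COPY_DST,
    });
    device.queue.writeBuffer(buffer, 0, data);
    return buffer;
  };
  const dims = upload(new Uint32Array([M, K, N, 0]), GPUBufferUsage.UNIFORM);
  const a = upload(A, GPUBufferUsage.STORAGE);
  const b = upload(B, GPUBufferUsage.STORAGE);
  const outBytes = M * N * 4;
  const c = device.createBuffer({
    size: outBytes,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC,
  });
  const readback = device.createBuffer({
    size: outBytes,
    usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ,
  });

  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [dims, a, b, c].map((buffer, binding) => ({
      binding,
      resource: { buffer },
    })),
  });

  // One workgroup covers an 8 x 8 tile of the output matrix.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(Math.ceil(N / 8), Math.ceil(M / 8));
  pass.end();
  encoder.copyBufferToBuffer(c, 0, readback, 0, outBytes);
  device.queue.submit([encoder.finish()]);

  // Map the staging buffer and copy the result out before unmapping.
  await readback.mapAsync(GPUMapMode.READ);
  const result = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return result;
}
```

A real inference loop would create the pipeline once and keep buffers alive between tokens rather than rebuilding them on every call, which is roughly what the buffer-reuse optimization mentioned above refers to.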

Quick Start & Requirements

  • Install/Run: Clone the repository and open the HTML files in a WebGPU-compatible browser (Chrome Canary or Edge Canary recommended).
  • Prerequisites: A WebGPU-compatible browser (e.g., Chrome Canary v113+) and Git LFS, which is required to download the model files.
  • Demo: KMeans.org
  • Docs: See main.js for model loading and execution details. Model conversion scripts are available in misc/conversion_scripts.
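
Because the demo only runs where WebGPU is exposed, it can help to check for support before loading any model files. The following is a small sketch using the standard `navigator.gpu` entry point; the helper names `hasWebGPU` and `getWebGPUDevice` are hypothetical and do not come from the repository.

```javascript
// True when the given navigator-like object exposes the WebGPU entry point.
// `nav` defaults to the global navigator; it is a parameter so the check
// can also be exercised outside a browser.
function hasWebGPU(nav = globalThis.navigator) {
  return Boolean(nav && nav.gpu);
}

// Resolves to a GPUDevice, or throws an Error with a readable reason.
async function getWebGPUDevice(nav = globalThis.navigator) {
  if (!hasWebGPU(nav)) {
    throw new Error("WebGPU is not available in this browser.");
  }
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) {
    throw new Error("No suitable GPU adapter was found.");
  }
  return adapter.requestDevice();
}
```

In a page script this might be called as `const device = await getWebGPUDevice();`, with the thrown message shown to the user when the check fails.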

Highlighted Details

  • Achieves 3 ms/token with a 5M-parameter model (f32) on an M1 Mac.
  • Supports models up to 500M parameters, with experimental 1.5B parameter support.
  • Implemented in ~1500 lines of vanilla JavaScript.
  • Includes GPT-Shakespeare and GPT-2 117M models.
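
The figures above translate into throughput and memory estimates with simple arithmetic. The sketch below assumes 4 bytes per f32 parameter and counts weights only (activations and intermediate buffers would add to this); the function names are illustrative.

```javascript
// Tokens per second implied by a per-token latency in milliseconds.
function tokensPerSecond(msPerToken) {
  return 1000 / msPerToken;
}

// Bytes needed to hold the weights alone, at 4 bytes per f32 parameter.
function f32WeightBytes(paramCount) {
  return paramCount * 4;
}

console.log(tokensPerSecond(3).toFixed(0)); // "333" tokens/s at 3 ms/token
console.log(f32WeightBytes(500e6) / 1e9);   // 2 (GB of weights for 500M parameters)
console.log(f32WeightBytes(1.5e9) / 1e9);   // 6 (GB of weights for 1.5B parameters)
```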

Maintenance & Community

This appears to be a personal project, with most contributions coming from a single developer. The README does not mention community channels or ongoing maintenance plans beyond its "Roadmap / Fixing Stupid Decisions" section.

Licensing & Compatibility

The README does not explicitly state a license. Because the project uses only vanilla JavaScript and HTML, it should run in any modern browser that supports WebGPU.

Limitations & Caveats

The project is presented as a proof of concept and educational resource, and its roadmap indicates that development and optimization work remains. Larger models (e.g., 1.5B parameters) are noted as unstable. Selection operations such as top-k and softmax are not yet GPU-accelerated.
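
For context on what running selection ops off the GPU involves, here is a minimal CPU-side sketch of top-k filtering followed by softmax sampling over one row of logits. It is an illustration under stated assumptions, not the project's code: the function names, the `temperature` parameter, and the injectable `rng` are all hypothetical.

```javascript
// Numerically stable softmax over an array of logits.
function softmax(logits) {
  let max = -Infinity;
  for (const x of logits) max = Math.max(max, x);
  const exps = Array.from(logits, (x) => Math.exp(x - max));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

// Indices of the k largest logits, highest first.
function topK(logits, k) {
  return Array.from(logits, (value, index) => ({ value, index }))
    .sort((x, y) => y.value - x.value)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Samples a token id from the k most likely candidates.
// `rng` must return a number in [0, 1); it defaults to Math.random.
function sampleTopK(logits, k, temperature = 1.0, rng = Math.random) {
  const indices = topK(logits, k);
  const probs = softmax(indices.map((i) => logits[i] / temperature));
  let r = rng();
  for (let j = 0; j < probs.length; j++) {
    r -= probs[j];
    if (r < 0) return indices[j];
  }
  return indices[indices.length - 1]; // guard against rounding drift
}
```

Doing this on the CPU means the logits for each generated token have to be read back from GPU memory first, which is one reason such ops are natural candidates for later GPU acceleration.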

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 25 stars in the last 90 days
