WebGPT by 0hq

WebGPU inference of GPT models in the browser

Created 2 years ago
3,730 stars

Top 13.0% on SourcePulse

View on GitHub
Project Summary

WebGPT enables running transformer-based language models directly in the browser using WebGPU, offering a portable and accessible platform for AI experimentation. It targets developers and researchers interested in on-device AI inference, providing a vanilla JavaScript implementation for educational purposes and proof-of-concept applications.

How It Works

WebGPT leverages WebGPU's compute shader capabilities to perform GPT inference directly within the browser. It implements transformer models in pure JavaScript, aiming for efficiency and broad compatibility. Key optimizations include GPU-based embeddings, kernel fusion, and buffer reuse, allowing it to handle models up to 500M parameters with reasonable performance.
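The kernel-fusion optimization mentioned above combines steps like a matrix multiply, bias add, and activation into a single GPU pass, so the intermediate result never round-trips through a buffer. As a rough CPU reference for what such a fused kernel computes (illustrative function names, not WebGPT's actual API; the real kernels are WGSL compute shaders in the repository):

```javascript
// CPU reference for a fused "matmul + bias + GELU" kernel. On the GPU this
// runs as one WGSL compute shader, so the intermediate (xW + b) matrix is
// never written to a separate buffer. Names here are illustrative only.

function gelu(x) {
  // tanh approximation of GELU, as used in GPT-2
  return 0.5 * x * (1 + Math.tanh(Math.sqrt(2 / Math.PI) * (x + 0.044715 * x ** 3)));
}

function fusedMatmulBiasGelu(x, W, b, n, k, m) {
  // x: n*k, W: k*m, b: m — all flat row-major arrays
  const out = new Array(n * m);
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < m; j++) {
      let acc = b[j];
      for (let p = 0; p < k; p++) acc += x[i * k + p] * W[p * m + j];
      out[i * m + j] = gelu(acc); // activation applied in the same pass
    }
  }
  return out;
}
```

Fusing the activation into the matmul pass saves one buffer write and one buffer read per element, which is the kind of bandwidth saving that matters most on browser-hosted GPU workloads.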

Quick Start & Requirements

  • Install/Run: Clone the repository and open the HTML files in a WebGPU-compatible browser (Chrome Canary or Edge Canary recommended).
  • Prerequisites: A WebGPU-enabled browser (e.g., Chrome Canary v113+) and Git LFS to download the model files.
  • Demo: KMeans.org
  • Docs: See main.js for model loading and execution details. Model conversion scripts are available in misc/conversion_scripts.

Highlighted Details

  • Achieves 3ms/token with 5M parameters (f32) on an M1 Mac.
  • Supports models up to 500M parameters, with experimental 1.5B parameter support.
  • Implemented in ~1500 lines of vanilla JavaScript.
  • Includes GPT-Shakespeare and GPT-2 117M models.

Maintenance & Community

The project appears to be a personal effort maintained by a single developer. Beyond the README's "Roadmap / Fixing Stupid Decisions" section, there are no community channels, published roadmaps, or stated ongoing maintenance plans.

Licensing & Compatibility

The README does not explicitly state a license. Although the implementation is vanilla JavaScript and HTML, browser compatibility is bounded by WebGPU support rather than by the language itself.

Limitations & Caveats

The project is presented as a proof-of-concept and educational resource, with some roadmap items indicating ongoing development and optimization. Larger models (e.g., 1.5B parameters) are noted as unstable. Certain operations like selection ops (topk, softmax) are not yet GPU-accelerated.
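For context on that last caveat, the selection step that currently runs on the CPU amounts to a softmax over the logits followed by a top-k weighted sample. A minimal sketch of the operation (simplified, assumed logic; not WebGPT's actual code):

```javascript
// CPU-side selection: softmax restricted to the top-k logits, then a
// weighted random pick. Illustrative of the op WebGPT's roadmap wants
// moved to the GPU; not the project's actual implementation.

function softmaxTopK(logits, k) {
  // indices of the k largest logits, descending
  const idx = logits
    .map((v, i) => i)
    .sort((a, b) => logits[b] - logits[a])
    .slice(0, k);
  // numerically stable softmax over just those entries
  const max = logits[idx[0]];
  const exps = idx.map((i) => Math.exp(logits[i] - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return idx.map((i, j) => ({ token: i, prob: exps[j] / sum }));
}

function sampleTopK(logits, k) {
  const dist = softmaxTopK(logits, k);
  let r = Math.random();
  for (const { token, prob } of dist) {
    r -= prob;
    if (r <= 0) return token;
  }
  return dist[dist.length - 1].token; // guard against float rounding
}
```

Sorting-based selection like this is awkward to express as a data-parallel shader, which is part of why it tends to be the last piece of an inference pipeline moved onto the GPU.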

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 16 stars in the last 30 days

Explore Similar Projects

Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

parallelformers by tunib-ai

0%
790
Toolkit for easy model parallelization
Created 4 years ago
Updated 2 years ago
Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 11 more.

ctransformers by marella

0.1%
2k
Python bindings for fast Transformer model inference
Created 2 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), and 3 more.

gpu.cpp by AnswerDotAI

0%
4k
C++ library for portable GPU computation using WebGPU
Created 1 year ago
Updated 2 months ago
Starred by Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI) and Cody Yu (Coauthor of vLLM; MTS at OpenAI).

xDiT by xdit-project

0.7%
2k
Inference engine for parallel Diffusion Transformer (DiT) deployment
Created 1 year ago
Updated 1 day ago
Starred by Omar Sanseviero (DevRel at Google DeepMind), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 11 more.

petals by bigscience-workshop

0.1%
10k
Run LLMs at home, BitTorrent-style
Created 3 years ago
Updated 1 year ago
Starred by Luis Capelo (Cofounder of Lightning AI), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 4 more.

ktransformers by kvcache-ai

0.3%
15k
Framework for LLM inference optimization experimentation
Created 1 year ago
Updated 2 days ago