ratchet by huggingface

Browser ML framework for cross-platform GPU inference

Created 1 year ago
718 stars

Top 47.9% on SourcePulse

Project Summary

Ratchet is a cross-platform machine learning framework designed for web-first deployment, enabling GPU-accelerated inference in browsers and native applications. It targets developers seeking to integrate performant AI into existing production environments, offering a toolkit focused on inference, WebGPU/CPU execution, quantization, lazy computation, and in-place operations.

How It Works

Ratchet uses WebGPU for hardware-accelerated computation and exposes a unified API for browser and native environments. Its design prioritizes efficient inference: quantization is a first-class feature, and computation is lazy, so work is deferred until a result is actually needed.
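To make the idea of lazy computation concrete, here is a minimal sketch in TypeScript. It is not Ratchet's API: the `LazyTensor` class, its methods, and the `evaluations` counter are hypothetical names invented for illustration. Operations only record a graph node, and arithmetic runs when `resolve()` is called.

```typescript
// Hypothetical sketch of lazy evaluation; none of these names come from Ratchet.
type Op =
  | { kind: "const"; data: number[] }
  | { kind: "add"; lhs: LazyTensor; rhs: LazyTensor }
  | { kind: "mul"; lhs: LazyTensor; rhs: LazyTensor };

class LazyTensor {
  // Counts how many element-wise kernels actually ran (for demonstration only).
  static evaluations = 0;

  private readonly op: Op;
  private cached: number[] | null = null;

  constructor(op: Op) {
    this.op = op;
  }

  static from(data: number[]): LazyTensor {
    return new LazyTensor({ kind: "const", data: [...data] });
  }

  // Building an expression records a node; no arithmetic happens here.
  add(other: LazyTensor): LazyTensor {
    return new LazyTensor({ kind: "add", lhs: this, rhs: other });
  }

  mul(other: LazyTensor): LazyTensor {
    return new LazyTensor({ kind: "mul", lhs: this, rhs: other });
  }

  // Walks the recorded graph and computes the result once, then caches it.
  resolve(): number[] {
    if (this.cached !== null) return this.cached;
    const op = this.op;
    let result: number[];
    if (op.kind === "const") {
      result = op.data;
    } else {
      const a = op.lhs.resolve();
      const b = op.rhs.resolve();
      if (a.length !== b.length) throw new Error("shape mismatch");
      const isAdd = op.kind === "add";
      LazyTensor.evaluations += 1;
      result = a.map((x, i) => (isAdd ? x + b[i] : x * b[i]));
    }
    this.cached = result;
    return result;
  }
}

const x = LazyTensor.from([1, 2, 3]);
const y = LazyTensor.from([4, 5, 6]);
const expr = x.add(y).mul(x); // nothing has been computed yet
const values = expr.resolve(); // the add and the mul run here
```

A real engine would typically use the deferred graph to fuse or reorder operations before dispatching GPU work; this sketch only shows the deferral itself.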

Quick Start & Requirements

  • Install/Run: Try the hosted demos on Hugging Face Spaces (Whisper, Phi).
  • Prerequisites: A web browser with WebGPU support. The JavaScript API is demonstrated; a Rust crate and CLI are forthcoming.
  • Resources: Demo sites are available for immediate testing.
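Since the only stated prerequisite is a browser with WebGPU support, a page can check for it before loading a model. The sketch below uses the standard `navigator.gpu.requestAdapter()` call from the WebGPU API. The `hasWebGpu` helper and the `GpuLike` and `NavigatorLike` types are hypothetical names introduced here so the example compiles without extra type packages; they are not part of Ratchet.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
// These interfaces are illustrative assumptions, not Ratchet definitions.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the WebGPU entry point exists and an adapter is returned.
async function hasWebGpu(nav: NavigatorLike | undefined): Promise<boolean> {
  if (!nav || !nav.gpu) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    // Treat a failed adapter request the same as missing support.
    return false;
  }
}

// In a browser, one might call:
//   hasWebGpu(navigator as NavigatorLike).then((ok) => { /* show demo or fallback */ });
```

Passing the navigator object in as a parameter keeps the check easy to exercise outside a browser.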

Highlighted Details

  • Supports Whisper, Phi 2 & 3, and Moondream models, with Gemma 2 2B upcoming.
  • Features asynchronous loading and caching via IndexedDB for web applications.
  • Emphasizes quantization (e.g., Q8) for performance optimization.
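The summary does not say which Q8 layout Ratchet uses, so the following is only a generic sketch of symmetric 8-bit quantization, assuming a single scale for the whole tensor. The names `QuantizedQ8`, `quantizeQ8`, and `dequantizeQ8` are hypothetical. Production formats often store one scale per small block of weights instead.

```typescript
// Generic symmetric int8 quantization; an illustration, not Ratchet's format.
interface QuantizedQ8 {
  values: Int8Array; // one signed byte per weight
  scale: number; // multiply by this to recover approximate floats
}

function quantizeQ8(input: Float32Array): QuantizedQ8 {
  // The largest magnitude maps to +/-127.
  let maxAbs = 0;
  for (let i = 0; i < input.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(input[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const values = new Int8Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const q = Math.round(input[i] / scale);
    values[i] = Math.max(-127, Math.min(127, q)); // clamp to the symmetric range
  }
  return { values, scale };
}

function dequantizeQ8(q: QuantizedQ8): Float32Array {
  const out = new Float32Array(q.values.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = q.values[i] * q.scale;
  }
  return out;
}

const weights = new Float32Array([0, 0.5, -1, 1]);
const packed = quantizeQ8(weights); // 4 bytes of payload instead of 16
const restored = dequantizeQ8(packed); // close to the original, within half a step
```

The storage drops from four bytes to one byte per weight, at the cost of a rounding error of at most about half the scale per element.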

Maintenance & Community

  • The README describes the project as under active development and invites community contributions.
  • Community channels include Discord. A roadmap is available.

Licensing & Compatibility

  • License is not explicitly stated in the README.

Limitations & Caveats

  • The README describes the project as a work in progress, with ongoing work on the engine, model support, and compatibility. The Rust crate and CLI are not yet released.

Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 30 days
