blast by stanford-mast

High-performance serving engine for web browsing AI

Created 1 year ago
775 stars

Top 44.9% on SourcePulse

Project Summary

BLAST is a high-performance serving engine designed for web browsing AI applications, targeting developers who need to integrate AI-powered web interaction into their products. It offers an OpenAI-compatible API, automatic caching, parallelism, and streaming capabilities to reduce costs and improve latency for automated workflows and local usage.

How It Works

BLAST functions as a serving engine that handles AI-driven web browsing tasks. It employs automatic parallelism and prefix caching to optimize performance and reduce operational costs. The system supports streaming of LLM output, enabling real-time user experiences, and is built for efficient concurrency to manage multiple users without excessive resource consumption.
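The prefix-caching idea described above can be illustrated in a few lines. This is a hypothetical sketch of the general technique, not BLAST's actual implementation: work done for a shared task prefix is computed once and reused by later tasks that start the same way.

```python
# Illustration of prefix caching (not BLAST's implementation):
# expensive work on a shared task prefix is computed once and reused.
from functools import lru_cache

@lru_cache(maxsize=128)
def browse_prefix(prefix: str) -> str:
    # Stand-in for expensive browsing/LLM work on the shared prefix.
    return f"state-for:{prefix}"

def run_task(prefix: str, suffix: str) -> str:
    state = browse_prefix(prefix)  # cache hit when the prefix repeats
    return f"{state}|{suffix}"

a = run_task("open example.com", "extract prices")
b = run_task("open example.com", "extract reviews")  # reuses cached prefix work
```

Two tasks sharing the prefix `"open example.com"` trigger the expensive step only once; the second call is served from the cache.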

Quick Start & Requirements

Highlighted Details

  • OpenAI-Compatible API for easy integration.
  • High performance through automatic parallelism and prefix caching.
  • Real-time streaming of LLM output.
  • Built-in support for concurrent users with efficient resource management.
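Because the API is OpenAI-compatible, a task can be submitted as an ordinary chat-completions request. The endpoint URL and model name below are assumptions for illustration; check the BLAST README for the actual defaults. The request is constructed but not sent, so the sketch runs without a server.

```python
# Sketch of an OpenAI-compatible chat request to a locally running
# BLAST server. The endpoint and model name are assumptions; consult
# the BLAST docs for the real defaults.
import json
import urllib.request

payload = {
    "model": "blast",  # assumed model identifier
    "messages": [{"role": "user", "content": "Summarize example.com"}],
    "stream": False,   # set True to stream LLM output as it is produced
}

req = urllib.request.Request(
    "http://127.0.0.1:8000/v1/chat/completions",  # assumed local endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request once a BLAST
# server is running locally; it is omitted so the sketch stays offline.
```

Any OpenAI client library pointed at the same base URL should work equally well, which is the practical benefit of the compatible API.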

Maintenance & Community

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source applications.

Limitations & Caveats

As a serving engine, BLAST coordinates web browsing AI tasks rather than providing the underlying LLMs or browsing stack itself, so it must be paired with an existing model provider. The README does not state specific performance benchmarks or detailed resource requirements.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days

Explore Similar Projects

Starred by Matthew Johnson (Coauthor of JAX; Research Scientist at Google Brain), Roy Frostig (Coauthor of JAX; Research Scientist at Google DeepMind), and 3 more.

sglang-jax by sgl-project

1.5%
264
High-performance LLM inference engine for JAX/TPU serving
Created 8 months ago
Updated 1 day ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 12 more.

mistral.rs by EricLBuehler

1.6%
7k
LLM inference engine for blazing fast performance
Created 2 years ago
Updated 1 day ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 3 more.

risingwave by risingwavelabs

0%
8k
Stream processing and serving for AI agents and real-time data applications
Created 4 years ago
Updated 1 day ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 26 more.

datasets by huggingface

0.1%
21k
Access and process large AI datasets efficiently
Created 6 years ago
Updated 1 day ago