blast by stanford-mast

High-performance serving engine for web browsing AI

Created 5 months ago
548 stars

Top 58.3% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

BLAST is a high-performance serving engine designed for web browsing AI applications, targeting developers who need to integrate AI-powered web interaction into their products. It offers an OpenAI-compatible API, automatic caching, parallelism, and streaming capabilities to reduce costs and improve latency for automated workflows and local usage.

How It Works

BLAST functions as a serving engine that handles AI-driven web browsing tasks. It employs automatic parallelism and prefix caching to optimize performance and reduce operational costs. The system supports streaming of LLM output, enabling real-time user experiences, and is built for efficient concurrency to manage multiple users without excessive resource consumption.
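The OpenAI-compatible streaming interface described above can be sketched with only the Python standard library. This is a minimal illustration, not BLAST's documented client flow: the host, port, endpoint path, and model name below are assumptions, and the server is expected to emit OpenAI-style server-sent events.

```python
import json
import urllib.request

# Assumed local BLAST endpoint; the port and path are illustrative placeholders.
BASE_URL = "http://127.0.0.1:8000"


def build_chat_request(task: str, stream: bool = True) -> dict:
    """Build an OpenAI-style chat-completions payload for a browsing task."""
    return {
        "model": "not-needed",  # placeholder: the serving engine picks the model
        "messages": [{"role": "user", "content": task}],
        "stream": stream,  # request token-by-token streaming of LLM output
    }


def run(task: str) -> None:
    """POST a task and print streamed deltas as they arrive (SSE framing)."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(task)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            line = raw.decode().strip()
            # Each SSE line looks like `data: {...}`; `data: [DONE]` ends the stream.
            if line.startswith("data: ") and line != "data: [DONE]":
                chunk = json.loads(line[len("data: "):])
                print(chunk["choices"][0]["delta"].get("content", ""), end="")


if __name__ == "__main__":
    run("Find the top story on Hacker News")
```

Because the payload follows the OpenAI chat-completions shape, any OpenAI SDK pointed at the local base URL should work the same way.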

Quick Start & Requirements

Highlighted Details

  • OpenAI-Compatible API for easy integration.
  • High performance through automatic parallelism and prefix caching.
  • Real-time streaming of LLM output.
  • Built-in support for concurrent users with efficient resource management.

Maintenance & Community

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source applications.

Limitations & Caveats

BLAST is a serving engine for web browsing AI: it orchestrates and serves AI-driven browsing tasks rather than supplying the LLM or browsing capabilities itself, so it must be paired with an existing model provider. Specific performance benchmarks and resource requirements are not documented in the README.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 11 more.

mistral.rs by EricLBuehler

0.3%
6k
LLM inference engine for blazing fast performance
Created 1 year ago
Updated 1 day ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 26 more.

datasets by huggingface

0.1%
21k
Access and process large AI datasets efficiently
Created 5 years ago
Updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Clement Delangue (Cofounder of Hugging Face), and 58 more.

vllm by vllm-project

1.1%
58k
LLM serving engine for high-throughput, memory-efficient inference
Created 2 years ago
Updated 14 hours ago