blast by stanford-mast

High-performance serving engine for web browsing AI

created 4 months ago
534 stars

Top 60.1% on sourcepulse

Project Summary

BLAST is a high-performance serving engine designed for web browsing AI applications, targeting developers who need to integrate AI-powered web interaction into their products. It offers an OpenAI-compatible API, automatic caching, parallelism, and streaming capabilities to reduce costs and improve latency for automated workflows and local usage.
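Because the API is OpenAI-compatible, talking to BLAST means sending a standard chat-completions request body. The sketch below only constructs that payload; the host, port, endpoint path, and model name are illustrative assumptions, not details taken from this summary or the README.

```python
import json

# Hypothetical local BLAST endpoint; the real host/port depend on how
# you launch the server.
BLAST_URL = "http://127.0.0.1:8000/v1/chat/completions"

# An OpenAI-compatible server accepts the same chat-completions shape
# as the OpenAI API: a model name plus a list of role/content messages.
payload = {
    "model": "web-browsing-agent",  # placeholder model name (assumption)
    "messages": [
        {"role": "user", "content": "Find the cheapest flight from SFO to JFK next week"}
    ],
    "stream": False,
}

body = json.dumps(payload)
```

In practice this also means the official `openai` client libraries can be pointed at such a server by overriding their base URL, so existing integrations need little or no code change.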

How It Works

BLAST functions as a serving engine that handles AI-driven web browsing tasks. It employs automatic parallelism and prefix caching to optimize performance and reduce operational costs. The system supports streaming of LLM output, enabling real-time user experiences, and is built for efficient concurrency to manage multiple users without excessive resource consumption.
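To make the prefix-caching idea concrete, here is a minimal sketch of the general technique: results are keyed by conversation prefixes, and a new request reuses the longest previously seen prefix so only the remaining work is recomputed. This is an illustration of the concept, not BLAST's actual implementation.

```python
# Cache mapping a tuple of (role, content) message pairs -> stored result.
cache = {}

def lookup_longest_prefix(messages):
    """Return (longest cached prefix, its stored result), or ((), None)."""
    key = tuple((m["role"], m["content"]) for m in messages)
    # Scan from the full conversation down to a one-message prefix.
    for end in range(len(key), 0, -1):
        hit = cache.get(key[:end])
        if hit is not None:
            return key[:end], hit
    return (), None

# Usage: a prior task cached work for the first message...
cache[(("user", "open example.com"),)] = "page summary"

# ...so a longer follow-up conversation reuses that shared prefix.
prefix, hit = lookup_longest_prefix([
    {"role": "user", "content": "open example.com"},
    {"role": "user", "content": "now click the first link"},
])
```

An engine using this scheme would resume from the cached point instead of replaying the whole browsing session, which is where the cost and latency savings come from.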

Quick Start & Requirements

Highlighted Details

  • OpenAI-Compatible API for easy integration.
  • High performance through automatic parallelism and prefix caching.
  • Real-time streaming of LLM output.
  • Built-in support for concurrent users with efficient resource management.
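Streaming in an OpenAI-compatible API arrives as Server-Sent Events: one `data: {...}` line per chunk, each carrying a small `delta` of text, terminated by `data: [DONE]`. The sketch below reassembles such a stream; the sample chunk contents are invented for illustration.

```python
import json

# Simulated SSE lines as an OpenAI-compatible server would emit them.
raw_stream = [
    'data: {"choices":[{"delta":{"content":"Found "}}]}',
    'data: {"choices":[{"delta":{"content":"3 results."}}]}',
    "data: [DONE]",
]

def collect(lines):
    """Concatenate the text deltas from a chat-completions event stream."""
    text = []
    for line in lines:
        chunk_json = line.removeprefix("data: ")
        if chunk_json == "[DONE]":  # sentinel marking end of stream
            break
        chunk = json.loads(chunk_json)
        text.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(text)

print(collect(raw_stream))  # → Found 3 results.
```

A client can render each delta as it arrives rather than waiting for the full completion, which is what enables the real-time user experience described above.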

Maintenance & Community

Licensing & Compatibility

  • License: MIT.
  • Compatibility: the permissive MIT license allows commercial use and integration into closed-source applications.

Limitations & Caveats

As a serving engine, BLAST orchestrates existing LLMs and web-browsing capabilities rather than providing them itself, so deployments must supply their own model access. Specific performance benchmarks and resource requirements are not given in the README.

Health Check

  • Last commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 6
  • Issues (30d): 1

Star History

  • 96 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Tobi Lutke (Cofounder of Shopify), and 27 more.

vllm by vllm-project

LLM serving engine for high-throughput, memory-efficient inference

1.0%
54k stars
created 2 years ago
updated 23 hours ago