sgl-project/genai-bench: LLM serving performance benchmarking
Top 99.8% on SourcePulse
Genai-bench provides a unified, accurate, and user-friendly solution for comprehensive token-level performance evaluation of Large Language Model (LLM) serving systems. It targets engineers and researchers needing to deeply understand and optimize LLM deployment performance. The tool delivers detailed insights into metrics such as throughput, latency (TTFT, E2E, TPOT), error rates, and requests per second (RPS) across diverse traffic scenarios and concurrency levels, facilitating informed infrastructure decisions.
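As a reading aid for the metrics named above, the token-level latency terms can be sketched in a few lines of Python. This helper is purely illustrative (the function and field names are not genai-bench's API); it shows how TTFT, E2E latency, and TPOT relate to per-token arrival timestamps:

```python
import statistics

def token_metrics(request_start, token_times):
    """Compute token-level latency metrics for one streamed response.

    request_start: wall-clock time the request was sent (seconds).
    token_times:   wall-clock times at which each output token arrived.
    """
    ttft = token_times[0] - request_start   # Time To First Token
    e2e = token_times[-1] - request_start   # End-to-end latency
    # Time Per Output Token: mean gap between consecutive output tokens
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    tpot = statistics.mean(gaps) if gaps else 0.0
    return {"ttft": ttft, "e2e": e2e, "tpot": tpot}

# Example: request sent at t=0; first token after 0.5 s, then one every 0.1 s
m = token_metrics(0.0, [0.5, 0.6, 0.7, 0.8])
```

With these numbers, TTFT is 0.5 s, E2E is 0.8 s, and TPOT is roughly 0.1 s per token; a benchmark run aggregates such per-request values across traffic scenarios.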
How It Works
The project employs a dual-interface approach: a robust Command Line Interface (CLI) for initiating and managing benchmarks, and an interactive Live UI Dashboard for real-time progress monitoring and metric visualization. It focuses on granular, token-level analysis of LLM serving systems. Post-benchmark, an Experiment Analyzer automatically generates detailed Excel reports containing pricing and raw metrics, alongside flexible plot configurations for visualizing performance trends.
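The concurrency sweep that the CLI drives can be pictured with a minimal harness. Everything here is a hypothetical sketch, not genai-bench code: `run_benchmark` and the stubbed backend stand in for a real client hitting an LLM serving endpoint, and only requests-per-second is derived:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_benchmark(send_request, num_requests, concurrency):
    """Fire num_requests requests at a fixed concurrency and report RPS.

    send_request is any callable performing one request; a stub stands in
    for a real LLM backend call in this sketch.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Consume the iterator so all requests complete before timing stops
        list(pool.map(lambda _: send_request(), range(num_requests)))
    elapsed = time.perf_counter() - start
    return {"requests": num_requests, "concurrency": concurrency,
            "rps": num_requests / elapsed}

# Stub backend: sleep 10 ms to simulate a fast serving endpoint
result = run_benchmark(lambda: time.sleep(0.01),
                       num_requests=20, concurrency=5)
```

Repeating such a run at several concurrency levels, then tabulating and plotting the results, is essentially what the Experiment Analyzer's Excel reports and plot configurations automate.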
Quick Start & Requirements
Install via pip:

    pip install genai-bench

Supported Python versions and dependencies are declared in the project's pyproject.toml.

Highlighted Details
Maintenance & Community
The README does not specify maintainers, community channels (e.g., Discord, Slack), sponsorships, or a public roadmap. Contribution guidelines are provided within the repository.
Licensing & Compatibility
The project is released under the MIT license, which is highly permissive and generally suitable for commercial use and integration into closed-source projects.
Limitations & Caveats
The documentation does not explicitly list limitations, unsupported platforms, alpha status, or caveats about the benchmarking process; its usage examples reference a generic your-backend placeholder without enumerating the supported model backends.