LLMPerf: a validation and benchmarking library for LLM APIs
LLMPerf is a library for evaluating the performance and correctness of Large Language Model (LLM) APIs. It is designed for researchers and engineers who need to benchmark different LLM providers and models under various load conditions. The tool helps quantify inter-token latency, generation throughput, and response accuracy.
How It Works
LLMPerf utilizes Ray for distributed execution, enabling it to simulate concurrent requests to LLM APIs. It offers two primary test types: a load test measuring latency and throughput, and a correctness test verifying response accuracy against specific prompts. Token counting is standardized using LlamaTokenizer for consistent comparisons across different LLM backends.
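For illustration, the two test types correspond to two command-line scripts; the sketch below assumes the upstream entry points token_benchmark_ray.py (load test) and llm_correctness.py (correctness test) and their flag names, which may differ between versions of the repository.

# Load test: drive concurrent requests and record latency/throughput (flags assumed, not exhaustive)
python token_benchmark_ray.py \
  --model "meta-llama/Llama-2-7b-chat-hf" \
  --num-concurrent-requests 10 \
  --max-num-completed-requests 100 \
  --llm-api openai \
  --results-dir result_outputs

# Correctness test: check responses against known-answer prompts
python llm_correctness.py \
  --model "meta-llama/Llama-2-7b-chat-hf" \
  --num-concurrent-requests 10 \
  --max-num-completed-requests 150 \
  --llm-api openai \
  --results-dir result_outputs

Here --num-concurrent-requests is assumed to set how many requests the Ray workers issue in parallel, which is what generates the load.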
Quick Start & Requirements
git clone https://github.com/ray-project/llmperf.git && cd llmperf && pip install -e .
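Before running a benchmark, credentials and the endpoint for the target API are typically supplied through environment variables; the lines below are a sketch for an OpenAI-compatible endpoint, and both the variable names and the URL are assumptions that may not match every provider integration.

# Credentials and endpoint for an OpenAI-compatible API (placeholder values)
export OPENAI_API_KEY="sk-your-key-here"
export OPENAI_API_BASE="https://your-endpoint.example.com/v1"

Other providers (Anthropic, Vertex AI, SageMaker, etc.) use their own credential variables; check the repository's README for the exact names required by each --llm-api option.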
Highlighted Details
Maintenance & Community
The last commit was 7 months ago and the project is currently inactive; the previous iteration of the codebase is preserved in the llmperf-legacy repository.
Licensing & Compatibility
Limitations & Caveats
Performance results are sensitive to backend implementation, network conditions, and time of day, and may not directly correlate with all user workloads. Vertex AI and SageMaker do not return token counts, necessitating tokenization via LlamaTokenizer for these services.