Benchmark for speculative decoding methods (ACL 2024 paper)
Top 89.8% on sourcepulse
Spec-Bench provides a unified evaluation platform and benchmark for speculative decoding methods in large language models. It aims to facilitate fair and systematic comparisons of various open-source speculative decoding approaches across diverse scenarios, benefiting researchers and developers working on LLM inference optimization.
How It Works
Spec-Bench integrates multiple open-source speculative decoding algorithms, including EAGLE, Hydra, Medusa, and others, into a single framework. This allows for standardized testing and performance measurement on the same hardware and within the same environment, ensuring reproducible results and direct comparison of speedups and output quality against vanilla autoregressive decoding.
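To make the comparison concrete, speedup over vanilla decoding is typically reported as a ratio of decoding throughputs measured on the same hardware. The sketch below is illustrative only and does not use the Spec-Bench API; the function name and the example numbers are assumptions.

```python
# Illustrative sketch (not the Spec-Bench API): how a speedup over vanilla
# autoregressive decoding is commonly computed from wall-clock measurements
# taken on the same prompts and hardware.

def speedup(tokens_generated: int, vanilla_seconds: float, method_seconds: float) -> float:
    """Ratio of tokens/sec for a speculative method vs. vanilla decoding."""
    vanilla_tps = tokens_generated / vanilla_seconds  # vanilla throughput
    method_tps = tokens_generated / method_seconds    # speculative throughput
    return method_tps / vanilla_tps

# Hypothetical example: 1024 tokens; vanilla takes 20 s, a speculative
# method takes 8 s on the same run.
print(round(speedup(1024, 20.0, 8.0), 2))  # 2.5
```

Because the token count cancels, the ratio reduces to vanilla time divided by method time, which is why benchmarks can report speedup directly from per-run latencies.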
Quick Start & Requirements
conda create -n specbench python=3.12
conda activate specbench
cd Spec-Bench
pip install -r requirements.txt

The REST method additionally requires building DraftRetriever from source using Rust and maturin, and setting up a datastore.
Maintenance & Community
Last recorded activity was 3 months ago; the repository is currently marked inactive.
Licensing & Compatibility
The README does not explicitly state a license for the Spec-Bench repository itself, which could impact commercial use.
Limitations & Caveats
The REST method requires a Rust toolchain and a manual build process, adding complexity to setup.