Spec-Bench by hemingkx

Benchmark for speculative decoding methods (ACL 2024 paper)

Created 1 year ago
318 stars

Top 84.9% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

Spec-Bench provides a unified evaluation platform and benchmark for speculative decoding methods in large language models. It aims to facilitate fair and systematic comparisons of various open-source speculative decoding approaches across diverse scenarios, benefiting researchers and developers working on LLM inference optimization.

How It Works

Spec-Bench integrates multiple open-source speculative decoding algorithms, including EAGLE, Hydra, Medusa, and others, into a single framework. This allows for standardized testing and performance measurement on the same hardware and within the same environment, ensuring reproducible results and direct comparison of speedups and output quality against vanilla autoregressive decoding.
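
The integrated methods differ in how they draft candidate tokens (a small draft model, extra decoding heads, retrieval, n-gram lookup), but they share the same outer loop, and that loop is what Spec-Bench times against vanilla decoding. A minimal sketch of the loop follows; draft_fn and verify_fn are hypothetical placeholders, not Spec-Bench's actual interfaces.

```python
# Illustrative sketch (not Spec-Bench code) of the draft-and-verify loop
# shared by speculative decoding methods. draft_fn and verify_fn are
# hypothetical placeholders for a method's drafting and verification steps.
from typing import Callable, List


def speculative_generate(
    prompt_ids: List[int],
    draft_fn: Callable[[List[int], int], List[int]],       # proposes k candidate tokens
    verify_fn: Callable[[List[int], List[int]], List[int]],  # returns the accepted prefix
    k: int = 4,
    max_new_tokens: int = 128,
) -> List[int]:
    out = list(prompt_ids)
    generated = 0
    while generated < max_new_tokens:
        candidates = draft_fn(out, k)          # cheap draft: small model, extra heads, n-gram lookup, retrieval, ...
        accepted = verify_fn(out, candidates)  # one target-model pass checks all candidates at once
        if not accepted:                       # guard; real verifiers emit at least one (corrected) token
            break
        out.extend(accepted)
        generated += len(accepted)             # accepting >1 token per target pass is the source of the speedup
    return out
```

For lossless methods, verification guarantees the output matches what the target model alone would produce, so the benchmark can compare speed while holding output quality fixed.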

Quick Start & Requirements

  • Installation: create and activate a Conda environment (conda create -n specbench python=3.12, conda activate specbench), then cd Spec-Bench and pip install -r requirements.txt.
  • Prerequisites: Python 3.12 and Conda. Model weights for specific models (e.g., Vicuna-v1.3, EAGLE-1/2/3, Hydra, Medusa-1, SPACE) need to be downloaded separately; see the loading sketch after this list.
  • Additional Setup: The REST method requires building DraftRetriever from source using Rust and maturin. A datastore setup is also needed for REST.
  • Resources: setup involves environment creation, dependency installation, and downloading large model weights for the methods you intend to evaluate.
  • Links: Paper, Blog, Leaderboard, Roadmap.
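
For the base model weights, one common route (an assumption here, not a step prescribed by the README, which may expect local paths) is to pull the Vicuna-v1.3 checkpoint from the Hugging Face Hub:

```python
# Minimal sketch, assuming the Hugging Face Hub route for the Vicuna-v1.3
# base weights; the Spec-Bench README may point to different checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lmsys/vicuna-7b-v1.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory so a 7B model fits on a single GPU
    device_map="auto",          # requires the accelerate package
)
```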

Highlighted Details

  • Supports evaluation of EAGLE-1/2/3, Hydra, Medusa, Speculative Sampling, Prompt Lookup Decoding, TokenRecycling, REST, Lookahead Decoding, SPACE, and SAM-Decoding.
  • Includes scripts for calculating speedup and comparing generated results against autoregressive decoding; an illustrative speedup calculation follows this list.
  • Extended over time with new methods and features, as reflected in the commit history and roadmap.
  • Built upon existing codebases from Medusa and EAGLE.
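
The speedup comparison boils down to running both decoders on the same prompts and hardware and taking the throughput ratio. A short illustration of that calculation (not the repository's actual evaluation script):

```python
# Illustrative speedup calculation (not Spec-Bench's evaluation script):
# time both decoders on the same prompts and report the tokens/second ratio.
import time
from typing import Callable, List


def tokens_per_second(generate: Callable[[str], List[int]], prompts: List[str]) -> float:
    start = time.perf_counter()
    n_tokens = sum(len(generate(p)) for p in prompts)  # tokens returned by the decoder
    return n_tokens / (time.perf_counter() - start)


def speedup(speculative: Callable[[str], List[int]],
            vanilla: Callable[[str], List[int]],
            prompts: List[str]) -> float:
    # >1.0 means the speculative method decodes faster than vanilla autoregression
    return tokens_per_second(speculative, prompts) / tokens_per_second(vanilla, prompts)
```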

Maintenance & Community

  • The project has been maintained over time, with updates integrating new methods such as EAGLE-3 and SAM-Decoding; see the Health Check below for recent activity.
  • Contributions are welcomed via pull requests and issues.
  • Further details on community and roadmap can be found via provided links.

Licensing & Compatibility

  • The README does not state a license for the repository itself. The codebase builds on Medusa and EAGLE, which have their own licenses, so commercial use or closed-source linking requires verifying the licenses of all integrated components.

Limitations & Caveats

The README does not explicitly state the license for the Spec-Bench repository itself, which could impact commercial use. The REST method requires a Rust toolchain and manual build process, adding complexity to setup.

Health Check

  • Last Commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Maxime Labonne (Head of Post-Training at Liquid AI), and 1 more.

GPTFast by MDK8888

0%
686 stars
HF Transformers accelerator for faster inference
Created 1 year ago
Updated 1 year ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab

1.3%
2k stars
Speculative decoding research paper for faster LLM inference
Created 1 year ago
Updated 2 days ago