This repository provides reference implementations for MLPerf™ Inference benchmarks, enabling standardized performance measurement of machine learning models across various hardware and deployment scenarios. It targets researchers, developers, and hardware vendors seeking to evaluate and optimize ML inference performance.
How It Works
The suite offers a collection of reference codebases for popular ML models, each tailored to specific frameworks (TensorFlow, PyTorch, ONNX, TVM, NCNN) and datasets. It standardizes the measurement process, ensuring fair comparisons by defining input data preprocessing, model execution, and output interpretation for each benchmark, while the shared MLPerf LoadGen library generates queries, drives the system under test, and records latencies under the official test scenarios. This approach allows for consistent evaluation across diverse hardware platforms and software stacks.
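A minimal sketch of the harness pattern, assuming the `mlperf_loadgen` Python bindings from recent LoadGen versions; the `load_samples`, `unload_samples`, and inference stand-ins are hypothetical placeholders for a real model and dataset:

```python
import mlperf_loadgen as lg

# Hypothetical stand-ins for a real dataset's memory management.
def load_samples(indices):
    pass  # stage the given samples into memory

def unload_samples(indices):
    pass  # release the given samples

def issue_queries(query_samples):
    # Run inference for each sample and report completion to LoadGen.
    responses = []
    for s in query_samples:
        # ... run the model on the sample identified by s.index ...
        responses.append(lg.QuerySampleResponse(s.id, 0, 0))
    lg.QuerySamplesComplete(responses)

def flush_queries():
    pass

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline
settings.mode = lg.TestMode.PerformanceOnly

sut = lg.ConstructSUT(issue_queries, flush_queries)
qsl = lg.ConstructQSL(1024, 1024, load_samples, unload_samples)
lg.StartTest(sut, qsl, settings)
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```

Each reference implementation wraps this same pattern around its own model and preprocessing code, so results are comparable regardless of framework or hardware.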
Quick Start & Requirements
- Installation and execution typically involve cloning the repository and following the instructions in each model's directory; a representative workflow is sketched after this list.
- Dependencies vary by model and framework, often including Python, specific ML frameworks (TensorFlow, PyTorch), and potentially CUDA for GPU acceleration.
- Refer to the official MLPerf Inference documentation website for automated commands and detailed setup guides: https://www.mlcommons.org/en/inference-benchmarks/
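For example, running the reference ResNet50 implementation looks roughly like this; the paths and flags are illustrative of the repository layout, and the per-model README remains the authoritative source:

```bash
git clone https://github.com/mlcommons/inference.git
cd inference/vision/classification_and_detection
# Install the per-model dependencies first, then run the reference harness;
# the flags below are illustrative -- see the model README for details.
python python/main.py --profile resnet50-onnxruntime --scenario Offline \
    --dataset-path /path/to/imagenet --model /path/to/resnet50.onnx
```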
Highlighted Details
- Supports a wide range of models including ResNet50, BERT, Llama2/3, Mixtral, Stable Diffusion XL, and 3D-UNet.
- Covers diverse application domains: vision, language, recommendation, medical imaging, text-to-image, and automotive.
- Provides reference implementations across multiple frameworks like TensorFlow, PyTorch, ONNX, TVM, and NCNN.
- Includes historical versions (v0.5 to v5.0) for tracking benchmark evolution and submission reproducibility.
Maintenance & Community
- Managed by MLCommons, a consortium of industry and academic institutions.
- Active development with regular updates for new MLPerf Inference versions and models.
- Community support channels and detailed documentation are available through the MLCommons website.
Licensing & Compatibility
- The repository itself is licensed under Apache 2.0.
- Individual model implementations may have different licenses tied to their respective frameworks or datasets.
- Compatibility for commercial use depends on the licenses of the underlying models, frameworks, and datasets used.
Limitations & Caveats
- Reproducing specific historical benchmark results may require checking out specific Git tags or branches, as indicated in the README (see the sketch after this list).
- Power submissions require the SPEC PTDaemon (PTD) power measurement tool, which is distributed separately and requires special access.
- For some older models, framework support is tentative, indicated by question marks in the README's benchmark tables.
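For instance, pinning a checkout to a particular release round looks like this; the tag name below is hypothetical, so list the real ones with `git tag` first:

```bash
git clone https://github.com/mlcommons/inference.git
cd inference
git tag            # list the available release tags
git checkout v4.0  # hypothetical tag for the v4.0 round
```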