inference by mlcommons

MLPerf Inference benchmark suite

Created 6 years ago · 1,424 stars · Top 29.2% on sourcepulse

Project Summary

This repository provides reference implementations for MLPerf™ Inference benchmarks, enabling standardized performance measurement of machine learning models across various hardware and deployment scenarios. It targets researchers, developers, and hardware vendors seeking to evaluate and optimize ML inference performance.

How It Works

The suite provides a reference implementation for each benchmark model, tied to specific frameworks (TensorFlow, PyTorch, ONNX, TVM, NCNN) and datasets. Each implementation standardizes the measurement process so comparisons stay fair: dataset preprocessing, model execution, and output/accuracy evaluation are defined per benchmark, and the shared LoadGen load generator issues queries according to the scenario under test (SingleStream, MultiStream, Server, or Offline) while recording latency and throughput. This allows consistent evaluation across diverse hardware platforms and software stacks.
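
The sketch below is a minimal, hypothetical harness wired into the mlperf_loadgen Python bindings, shown only to illustrate the standardized measurement loop. Exact signatures and settings fields vary slightly between LoadGen versions, and the sample counts, callbacks, and dummy model output here are placeholders rather than anything taken from a real benchmark.

    import numpy as np
    import mlperf_loadgen as lg

    # Hypothetical stand-ins: a real harness loads the benchmark's
    # preprocessed dataset and runs the actual model.
    TOTAL_SAMPLES = 1024   # samples in the (assumed) dataset
    PERF_SAMPLES = 256     # samples LoadGen may keep resident at once
    _buffers = []          # keep response buffers alive until completion

    def issue_queries(query_samples):
        # LoadGen calls this with a batch of QuerySamples to run inference on.
        responses = []
        for qs in query_samples:
            out = np.zeros(1, dtype=np.float32)   # fake model output
            _buffers.append(out)                  # keep the buffer alive
            responses.append(lg.QuerySampleResponse(qs.id, out.ctypes.data, out.nbytes))
        lg.QuerySamplesComplete(responses)

    def flush_queries():
        pass  # flush any batched-but-unissued work (nothing to do here)

    def load_samples(indices):
        pass  # a real QSL would load these dataset samples into memory

    def unload_samples(indices):
        pass  # ...and release them here

    settings = lg.TestSettings()
    settings.scenario = lg.TestScenario.Offline    # or SingleStream/MultiStream/Server
    settings.mode = lg.TestMode.PerformanceOnly    # or AccuracyOnly
    settings.offline_expected_qps = 100.0          # rough target; real values come from the conf files

    sut = lg.ConstructSUT(issue_queries, flush_queries)
    qsl = lg.ConstructQSL(TOTAL_SAMPLES, PERF_SAMPLES, load_samples, unload_samples)
    lg.StartTest(sut, qsl, settings)
    lg.DestroyQSL(qsl)
    lg.DestroySUT(sut)

In a real harness, issue_queries runs preprocessed samples through the model, and StartTest produces the mlperf_log_* output files used to validate and report results.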

Quick Start & Requirements

  • Installation and execution typically involve cloning the repository and following the instructions within each model's directory; most reference harnesses ultimately configure LoadGen as sketched after this list.
  • Dependencies vary by model and framework, often including Python, specific ML frameworks (TensorFlow, PyTorch), and potentially CUDA for GPU acceleration.
  • Refer to the official MLPerf Inference documentation website for automated commands and detailed setup guides: https://www.mlcommons.org/en/inference-benchmarks/
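
Per-benchmark instructions generally point the harness at a dataset and model, then let LoadGen pull its test parameters from the suite's mlperf.conf plus a submitter-supplied user.conf. A hedged sketch of that configuration step, assuming the mlperf_loadgen Python bindings, is below; the file paths, the "resnet50" benchmark name, and the field choices are illustrative, not exact commands from any one model directory.

    import mlperf_loadgen as lg

    settings = lg.TestSettings()
    # Load the rule-defined parameters (min duration, query counts, ...)
    # for this benchmark and scenario; paths and model name are assumptions.
    settings.FromConfig("mlperf.conf", "resnet50", "Offline")
    settings.FromConfig("user.conf", "resnet50", "Offline")   # submitter overrides

    settings.scenario = lg.TestScenario.Offline
    settings.mode = lg.TestMode.AccuracyOnly   # accuracy run; use PerformanceOnly for timing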

Highlighted Details

  • Supports a wide range of models including ResNet50, BERT, Llama2/3, Mixtral, Stable Diffusion XL, and 3D-UNet.
  • Covers diverse application domains: vision, language, recommendation, medical imaging, text-to-image, and automotive.
  • Provides reference implementations across multiple frameworks like TensorFlow, PyTorch, ONNX, TVM, and NCNN.
  • Includes historical versions (v0.5 to v5.0) for tracking benchmark evolution and submission reproducibility.

Maintenance & Community

  • Managed by MLCommons, a consortium of industry and academic institutions.
  • Active development with regular updates for new MLPerf Inference versions and models.
  • Community support channels and detailed documentation are available through the MLCommons website.

Licensing & Compatibility

  • The repository is licensed under Apache 2.0.
  • Individual model implementations may have different licenses tied to their respective frameworks or datasets.
  • Compatibility for commercial use depends on the licenses of the underlying models, frameworks, and datasets used.

Limitations & Caveats

  • Reproducing specific historical benchmark results may require checking out specific Git tags or branches, as indicated in the README.
  • Power submissions require special access to SPEC PTD.
  • Framework support for some models is listed as tentative (marked with question marks) in the README's benchmark table.

Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 54
  • Issues (30d): 29
  • Star history: 63 stars in the last 90 days

Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.
