FlagPerf by flagos-ai

AI chip benchmark platform for evaluating hardware/software stacks

Created 3 years ago

362 stars

Top 77.9% on SourcePulse

Project Summary

FlagPerf is an open-source platform for benchmarking AI hardware, aimed at AI researchers and hardware vendors. It establishes an industry-practice-oriented metric system for evaluating what AI hardware can actually deliver across different software stack combinations (model + framework + compiler), giving a more comprehensive assessment than execution time alone.

How It Works

FlagPerf employs a multi-dimensional evaluation metric system that includes functional correctness, performance metrics, resource utilization, and ecosystem adaptability. It supports a wide range of AI hardware by integrating with diverse training frameworks (PyTorch, TensorFlow, PaddlePaddle, MindSpore) and inference engines (TensorRT, XTCL, IxRT). The platform is designed for flexibility, allowing for testing across single-card, single-node, and multi-node environments to simulate real-world application scenarios.

Quick Start & Requirements

Installation: Clone the repository and navigate to the relevant sub-directory (base, training, or inference).
Prerequisites: Python, Docker (optional but recommended), hardware drivers, network configuration, SSH trust between servers, and monitoring tools (sysstat, ipmitool). Specific model datasets and checkpoints are required for training and inference benchmarks.
Setup: Configure the host files (configs/host.yaml) and the model-specific configurations. Detailed setup instructions and documentation links are in the README.
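
Before editing any configuration, it can help to confirm that the command-line tools named in the prerequisites are actually installed on each server. The following is a minimal sketch, not part of FlagPerf: the helper `missing_tools` and the tool lists are hypothetical, and the executable names are assumptions (`sar` is the binary shipped by sysstat; the others are the usual names on Linux).

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the prerequisites listed above.
# `sar` is provided by the sysstat package; Docker is optional.
REQUIRED_TOOLS = ("python3", "ssh", "sar", "ipmitool")
OPTIONAL_TOOLS = ("docker",)


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Report missing tools without aborting, so the check is safe to run anywhere.
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        absent = missing_tools(tools)
        status = "all found" if not absent else "missing: " + ", ".join(absent)
        print(f"{label} tools: {status}")
```

This only checks that the executables exist on the local machine. It does not verify hardware drivers, network configuration, or SSH trust between servers, which still need to be checked by hand.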

Highlighted Details

Covers over 30 classic models across CV, NLP, Speech, and Multimodal domains, with 80+ training examples and various inference scenarios.
Actively integrates with major AI hardware vendors and framework developers for broad ecosystem support.
Ensures fair evaluation by restricting vendor optimizations to hardware-execution-related aspects like distributed communication and batch size.
All testing code is open-source and processes are reproducible.

Maintenance & Community

FlagPerf is a collaborative effort involving the Beijing Academy of Artificial Intelligence (BAAI) and numerous AI hardware and framework teams. Recent updates include support for operator evaluation, containerized execution, and pre-training of specific models (LLaMA3, Megatron-Llama). Contact is available via email at flagperf@baai.ac.cn or through GitHub issues.

Licensing & Compatibility

The project is licensed under the Apache 2.0 license, which permits commercial use and linking from closed-source software. Individual model test cases may carry their own licensing terms, so users should consult the documentation for the specific cases they run.

Limitations & Caveats

Currently, FlagPerf focuses on offline batch processing and does not support cluster-level or client-side performance evaluation. The README notes that for transformer decoder models, the parameter-based FLOPs calculation requires the input length to be specified.
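
To illustrate why the input length matters, the sketch below uses a widely used rule of thumb rather than FlagPerf's own formula: each parameter contributes about 2 FLOPs (one multiply, one add) per token in the forward pass, and a training step costs roughly three times the forward pass. The function names are hypothetical, and the estimate ignores attention-score FLOPs, which grow with the square of the sequence length.

```python
def decoder_forward_flops(n_params: int, batch_size: int, seq_len: int) -> int:
    """Approximate forward-pass FLOPs for a transformer decoder.

    Assumes ~2 FLOPs per parameter per token; attention-score cost is ignored.
    """
    tokens = batch_size * seq_len
    return 2 * n_params * tokens


def decoder_training_flops(n_params: int, batch_size: int, seq_len: int) -> int:
    """Approximate FLOPs for one training step (forward + backward ~ 3x forward)."""
    return 3 * decoder_forward_flops(n_params, batch_size, seq_len)


if __name__ == "__main__":
    # Hypothetical 7-billion-parameter model, one sequence of 2048 tokens.
    n_params = 7_000_000_000
    print(decoder_forward_flops(n_params, batch_size=1, seq_len=2048))
    # Doubling the input length doubles the estimate.
    print(decoder_forward_flops(n_params, batch_size=1, seq_len=4096))
```

Because the token count multiplies the parameter count directly, the same model reports very different FLOPs at different input lengths, so a FLOPs figure is only comparable across runs when the input length is fixed and stated.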
