Heterogeneous computing benchmarks for performance and portability
HeCBench is a comprehensive suite of heterogeneous computing benchmarks designed for evaluating performance, portability, and productivity across CUDA, HIP, SYCL/DPC++, and OpenMP target offloading. It targets researchers, developers, and power users working with diverse hardware accelerators. The suite aims to provide a standardized way to measure and compare the efficiency of parallel code across different programming models and architectures.
How It Works
HeCBench organizes benchmarks into categories like Automotive, Bioinformatics, Computer Vision, Cryptography, and Machine Learning. Each benchmark is implemented in multiple parallel programming models, allowing for direct comparison of performance and portability. The project provides both Makefile-based execution for individual benchmarks and Python scripts for automated building, running, and result aggregation, simplifying the benchmarking process.
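To make the comparison workflow concrete, here is a minimal, hypothetical sketch of the kind of aggregation a harness such as autohecbench.py performs: collecting repeated kernel timings per (benchmark, programming model) pair and reducing them to a single number for side-by-side comparison. The data, function name, and structure here are illustrative assumptions, not HeCBench's actual internals.

```python
# Hypothetical sketch: aggregate per-model kernel timings for side-by-side
# comparison across programming models. Data values are made up.
import statistics

# Assumed input: (benchmark, model) -> list of measured kernel times (seconds)
timings = {
    ("backprop", "cuda"): [0.012, 0.011, 0.012],
    ("backprop", "sycl"): [0.014, 0.013, 0.014],
    ("backprop", "omp"):  [0.021, 0.020, 0.022],
}

def summarize(timings):
    """Return {benchmark: {model: median_time}} from raw repeated runs.

    The median is used as a robust central estimate; a real harness might
    also report min, mean, or standard deviation.
    """
    out = {}
    for (bench, model), runs in timings.items():
        out.setdefault(bench, {})[model] = statistics.median(runs)
    return out

summary = summarize(timings)
```

Using the median across repeated runs damps warm-up and scheduling noise, which matters when the goal is comparing programming models rather than absolute peak numbers.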
Quick Start & Requirements
Run make within individual benchmark directories, or use the autohecbench.py script to build, run, and aggregate results automatically.
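A rough sketch of both workflows follows; the benchmark name and script flags are illustrative assumptions, and exact make targets and options may vary by benchmark and HeCBench version.

```shell
# Per-benchmark path: each benchmark has per-model subdirectories under src/
# ("backprop" is a placeholder -- substitute any benchmark/model directory).
cd src/backprop-cuda
make            # build with the default compiler for that model
make run        # execute the benchmark

# Automated path: build, run, and collect timings with the Python harness
# (the output flag shown here is an assumption; check the script's --help).
cd scripts
python3 autohecbench.py backprop -o results.csv
```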
Maintenance & Community
The project is authored and maintained by Zheming Jin. Contributions from Codeplay and Intel are acknowledged, particularly regarding the oneAPI ecosystem. The project utilizes resources from Intel DevCloud, Chameleon testbed, Argonne Leadership Computing Facility, and Oak Ridge National Laboratory.
Licensing & Compatibility
HeCBench is primarily licensed under BSD-3. However, several benchmarks (ace, ans, bitcracker, bm3d, bmf, bspline-vgh, car, ccs, che, contract, diamond, feynman-kac, lebesgue) have GPL-style licenses. This mix of licenses may impose restrictions on commercial use or linking with closed-source projects.
Limitations & Caveats
The benchmarks have not been evaluated on Windows or macOS. Some SYCL programs may require the latest Intel SYCL compiler. Kernel results might not exactly match across different programming models for certain programs, and not all benchmarks include automated host/device result verification. Not all CUDA programs have SYCL, HIP, or OpenMP equivalents, and not all programs have OpenMP target offloading implementations. Some programs may have suboptimal raw performance or take a long time to complete on integrated GPUs.