Heterogeneous computing benchmarks for performance and portability
HeCBench is a comprehensive suite of heterogeneous computing benchmarks designed for evaluating performance, portability, and productivity across CUDA, HIP, SYCL/DPC++, and OpenMP target offloading. It targets researchers, developers, and power users working with diverse hardware accelerators. The suite aims to provide a standardized way to measure and compare the efficiency of parallel code across different programming models and architectures.
How It Works
HeCBench organizes benchmarks into categories like Automotive, Bioinformatics, Computer Vision, Cryptography, and Machine Learning. Each benchmark is implemented in multiple parallel programming models, allowing for direct comparison of performance and portability. The project provides both Makefile-based execution for individual benchmarks and Python scripts for automated building, running, and result aggregation, simplifying the benchmarking process.
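To make the comparison workflow concrete, here is a minimal, hypothetical sketch of the kind of aggregation a harness such as autohecbench.py performs: collecting repeated kernel timings per (benchmark, programming model) pair and reducing them to a single number for side-by-side comparison. The data, function name, and structure here are illustrative assumptions, not HeCBench's actual internals.

```python
# Hypothetical sketch: aggregate per-model kernel timings for side-by-side
# comparison across programming models. Data values are made up.
import statistics

# Assumed input: (benchmark, model) -> list of measured kernel times (seconds)
timings = {
    ("backprop", "cuda"): [0.012, 0.011, 0.012],
    ("backprop", "sycl"): [0.014, 0.013, 0.014],
    ("backprop", "omp"):  [0.021, 0.020, 0.022],
}

def summarize(timings):
    """Return {benchmark: {model: median_time}} from raw repeated runs.

    The median is used as a robust central estimate; a real harness might
    also report min, mean, or standard deviation.
    """
    out = {}
    for (bench, model), runs in timings.items():
        out.setdefault(bench, {})[model] = statistics.median(runs)
    return out

summary = summarize(timings)
```

Using the median across repeated runs damps warm-up and scheduling noise, which matters when the goal is comparing programming models rather than absolute peak numbers.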
Quick Start & Requirements
Run make within individual benchmark directories, or use the autohecbench.py script to build, run, and aggregate results automatically.
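A rough sketch of both workflows follows; the benchmark name and script flags are illustrative assumptions, and exact make targets and options may vary by benchmark and HeCBench version.

```shell
# Per-benchmark path: each benchmark has per-model subdirectories under src/
# ("backprop" is a placeholder -- substitute any benchmark/model directory).
cd src/backprop-cuda
make            # build with the default compiler for that model
make run        # execute the benchmark

# Automated path: build, run, and collect timings with the Python harness
# (the output flag shown here is an assumption; check the script's --help).
cd scripts
python3 autohecbench.py backprop -o results.csv
```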
Maintenance & Community
The project is authored and maintained by Zheming Jin. Contributions from Codeplay and Intel are acknowledged, particularly regarding the oneAPI ecosystem. The project utilizes resources from Intel DevCloud, Chameleon testbed, Argonne Leadership Computing Facility, and Oak Ridge National Laboratory.
Licensing & Compatibility
HeCBench is primarily licensed under BSD-3. However, several benchmarks (ace, ans, bitcracker, bm3d, bmf, bspline-vgh, car, ccs, che, contract, diamond, feynman-kac, lebesgue) have GPL-style licenses. This mix of licenses may impose restrictions on commercial use or linking with closed-source projects.
Limitations & Caveats
The benchmarks have not been evaluated on Windows or macOS. Some SYCL programs may require the latest Intel SYCL compiler. Kernel results might not exactly match across different programming models for certain programs, and not all benchmarks include automated host/device result verification. Not all CUDA programs have SYCL, HIP, or OpenMP equivalents, and not all programs have OpenMP target offloading implementations. Some programs may have suboptimal raw performance or take a long time to complete on integrated GPUs.