CPU tool for benchmarking peak floating-point performance
This tool benchmarks peak floating-point and AI instruction set performance across various CPU architectures. It's designed for hardware engineers and performance analysts needing to evaluate CPU capabilities for AI and scientific computing workloads. The project offers automatic detection and compilation for local SIMD/DSA ISAs, simplifying cross-platform benchmarking.
How It Works
The tool uses C++ and auto-vectorization to generate optimized code for specific SIMD (Single Instruction, Multiple Data) and DSA (domain-specific architecture) instruction sets. It compiles kernels tailored to the detected CPU features, so each benchmark exercises the instruction sets the hardware actually provides, such as AVX, AVX-512, AMX, NEON, and the RISC-V Vector extension, across several data types (FP32, FP64, FP16, BF16, INT8).
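The idea of measuring floating-point throughput with compiler-vectorized code can be sketched as follows. This is a minimal illustration, not the project's own kernel: the function name `measure_fp32_gflops` and the choice of 16 accumulator chains are assumptions made for the example. Whether the compiler actually emits vector FMA instructions depends on the flags used (for example `-O3 -march=native` with GCC or Clang), so the result will usually fall short of a tuned kernel.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: times independent multiply-add chains on FP32 values
// and returns the achieved rate in GFLOP/s for a single thread.
double measure_fp32_gflops(std::size_t iterations) {
    constexpr int kChains = 16;  // independent accumulators hide FMA latency
    float acc[kChains];
    for (int i = 0; i < kChains; ++i) acc[i] = 1.0f + 0.001f * i;

    // Read the constants through volatile so the compiler cannot fold the
    // whole computation away at compile time.
    volatile float a_src = 0.999f;
    volatile float b_src = 0.001f;
    const float a = a_src;
    const float b = b_src;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t it = 0; it < iterations; ++it) {
        // The chains do not depend on each other, so this inner loop is a
        // natural candidate for auto-vectorization.
        for (int i = 0; i < kChains; ++i) acc[i] = acc[i] * a + b;
    }
    const auto stop = std::chrono::steady_clock::now();

    // Consume the results so the timed loop is not removed as dead code.
    float sum = 0.0f;
    for (int i = 0; i < kChains; ++i) sum += acc[i];
    volatile float sink = sum;
    (void)sink;

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double flops = 2.0 * kChains * static_cast<double>(iterations);
    return flops / seconds / 1e9;  // one multiply plus one add per update
}
```

Calling `measure_fp32_gflops(100000000)` from `main` and printing the result gives a rough single-thread figure. The recurrence converges toward 1.0, which keeps the values away from overflow and denormals that could distort the timing.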
Quick Start & Requirements
./build_x64.sh
./build_arm64.sh
./build_riscv64.sh
./build_loongarch64.sh
./build_e2k.sh
./cpufp --thread_pool=[xxx]
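Each build script above appears to correspond to one target architecture. As a hedged illustration of how that choice could be made automatically, the sketch below maps the GCC/Clang predefined target macros to a script name. The pairing of script to architecture is inferred from the file names only, and `build_script_for_target` is a hypothetical helper rather than anything shipped with the project.

```cpp
#include <string>

// Hypothetical helper: picks the build script from the Quick Start list that
// matches the architecture this code is compiled for. Returns an empty string
// when the target is not one of the five listed architectures.
std::string build_script_for_target() {
#if defined(__x86_64__)
    return "./build_x64.sh";
#elif defined(__aarch64__)
    return "./build_arm64.sh";
#elif defined(__riscv) && (__riscv_xlen == 64)
    return "./build_riscv64.sh";
#elif defined(__loongarch_grlen) && (__loongarch_grlen == 64)
    return "./build_loongarch64.sh";
#elif defined(__e2k__)
    return "./build_e2k.sh";
#else
    return "";
#endif
}
```

The check happens at compile time, so the function reports the architecture the binary was built for, which matches the machine only when building natively.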
Highlighted Details
Maintenance & Community
The project is maintained by pigirons. No community links or roadmap are provided in the README.
Licensing & Compatibility
The README does not specify a license.
Limitations & Caveats
The README lists Windows support as "no" for arm64 and does not mention it for the other architectures. Its "Todo list" names armv9 (SVE, SVE2 & SME) as planned support. The e2k ISA section marks the AVX_VNNI_INT8 feature as "Unknown".