pti-gpu by intel

Performance analysis toolkit for Intel GPUs

Created 5 years ago
256 stars

Top 98.5% on SourcePulse

Summary

Intel's Profiling Tools Interfaces for GPU (PTI for GPU) repository collects documentation, tools, and samples for performance analysis on Intel® Processor Graphics. It targets engineers and researchers working with Intel GPUs, giving them the interfaces and samples needed to collect and interpret performance data and use it to optimize their applications.

How It Works

PTI for GPU supports several profiling techniques: runtime API tracing for OpenCL™ and oneAPI Level Zero, device activity tracing, binary/source correlation, and metrics collection via the Level Zero Metric API and performance monitoring registers. The project also covers binary instrumentation and code annotation. Key tools include unitrace for unified tracing of hardware and software events, onetrace for host and device tracing, and oneprof for GPU hardware metrics collection.

Quick Start & Requirements

Building requires CMake (>=3.12), Git (>=1.8), and Python (>=3.6). On Linux, the user must belong to the video or render group to run computations on the GPU. Essential dependencies are the OpenCL™ ICD Loader and Headers, the oneAPI Level Zero loader, the Intel® Graphics Compute Runtime, and the Intel® Metrics Discovery Application Programming Interface. To build a sample, change to that sample's directory, create a build subdirectory, and from inside it run cmake -DCMAKE_BUILD_TYPE=Release .. followed by make. To test all samples, run python <pti_root>/tests/run.py.
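The prerequisites above can be checked before attempting a build. The following is a minimal sketch, not part of the repository: the helper names (parse_version, meets_minimum, tool_version, in_gpu_group) are hypothetical, and it assumes each tool prints its version in response to --version and that the Python interpreter is available as python3.

```python
import grp
import os
import re
import shutil
import subprocess

# Minimum versions quoted in the requirements above.
# Assumption: the Python interpreter is on PATH as "python3".
MINIMUMS = {"cmake": (3, 12), "git": (1, 8), "python3": (3, 6)}


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples lexicographically, e.g. (3, 22, 1) >= (3, 12)."""
    return tuple(found) >= tuple(minimum)


def tool_version(tool):
    """Run `<tool> --version`; return a version tuple, or None if unavailable."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run(
        [tool, "--version"], capture_output=True, text=True, check=False
    )
    try:
        return parse_version(result.stdout or result.stderr)
    except ValueError:
        return None  # tool ran but printed no recognizable version


def in_gpu_group(group_names=("video", "render")):
    """True if the current process belongs to any of the named groups (Linux)."""
    gids = set(os.getgroups()) | {os.getegid()}
    for name in group_names:
        try:
            if grp.getgrnam(name).gr_gid in gids:
                return True
        except KeyError:
            continue  # group does not exist on this system
    return False


if __name__ == "__main__":
    for tool, minimum in MINIMUMS.items():
        found = tool_version(tool)
        ok = found is not None and meets_minimum(found, minimum)
        print(f"{tool}: found={found} minimum={minimum} ok={ok}")
    print("in video/render group:", in_gpu_group())
```

The script only reports what it finds; it does not check the OpenCL™, Level Zero, or Metrics Discovery dependencies, which still have to be verified separately.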

Highlighted Details

  • Broad Profiling Scope: Covers API tracing (OpenCL™, Level Zero), device activity, binary/source correlation, hardware metrics collection, binary instrumentation, and system management.
  • Integrated Tool Suite: Features unitrace for unified tracing, onetrace for host/device tracing, oneprof for GPU HW metrics, and API-specific tracers (ze_tracer, cl_tracer).
  • Broad Intel GPU Support: Supports Intel® Processor Graphics Gen9 and newer, including Iris® Xe and Data Center GPU Flex/Max Series.
  • Sample-Driven Analysis: Provides numerous sample tools for identifying hot functions/kernels, debugging, and collecting detailed performance metrics for different backends (OpenCL™, Level Zero, DPC++, OpenMP*).

Licensing & Compatibility

Samples are distributed under the permissive MIT License. No specific compatibility notes for commercial use or closed-source linking are provided, but the MIT license generally allows broad usage.

Limitations & Caveats

Windows support is currently under development. Known issues on RHEL include potential missing IGA libraries (requiring manual linking) and possible compiler version requirements. Non-root users may need to adjust kernel module parameters for metrics collection.
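The summary does not name the kernel module parameter involved. As an illustration only, the sketch below assumes it is the i915 driver's perf_stream_paranoid setting, exposed at /proc/sys/dev/i915/perf_stream_paranoid, where a value of 0 lets non-root users open performance streams. Both the path and the function name metrics_allowed_for_non_root are assumptions, and systems using a different kernel driver may expose a different setting.

```python
from pathlib import Path

# Assumption: the i915 driver exposes this sysctl; the source does not name it,
# and other kernel drivers may use a different location.
PARANOID_PATH = Path("/proc/sys/dev/i915/perf_stream_paranoid")


def metrics_allowed_for_non_root(path=PARANOID_PATH):
    """Return True if the setting is 0, False otherwise, None if unreadable."""
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None  # file missing, unreadable, or not an integer
    return value == 0


if __name__ == "__main__":
    state = metrics_allowed_for_non_root()
    if state is None:
        print("setting not found; the assumed path may not apply to this system")
    elif state:
        print("non-root metrics collection appears to be permitted")
    else:
        print("non-root metrics collection appears to be restricted")
```

The check is read-only. Changing the value requires root privileges and is left to the administrator.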

Health Check

Last Commit: 2 days ago
Responsiveness: 1 day
Pull Requests (30d): 1
Issues (30d): 1

Star History

1 star in the last 30 days
