florianmattana
Reverse engineering NVIDIA SASS for performance analysis
Summary
SASS King addresses a gap in the understanding of SASS, the native instruction set of NVIDIA GPUs, following significant architectural changes in modern GPUs. It gives kernel engineers and researchers a structured knowledge base and a reverse-engineering methodology for analyzing SASS dumps, identifying compiler patterns, and linking binary structures to source-level optimizations. The goal is more transparent performance analysis on newer NVIDIA architectures.
How It Works
The project employs a systematic methodology combining controlled micro-kernels with production-kernel audits. By isolating compiler decisions through single-parameter variations, it generates detailed SASS evidence. This data populates a knowledge base and a formal pattern library of reusable audit signatures. The approach uses strict claim tagging ([OBS], [INF], [HYP], [RES], [GAP]) for evidence traceability and prioritizes pattern-based audits over purely bit-level disassembly.
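As a rough illustration of how claim tagging supports traceability, the sketch below checks that every line in a set of notes carries one of the five tags named above. The one-claim-per-line format, the function names `parse_claim` and `audit_notes`, and the sample notes are all assumptions made for this example; the project's own file format and tooling may differ.

```python
import re
from collections import Counter

# The five tags are the ones named in the text; the line format is an assumption.
CLAIM_TAGS = ("OBS", "INF", "HYP", "RES", "GAP")
TAG_RE = re.compile(r"^\[([A-Z]+)\]\s+(.*\S)\s*$")


def parse_claim(line):
    """Return (tag, text) for a tagged claim line, or None if the tag is missing or unknown."""
    match = TAG_RE.match(line.strip())
    if match is None or match.group(1) not in CLAIM_TAGS:
        return None
    return match.group(1), match.group(2)


def audit_notes(lines):
    """Count claims per tag and collect non-empty lines that carry no valid tag."""
    counts = Counter()
    untagged = []
    for line in lines:
        if not line.strip():
            continue  # blank lines are not claims
        claim = parse_claim(line)
        if claim is None:
            untagged.append(line.strip())
        else:
            counts[claim[0]] += 1
    return counts, untagged


# Hypothetical notes, written only to exercise the parser.
notes = [
    "[OBS] Variant A emits one more instruction than variant B.",
    "[HYP] The extra instruction comes from the changed parameter.",
    "The compiler always does this.",
    "[GAP] Control-code bits for this instruction are not decoded.",
]
counts, untagged = audit_notes(notes)
print(dict(counts))
print(untagged)
```

Here the third note is reported as untagged, which is the kind of unsupported statement a strict tagging rule is meant to surface before it enters a knowledge base.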
Quick Start & Requirements
Installation steps (e.g., pip, Docker) are not specified. Users engage with the corpus, knowledge base, and pattern library.
Tooling: the CUDA toolkit (nvcc, cuobjdump), Nsight Compute, and external tools like gpuasm.com and redplait/denvdis. Initial focus is SM120 / SM120a (consumer Blackwell).
Entry points: docs/START_HERE.md and patterns/README.md.
Highlighted Details
Maintenance & Community
Authored by Florian Mattana. Contributions are welcomed via CONTRIBUTING.md. No specific community channels or sponsorship details are mentioned.
Licensing & Compatibility
The license type is not explicitly stated in the provided README text. Compatibility is focused on NVIDIA GPU architectures.
Limitations & Caveats
SASS ISA coverage is incomplete; runtime layout decoding, full control-code bit placement, and cross-architecture replay are identified as future work. The project is under active development (Phase 4: Production Audits is the next major step), so it is not yet a mature, stable toolset for all use cases.