ROCmLibs-for-gfx1103-AMD780M-APU  by likelovewant

ROCm library for boosting AMD GPU performance on Windows

Created 1 year ago
608 stars

Top 54.0% on SourcePulse

Project Summary

This repository provides optimized ROCm libraries for various AMD GPU architectures, primarily targeting Windows users and enabling AI workloads (like Llama, Stable Diffusion) via the ZLUDA CUDA wrapper. It fills gaps in official ROCm support for certain AMD GPUs and claims a 2-3x performance improvement over DirectML.

How It Works

The project compiles and distributes custom-built ROCm libraries, derived from official Linux versions with added optimizations. These libraries are designed to be drop-in replacements for existing ROCm components, enhancing performance for specific GPU architectures. The approach leverages community-driven builds and environment variable overrides (like HSA_OVERRIDE_GFX_VERSION) to enable compatibility on unsupported hardware.
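
As a rough illustration of the environment-variable override mentioned above, the sketch below builds a process environment with HSA_OVERRIDE_GFX_VERSION set. The helper names (gfx_to_override, env_with_override) are hypothetical, and the mapping from a target name such as gfx1030 to a dotted version such as 10.3.0 (major, then one hex digit each for minor and stepping) is an assumption about the usual convention, not something the summary states. Which target to present for a given GPU is hardware-specific and should be taken from the project's wiki.

```python
import os


def gfx_to_override(target: str) -> str:
    """Convert a target name like 'gfx1030' into a dotted string like '10.3.0'.

    Assumed convention: the last two characters are the minor and stepping
    (one hex digit each); everything before them is the decimal major.
    """
    name = target.lower()
    if name.startswith("gfx"):
        name = name[3:]
    if len(name) < 3:
        raise ValueError(f"unrecognized target name: {target!r}")
    major, minor, stepping = name[:-2], name[-2], name[-1]
    return f"{int(major)}.{int(minor, 16)}.{int(stepping, 16)}"


def env_with_override(target: str, base=None) -> dict:
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set."""
    env = dict(os.environ if base is None else base)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_to_override(target)
    return env


if __name__ == "__main__":
    # 'gfx1100' is chosen purely for illustration.
    env = env_with_override("gfx1100")
    print(env["HSA_OVERRIDE_GFX_VERSION"])
```

The returned dictionary can be passed as the env argument of subprocess.run when launching an application, so the override applies to that process only rather than to the whole system.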

Quick Start & Requirements

  • Install: Download the .zip or .7z archive from the releases page that matches your HIP SDK version. Back up the existing rocblas.dll and the rocblas directory in your HIP SDK's bin folder. Then extract the archive, placing rocblas.dll into the bin directory and the library files into the corresponding bin\rocblas directory of your HIP SDK installation.
  • Prerequisites: HIP SDK for Windows or ROCm for Linux. 7-Zip or WinRAR for extraction.
  • Resources: Specific .7z files are provided for HIP SDK versions 5.7.1, 6.1.2, and 6.2.4.
  • Links: Wiki for detailed instructions: https://github.com/likelovewant/ROCmLibs-for-gfx1103-AMD780M-APU/wiki
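
The backup-and-replace steps in the install instructions can be sketched as a small script. This is a minimal sketch under stated assumptions, not the project's own installer: the function names are hypothetical, and it assumes the extracted archive contains a rocblas.dll file and a library folder at its top level, with the folder ending up at bin\rocblas\library. Check the actual archive layout and the wiki before relying on it.

```python
import shutil
from pathlib import Path


def backup_rocblas(hip_bin: Path, backup_dir: Path) -> None:
    """Copy the existing rocblas.dll and rocblas directory into backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dll = hip_bin / "rocblas.dll"
    if dll.is_file():
        shutil.copy2(dll, backup_dir / "rocblas.dll")
    lib_dir = hip_bin / "rocblas"
    if lib_dir.is_dir():
        shutil.copytree(lib_dir, backup_dir / "rocblas", dirs_exist_ok=True)


def install_rocblas(extracted: Path, hip_bin: Path) -> None:
    """Replace rocblas.dll and the library folder with the extracted copies.

    Assumption: `extracted` holds `rocblas.dll` and a `library` folder.
    """
    shutil.copy2(extracted / "rocblas.dll", hip_bin / "rocblas.dll")
    target = hip_bin / "rocblas" / "library"
    if target.exists():
        # Remove the old library files so stale entries are not mixed in.
        shutil.rmtree(target)
    shutil.copytree(extracted / "library", target)
```

A typical call would pass the SDK's bin folder as hip_bin, for example Path(os.environ["HIP_PATH"]) / "bin" if the HIP SDK installer has set a HIP_PATH variable on your machine (an assumption worth verifying), and would run backup_rocblas before install_rocblas so the original files can be restored.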

Highlighted Details

  • Claims 2-3x performance gains over DirectML in applications like Ollama, llama.cpp, and Stable Diffusion UIs.
  • Supports a wide range of AMD GPU architectures, including gfx803, gfx90x, gfx10xx, gfx1103, and gfx1150 (experimental).
  • Provides custom logic files for building rocBLAS, detailed in the wiki.
  • Offers specific builds for various HIP SDK versions on Windows.

Maintenance & Community

  • The project actively provides updated libraries for newer HIP SDK versions.
  • Users are directed to create issues only for repository-specific topics, not for applications like Ollama.

Licensing & Compatibility

  • The README mentions complying with licenses where required but does not state a license for the repository's own content. The libraries are based on official ROCm source code, which carries its own licensing. Suitability for commercial use is not addressed.

Limitations & Caveats

  • The project recommends against using official ROCm for Linux directly, suggesting alternative methods.
  • gfx1150 support is marked as experimental.
  • Users must ensure the downloaded library version matches their installed HIP SDK version.
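
The version-matching requirement can be checked before downloading anything. The sketch below is illustrative only: the version list is copied from the HIP SDK versions named in this summary, the function name is hypothetical, and exact string matching is an assumption (the project may accept other patch releases).

```python
# HIP SDK versions for which this summary says archives are provided.
PROVIDED_SDK_VERSIONS = ("5.7.1", "6.1.2", "6.2.4")


def require_matching_build(installed_sdk: str) -> str:
    """Return the SDK version to download for, or raise if none matches."""
    version = installed_sdk.strip()
    if version not in PROVIDED_SDK_VERSIONS:
        raise ValueError(
            f"no listed build for HIP SDK {version}; "
            f"listed versions: {', '.join(PROVIDED_SDK_VERSIONS)}"
        )
    return version
```

Failing loudly on a mismatch is a deliberate choice here, since mixing a library build with a different SDK version is exactly what the caveat warns against.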

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 12
  • Star history: 34 stars in the last 30 days
