hipfire  by Kaden-Schutt

RDNA-native LLM inference engine for AMD GPUs

Created 2 months ago
399 stars

Top 72.1% on SourcePulse

GitHubView on GitHub
Project Summary

Summary

hipfire provides an RDNA-native LLM inference engine in Rust, targeting AMD GPUs, especially consumer hardware often overlooked by official ROCm support. It offers a unified, high-performance solution for developers and researchers seeking accelerated inference across the RDNA family without Python or complex ROCm stacks.

How It Works

This project leverages Rust and HIP for RDNA-specific LLM inference, aiming for a single binary that ships pre-compiled kernels and JIT-compiles others. By avoiding Python, PyTorch, and the ROCm userspace stack at runtime, hipfire simplifies dependencies and optimizes performance for the entire RDNA GPU spectrum (RDNA1-RDNA4, consumer, pro, APU).

Quick Start & Requirements

For Linux with ROCm 6+, install via: curl -L https://raw.githubusercontent.com/Kaden-Schutt/hipfire/master/scripts/install.sh | bash. Windows/source builds and verification details are in docs/GETTING_STARTED.md.

Highlighted Details

  • Performance: Benchmarks on a 7900 XTX show significant decode/prefill speedups over ollama (e.g., 1.71x decode for Qwen 3.5 9B).
  • DFlash: Implements DFlash speculative decode for further gains (up to 4.45x speedup on HumanEval/53 for 27B model), with genre-conditional performance detailed per-architecture.
  • API: Offers an OpenAI-compatible API via hipfire serve.
  • RDNA Support: Explicitly targets RDNA1-RDNA4 GPUs (consumer, pro, APU).

Maintenance & Community

The project is at v0.1.8-alpha.2, indicating early development. A CHANGELOG.md is available. Correctness is emphasized via scripts like ./scripts/coherence-gate-dflash.sh and detailed benchmarking methodology. No community channels or sponsorships are listed.

Licensing & Compatibility

Distributed under the MIT license, which is permissive for commercial use and closed-source integration.

Limitations & Caveats

As an alpha release, hipfire may have bugs or incomplete features. It is exclusively focused on AMD RDNA GPUs and requires ROCm 6+ on Linux.

Health Check
Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)
181
Issues (30d)
75
Star History
205 stars in the last 30 days

Explore Similar Projects

Starred by Jason Knight Jason Knight(Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero Omar Sanseviero(DevRel at Google DeepMind), and
12 more.

mistral.rs by EricLBuehler

0.3%
7k
LLM inference engine for blazing fast performance
Created 2 years ago
Updated 3 days ago
Feedback? Help us improve.