amd-strix-halo-toolboxes by kyuz0

LLM inference toolboxes for AMD Ryzen AI Max

Created 5 months ago
747 stars

Top 46.6% on SourcePulse

Project Summary

Summary

This project provides pre-built containerized environments ("toolboxes") for running Large Language Models (LLMs) on AMD Ryzen AI Max “Strix Halo” integrated GPUs. It targets engineers and power users seeking a reproducible, flexible way to leverage AMD hardware for LLM inference using Llama.cpp across various compute backends.

How It Works

The project packages Llama.cpp inside Toolbx containers for isolated LLM inference. It supports multiple AMD backends: Vulkan (via the RADV or AMDVLK drivers) and ROCm. This lets users choose among stability, performance, and newer ROCm features while keeping the host system clean. The containers are updated automatically to track upstream Llama.cpp changes.
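As a rough sketch of what inference inside one of these toolboxes looks like, the snippet below invokes Llama.cpp's CLI with GPU offload. The model path is a hypothetical placeholder, and `-ngl 99` offloads all layers to the integrated GPU through whichever backend (Vulkan or ROCm) the container was built with:

```shell
# Illustrative only: run Llama.cpp's CLI from inside an entered toolbox.
MODEL="$HOME/models/model.gguf"   # hypothetical model file path

if command -v llama-cli >/dev/null 2>&1 && [ -f "$MODEL" ]; then
    # -m: model file, -ngl: layers offloaded to GPU, -n: tokens to generate
    llama-cli -m "$MODEL" -ngl 99 -p "Hello" -n 32
else
    echo "llama-cli not found or model missing; run this inside a toolbox"
fi
```

The same command works in any of the backend variants, since each container ships a Llama.cpp build compiled against its respective backend.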

Quick Start & Requirements

  • Installation: Create toolboxes with `toolbox create`, pointing at one of the published container images (e.g., `docker.io/kyuz0/…`)
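A minimal sketch of that installation step, assuming the Toolbx CLI is installed; the image tag below is a placeholder (the actual backend-specific tags are listed in the repository's README), and the toolbox name is hypothetical:

```shell
# Sketch: create and enter a Strix Halo LLM toolbox.
IMAGE="docker.io/kyuz0/<backend-image>"   # placeholder: pick a real tag from the README
NAME="llm-vulkan"                         # hypothetical toolbox name

# Only attempt creation when the Toolbx CLI is available.
if command -v toolbox >/dev/null 2>&1; then
    toolbox create --image "$IMAGE" "$NAME" || echo "create failed (placeholder image)"
    echo "enter it with: toolbox enter $NAME"
else
    echo "Toolbx CLI not installed"
fi
```

Once entered, the container provides the Llama.cpp binaries built for the chosen backend without touching the host system.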
Health Check
Last Commit

14 hours ago

Responsiveness

Inactive

Pull Requests (30d)
2
Issues (30d)
8
Star History
116 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Johannes Hagemann (Cofounder of Prime Intellect), and 4 more.

S-LoRA by S-LoRA

0.1%
2k
System for scalable LoRA adapter serving
Created 2 years ago
Updated 2 years ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Ying Sheng (Coauthor of SGLang).

fastllm by ztxz16

0.1%
4k
High-performance C++ LLM inference library
Created 2 years ago
Updated 1 month ago