zolotukhin
Efficient LLM inference for AMD GPUs and Apple Silicon
Top 89.4% on SourcePulse
Summary
ZINC addresses the challenge of running local Large Language Models (LLMs) efficiently on consumer AMD GPUs and Apple Silicon, platforms often underserved by existing inference engines. It provides a single, self-contained binary solution for users seeking high performance without complex dependencies like ROCm or Python. The project targets engineers and power users who want to leverage their hardware for LLM inference.
How It Works
This project is built entirely in Zig, compiling to a single, dependency-free binary. It leverages Vulkan for AMD GPUs on Linux and Metal for Apple Silicon on macOS, employing hand-tuned shaders specifically optimized for each architecture's strengths. ZINC automatically selects the correct backend at build time, offering an OpenAI-compatible API and an integrated browser-based chat UI for ease of use.
Quick Start & Requirements
Build with zig build -Doptimize=ReleaseFast. Run preflight checks with ./zig-out/bin/zinc --check. Model management commands include list, pull, use, and rm. Execute inference via ./zig-out/bin/zinc --model-id <model_id> --prompt "..." or launch the chat UI with ./zig-out/bin/zinc chat.
Highlighted Details
Highlighted details include a single-binary ReleaseFast build, an OpenAI-compatible API served under /v1, and a built-in browser chat interface.
Maintenance & Community
Recent validation snapshots (e.g., 2026-03-31) indicate active development. The README mentions CONTRIBUTING.md and a Code of Conduct, suggesting established development practices, though explicit community links (Discord/Slack) are not provided.
Licensing & Compatibility
Limitations & Caveats
The project is explicitly marked as "Still Rough." Key features like continuous batching and multi-tenant serving are still under development. Performance tuning for Apple Silicon is ongoing, with the RDNA4 path being more mature. The list of supported GGUF models is intentionally narrow.