spark-arena
LLM inference management for NVIDIA DGX Spark systems
Top 96.8% on SourcePulse
This project provides sparkrun, a command-line tool for launching, managing, and stopping Large Language Model (LLM) inference workloads on NVIDIA DGX Spark systems. It aims to remove the complexity of traditional cluster management tools such as Slurm or Kubernetes, giving users who only want to run LLM inference a simpler workflow.
How It Works
sparkrun uses a single command-line interface to manage LLM workloads across one or more DGX Spark nodes. It supports multiple runtimes, integrating with popular inference engines such as vLLM, SGLang, and llama.cpp. For multi-node tensor parallelism, it automatically detects and configures InfiniBand/RDMA networking so that distributed inference runs efficiently. Workloads are defined as recipes in a Git-based registry, which lets users access, share, and manage model configurations, including official, community, and custom benchmarks. A guided setup wizard automates the initial cluster configuration: SSH mesh setup, network detection, and resource management daemon configuration.
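The setup wizard and recipe inspection correspond to the only two commands this summary names, `uvx sparkrun setup` and `sparkrun show <recipe>`. The sketch below wraps them in a dry-run helper so they can be reviewed before anything touches a cluster. The `run` helper, the `DRY_RUN` variable, and the recipe name `example-recipe` are illustrative assumptions, not part of sparkrun; no other sparkrun subcommands or flags are assumed.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper, not part of sparkrun):
# prints each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical recipe name; substitute one from your own registry.
RECIPE="${RECIPE:-example-recipe}"

# Install sparkrun and start the guided setup wizard (command from the summary).
run uvx sparkrun setup

# Inspect a recipe before launching it (command from the summary).
run sparkrun show "$RECIPE"
```

With the default settings the script only echoes the two commands. Running it with `DRY_RUN=0` executes them for real, which assumes `uvx` is installed and the machine is a DGX Spark node.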
Quick Start & Requirements
uvx sparkrun setup installs sparkrun and launches a guided setup wizard.
Highlighted Details
sparkrun show <recipe>).
Maintenance & Community
The project is sponsored and appears to be actively maintained, with official recipes hosted on GitHub. The Spark Arena platform serves as a community hub for sharing recipes and benchmark results.
Licensing & Compatibility
Licensed under the Apache License 2.0. This license is permissive and generally compatible with commercial use and linking within closed-source projects.
Limitations & Caveats
This tool is designed specifically for NVIDIA DGX Spark hardware and infrastructure. It deliberately abstracts away standard cluster schedulers like Slurm and container orchestrators like Kubernetes, which may be a limitation for environments that do not use DGX Spark systems or that need the finer-grained control those tools offer.