PrismML-Eng
Run large language models locally with optimized backends
Top 66.1% on SourcePulse
This project provides a streamlined demo and setup process for running Bonsai language models locally across diverse hardware. It targets engineers and researchers seeking an accessible way to deploy LLMs on Mac (Metal, Apple Silicon), Linux, and Windows (CUDA/CPU), offering a unified interface for model inference.
How It Works
The project integrates two inference backends: llama.cpp for broad cross-platform compatibility (GGUF format) and MLX for optimized performance on Apple Silicon (MLX format). It relies on custom forks of both (PrismML-Eng/llama.cpp, PrismML-Eng/mlx), which add inference kernels not yet available upstream, so the models can run without waiting for those kernels to be merged.
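The backend split described above can be illustrated with a small sketch. This is not the project's own selection logic, which the summary does not show; the function name `pick_backend` and the returned dictionary keys are hypothetical, and the rule (MLX on Apple Silicon, llama.cpp with GGUF elsewhere) is an assumption drawn from the description.

```python
import platform


def pick_backend(system: str, machine: str) -> dict:
    """Suggest a backend and model format for a platform (hypothetical helper).

    Assumption: Apple Silicon reports system "Darwin" and machine "arm64";
    every other combination falls back to llama.cpp with GGUF models.
    """
    if system == "Darwin" and machine == "arm64":
        return {"backend": "mlx", "model_format": "MLX"}
    return {"backend": "llama.cpp", "model_format": "GGUF"}


if __name__ == "__main__":
    # Inspect the current machine using the standard library only.
    print(pick_backend(platform.system(), platform.machine()))
```

Since llama.cpp also targets Metal, a real setup on a Mac may offer both backends; the sketch returns only one choice to keep the mapping easy to read.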
Quick Start & Requirements
Setup: ./setup.sh (macOS/Linux) or .\setup.ps1 (Windows).
Prerequisites: build-essential; uv and a virtual environment.
PRISM_HF_TOKEN: Required for downloading models from private HuggingFace repositories.
The setup.sh/setup.ps1 script automates dependency installation, environment setup, model downloads, and binary acquisition/compilation, implying a comprehensive initial setup.
Highlighted Details
Supports GGUF (for llama.cpp) and MLX (for MLX) model formats.
Maintenance & Community
The project maintains custom forks of llama.cpp and MLX, suggesting active development on these core components. Community support is available via Discord.
Licensing & Compatibility
The specific open-source license is not explicitly stated in the provided README, which is a critical omission for due diligence. The project is designed for local execution on macOS (Apple Silicon, Metal), Linux (CUDA, CPU), and Windows (CUDA).
Limitations & Caveats
Requires custom forks of llama.cpp and MLX because the necessary kernels are missing upstream. Model downloads require a PRISM_HF_TOKEN, indicating reliance on private HuggingFace repositories. Because the setup script handles dependency installation, model downloads, and binary acquisition/compilation in one pass, it may need careful monitoring on less common system configurations.
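The token requirement and the per-platform setup script lend themselves to a quick pre-flight check before launching setup. The sketch below is an illustration only: `setup_script` and `preflight` are hypothetical helper names, and the only facts taken from the summary are the `PRISM_HF_TOKEN` variable and the `setup.sh` / `setup.ps1` script names.

```python
import os
import platform


def setup_script(system: str) -> str:
    """Return the setup script name for a platform (hypothetical helper)."""
    # The summary lists setup.ps1 for Windows and setup.sh for macOS/Linux.
    return "setup.ps1" if system == "Windows" else "setup.sh"


def preflight(env: dict, system: str) -> list:
    """Collect human-readable problems that would likely block setup."""
    problems = []
    if not env.get("PRISM_HF_TOKEN"):
        problems.append(
            "PRISM_HF_TOKEN is not set; model downloads from private "
            "HuggingFace repositories are expected to fail."
        )
    return problems


if __name__ == "__main__":
    system = platform.system()
    issues = preflight(dict(os.environ), system)
    for issue in issues:
        print("warning:", issue)
    print("next step: run", setup_script(system))
```

Passing the environment in as a dictionary, rather than reading `os.environ` inside the function, keeps the check easy to exercise without touching the real environment.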