Local inference runtime for generative AI models
Foundry Local enables on-device execution of generative AI models, targeting developers and power users who need to run AI locally without cloud dependencies. It offers enhanced privacy, reduced latency, and offline capabilities by leveraging ONNX Runtime and hardware acceleration, providing an OpenAI-compatible API for seamless integration.
How It Works
Foundry Local uses ONNX Runtime for optimized inference, automatically selecting and downloading the model variant best suited to the user's hardware (CPU, GPU, or NPU), so each machine runs the most efficient build available for its silicon. The project exposes an OpenAI-compatible API, allowing existing applications and workflows to interact with local models through familiar SDKs and REST calls.
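Because the surface is OpenAI-compatible, any HTTP client can hit the standard /v1/chat/completions route. A minimal Python sketch, assuming the service is listening on port 5273 (the actual port is assigned by the service at startup) and that the phi-3.5-mini variant from the quick start below has been loaded:

    # Raw REST call against the OpenAI-compatible endpoint.
    # Assumptions: port 5273 (check what the service reports at startup)
    # and "phi-3.5-mini" naming a downloaded model variant.
    import requests

    resp = requests.post(
        "http://localhost:5273/v1/chat/completions",
        json={
            "model": "phi-3.5-mini",
            "messages": [{"role": "user", "content": "What is ONNX Runtime?"}],
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])

The request and response bodies follow the OpenAI wire format, which is what lets existing tooling point at the local service without code changes.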
Quick Start & Requirements
Windows: winget install Microsoft.FoundryLocal
macOS: brew tap microsoft/foundrylocal && brew install foundrylocal
Run a model: foundry model run phi-3.5-mini (automatically downloads the optimal variant for the local hardware)
Python SDK: pip install foundry-local-sdk openai (usage sketch below)
JavaScript SDK: npm install foundry-local-sdk openai
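With the Python packages installed, the foundry-local-sdk manager can discover the service endpoint and resolve the hardware-specific model id, so neither needs to be hard-coded. A sketch following the pattern in the project's documentation; FoundryLocalManager and its endpoint, api_key, and get_model_info members are taken from that pattern and may differ between preview releases:

    # SDK flow: let the manager start or attach to the local service,
    # resolve the best model variant for this machine, then reuse the
    # standard OpenAI client against the discovered endpoint.
    # Names follow the project's documented sample; preview releases may differ.
    from foundry_local import FoundryLocalManager
    from openai import OpenAI

    alias = "phi-3.5-mini"
    manager = FoundryLocalManager(alias)  # downloads/loads the optimal variant

    client = OpenAI(base_url=manager.endpoint, api_key=manager.api_key)
    response = client.chat.completions.create(
        model=manager.get_model_info(alias).id,  # hardware-specific model id
        messages=[{"role": "user", "content": "Hello from Foundry Local."}],
    )
    print(response.choices[0].message.content)

The alias-to-id indirection matters because the same alias maps to different downloaded variants (CPU, GPU, or NPU builds) depending on the machine.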
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is currently in preview, so breaking changes are possible as development continues. Beyond general mentions of GPU and NPU support, the README does not detail which hardware acceleration backends are covered.