Foundry-Local  by microsoft

Local inference runtime for generative AI models

Created 5 months ago
604 stars

Top 54.2% on SourcePulse

GitHubView on GitHub
Project Summary

Foundry Local enables on-device execution of generative AI models, targeting developers and power users who need to run AI locally without cloud dependencies. It offers enhanced privacy, reduced latency, and offline capabilities by leveraging ONNX Runtime and hardware acceleration, providing an OpenAI-compatible API for seamless integration.

How It Works

Foundry Local utilizes ONNX Runtime for optimized inference, automatically selecting and downloading model variants best suited for the user's hardware (CPU, GPU, NPU). This approach ensures high performance and efficient resource utilization. The project exposes an OpenAI-compatible API, allowing existing applications and workflows to interact with local models using familiar SDKs and REST calls.

Quick Start & Requirements

  • Windows: winget install Microsoft.FoundryLocal
  • macOS: brew tap microsoft/foundrylocal && brew install foundrylocal
  • Models: foundry model run phi-3.5-mini (automatically downloads optimal variant)
  • SDKs: pip install foundry-local-sdk openai (Python), npm install foundry-local-sdk openai (JavaScript)
  • Documentation: Foundry Local Download

Highlighted Details

  • On-device inference for privacy and reduced latency.
  • OpenAI-compatible API for easy integration.
  • Optimized performance via ONNX Runtime and hardware acceleration.
  • Supports automatic model variant selection based on hardware.

Maintenance & Community

  • Actively seeking feedback during preview phase.
  • Issues and suggestions can be reported via GitHub Issues.
  • Discord community available.

Licensing & Compatibility

Limitations & Caveats

The project is currently in a preview phase, indicating potential for breaking changes and ongoing development. Specific hardware acceleration support details beyond general mentions of GPU/NPU are not detailed in the README.

Health Check
Last Commit

6 days ago

Responsiveness

1 week

Pull Requests (30d)
9
Issues (30d)
13
Star History
29 stars in the last 30 days

Explore Similar Projects

Starred by Anton Bukov Anton Bukov(Cofounder of 1inch Network), Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), and
14 more.

exo by exo-explore

0.4%
31k
AI cluster for running models on diverse devices
Created 1 year ago
Updated 6 months ago
Starred by Tobi Lutke Tobi Lutke(Cofounder of Shopify), Andrej Karpathy Andrej Karpathy(Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and
24 more.

open-webui by open-webui

0.6%
110k
Self-hosted AI platform for local LLM deployment
Created 1 year ago
Updated 1 day ago
Feedback? Help us improve.