kkokosa/dotLLM: Natively built .NET LLM inference engine
Top 71.8% on SourcePulse
Summary
dotLLM provides a high-performance LLM inference engine natively implemented in C#/.NET, targeting developers seeking efficient LLM integration. It bypasses Python/C++ wrappers, offering significant speedups via SIMD-optimized CPU and CUDA GPU backends, making advanced LLM capabilities accessible within the .NET ecosystem.
How It Works
This engine is built from the ground up in pure C#, leveraging System.Runtime.Intrinsics for SIMD CPU operations and PTX kernels via the CUDA Driver API for GPU acceleration. It supports transformer models such as Llama, Mistral, and Phi. Key design choices include zero-GC inference using unmanaged memory, memory-mapped GGUF loading for millisecond model startup, and a modular NuGet package structure. This native approach avoids the overhead of inter-process communication and foreign function interfaces.
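For illustration, here is a minimal sketch of the kind of kernel System.Runtime.Intrinsics enables: a 256-bit vectorized dot product with a scalar tail. This is not taken from dotLLM's source; the SimdDot class and Dot method names are hypothetical.

```csharp
using System;
using System.Runtime.Intrinsics;

static class SimdDot // hypothetical helper, not dotLLM's API
{
    // Dot product over float spans using 256-bit lanes (8 floats at a time),
    // falling back to scalar math for the remainder.
    public static float Dot(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        var acc = Vector256<float>.Zero;
        int lanes = Vector256<float>.Count; // 8 on AVX2-capable hardware
        int i = 0;
        for (; i <= a.Length - lanes; i += lanes)
        {
            var va = Vector256.Create(a.Slice(i, lanes));
            var vb = Vector256.Create(b.Slice(i, lanes));
            acc += va * vb; // element-wise multiply-accumulate
        }
        float sum = Vector256.Sum(acc); // horizontal reduction of the accumulator
        for (; i < a.Length; i++)       // scalar tail for leftover elements
            sum += a[i] * b[i];
        return sum;
    }
}
```

Vector256 operations fall back to software emulation when the hardware lacks 256-bit SIMD, so the same code still runs, just slower; production inference kernels typically layer FMA instructions and cache-blocked matrix multiplication on top of this pattern.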
Quick Start & Requirements
Installation options:

    dotnet tool install -g DotLLM.Cli --prerelease

Requires the .NET 10 runtime. GPU acceleration needs an NVIDIA GPU and the CUDA Toolkit. Python 3.10+ is used for scripts.

Website: https://dotllm.dev/
Documentation: docs/
Roadmap: docs/ROADMAP.md
Discussions: https://github.com/kkokosa/dotLLM/discussions
Highlighted Details
Zero-GC inference: tensors are allocated in unmanaged memory (NativeMemory.AlignedAlloc), avoiding managed heap allocations; see the sketch below.
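A rough sketch of this zero-GC allocation pattern, assumed for illustration rather than copied from dotLLM: tensor storage is placed in aligned unmanaged memory and accessed through a Span<float> view.

```csharp
using System;
using System.Runtime.InteropServices;

// Compile with unsafe blocks enabled. UnmanagedTensor is a hypothetical name.
static unsafe class UnmanagedTensor
{
    public static void Main()
    {
        const int length = 4096;
        // 64-byte alignment suits AVX-512 loads and whole cache lines.
        float* data = (float*)NativeMemory.AlignedAlloc((nuint)(length * sizeof(float)), 64);
        try
        {
            var tensor = new Span<float>(data, length); // safe view over the raw buffer
            tensor.Clear();
            tensor[0] = 1.0f;
        }
        finally
        {
            NativeMemory.AlignedFree(data); // explicit free; the GC never sees this memory
        }
    }
}
```

Maintenance & Community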
Authored by .NET MVP Konrad Kokosa. Community engagement happens via GitHub Discussions, and a detailed roadmap is available in docs/ROADMAP.md.
Licensing & Compatibility
Licensed under the GNU General Public License v3.0 (GPLv3). This strong copyleft license may affect commercial use or integration into proprietary software.
Limitations & Caveats
Native AOT builds are experimental. Speculative decoding is currently greedy-only. GPL v3 license has copyleft implications. Continuous batching and advanced scheduling are planned for future releases.