Deep learning library using composable compilers for high performance
Top 22.1% on sourcepulse
Luminal is a Rust-based deep learning library designed for high-performance inference and training through a composable, ahead-of-time compilation approach. It targets developers seeking maximum efficiency on diverse hardware, from consumer CPUs and Apple Silicon to NVIDIA GPUs, by compiling computation graphs into optimized, native code.
How It Works
Luminal employs a compile-time-first philosophy, building static computation graphs from 11 primitive operations. This allows its compilers (e.g., `CPUCompiler`, `MetalCompiler`, `CUDACompiler`) to perform aggressive optimizations such as kernel fusion and shape-specific code generation, treating the entire network as a single unit. In contrast to eager execution, complexity is pushed into the compiler, which enables hardware-specific tuning and high performance without manually maintaining divergent code paths per backend.
Quick Start & Requirements
To run the Llama 3 example:

cd ./examples/llama
./setup/setup.sh
cargo run --release --features <cuda|metal|cpu>

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats