Enhance LLM inference determinism
Batch Invariant Ops addresses non-determinism in LLM inference, particularly within PyTorch. It provides batch-invariant kernels to ensure consistent outputs across different batch sizes, benefiting researchers and engineers seeking reproducible machine learning results. The library offers a low-overhead, non-intrusive method to enhance the determinism of existing PyTorch models.
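The root cause is worth spelling out: floating-point addition is not associative, so a kernel that reorders its reductions as the batch size changes can return different results for the same input row. A minimal, library-independent illustration in plain Python:

```python
# Floating-point addition is not associative, which is the root cause of the
# batch-size non-determinism described above: a kernel that reorders its
# reductions for different batch sizes can produce different results.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # cancellation happens first, then 1.0 is added -> 1.0
right = a + (b + c)   # 1.0 is absorbed by -1e16 first (ulp at 1e16 is 2) -> 0.0
print(left, right)    # 1.0 0.0
assert left != right
```

Batch-invariant kernels avoid this by fixing the reduction order regardless of batch size.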
How It Works
The library leverages PyTorch's torch.Library mechanism to substitute standard PyTorch kernels with custom, batch-invariant implementations. This approach allows for seamless integration into existing PyTorch workflows, requiring minimal code modifications. By replacing operations like matrix multiplication and softmax with deterministic variants, it eliminates sources of numerical instability that can arise from varying batch processing orders or internal optimizations.
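A minimal sketch of that substitution mechanism, assuming only stock PyTorch. This is not the library's actual kernel: the real implementations are batch-invariant kernels registered under the CUDA dispatch key, while here we override aten::mm on CPU with a naive broadcast-and-sum matmul purely to show how torch.Library reroutes existing call sites:

```python
# Sketch only: override aten::mm via torch.Library so that every existing
# torch.mm call site dispatches to a custom implementation, with no changes
# to caller code. The real library registers its kernels for "CUDA".
import torch

a = torch.randn(4, 3)
b = torch.randn(3, 5)
ref = torch.mm(a, b)  # computed with the stock kernel, before the override

_lib = torch.library.Library("aten", "IMPL")  # attach impls to existing aten ops

def mm_custom(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Avoid calling torch.mm here, which would dispatch back to this function.
    return (x.unsqueeze(2) * y.unsqueeze(0)).sum(dim=1)

_lib.impl("mm", mm_custom, "CPU")  # torch.mm on CPU now routes to mm_custom

out = torch.mm(a, b)  # unchanged call site, custom kernel underneath
assert torch.allclose(out, ref, atol=1e-4)
```

Keeping a handle to the Library object alive is what keeps the override registered; dropping it restores the stock kernels.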
Quick Start & Requirements
pip install -e .
Requires: CUDA (torch.set_default_device('cuda') in examples), PyTorch.

Highlighted Details
Batch-invariant implementations are provided for torch.mm, torch.addmm, torch.log_softmax, and torch.mean.
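A simple way to probe whether a given op behaves batch-invariantly on your hardware; is_batch_invariant is a hypothetical helper written here for illustration, not part of the library:

```python
# Probe batch invariance: does row 0's result change bitwise when it is
# computed alone versus inside a larger batch?
import torch

def is_batch_invariant(op, batch: torch.Tensor) -> bool:
    alone = op(batch[:1])        # first row processed by itself
    together = op(batch)[:1]     # first row processed within the full batch
    return bool(torch.equal(alone, together))  # exact, bitwise comparison

x = torch.randn(32, 64)
invariant = is_batch_invariant(lambda t: torch.log_softmax(t, dim=-1), x)
print(invariant)  # True only if the kernel's reduction order ignores batch size
```

On GPU, stock kernels often fail this check for some shapes, which is exactly the behavior the library's replacement kernels are designed to eliminate.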
Maintenance & Community
No specific details on maintainers, community channels (Discord/Slack), or roadmap are provided in the README.
Licensing & Compatibility
The README does not specify the license type or compatibility notes for commercial use.
Limitations & Caveats
The library currently supports a limited set of PyTorch operations. Its effectiveness and integration may depend on the specific model architecture and PyTorch version used. The vLLM example requires an upstream PR for full integration.