Triton kernels for efficient LLM training
Liger Kernel provides a suite of optimized Triton kernels designed to significantly enhance the efficiency of Large Language Model (LLM) training. Targeting researchers and engineers working with LLMs, it offers substantial improvements in training throughput and memory usage, enabling larger models and longer context lengths.
How It Works
Liger Kernel leverages Triton's capabilities for low-level GPU programming to fuse common LLM operations like RMSNorm, RoPE, SwiGLU, and various loss functions. This fusion, combined with techniques like in-place computation and chunking, reduces memory bandwidth requirements and computational overhead. The kernels are designed for exact computation, ensuring no loss of accuracy compared to standard implementations.
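In practice, the fusion is applied by patching a supported Hugging Face model's modules with their Triton counterparts. Below is a minimal sketch for a Llama-family model; the per-kernel flags follow the project's patching API but should be verified against the installed version, and the checkpoint name is just an example:

from transformers import AutoModelForCausalLM

from liger_kernel.transformers import apply_liger_kernel_to_llama

# Patch Llama's module classes so that RMSNorm, RoPE, SwiGLU, and the
# loss computation run as fused Triton kernels.
apply_liger_kernel_to_llama(
    rope=True,
    rms_norm=True,
    swiglu=True,
    fused_linear_cross_entropy=True,  # fuses the lm_head matmul with the loss
)

# Patching happens at the class level, so load the model afterwards.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")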
Quick Start & Requirements
Install the stable release with pip install liger-kernel, or the nightly build with pip install liger-kernel-nightly. To install from source, git clone the repository and run pip install -e . from the checkout. transformers (>= 4.x) is required for the patching APIs.
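After installing, the quickest way to pick up the kernels is the AutoModel wrapper, which applies the matching patches automatically for supported architectures. A minimal sketch, with a placeholder checkpoint path:

from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Drop-in replacement for transformers' AutoModelForCausalLM: if the model's
# architecture is supported, the corresponding Liger kernels are applied
# before the weights are loaded.
model = AutoLigerKernelForCausalLM.from_pretrained("path/to/model")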
Highlighted Details
Maintenance & Community
Actively developed by LinkedIn, with significant community contributions (50+ PRs, 10+ contributors). NVIDIA, AMD, and Intel supply GPU resources. Integrations exist with Hugging Face, Lightning AI, Axolotl, and Llama-Factory, and a Discord channel is available for discussion.
Licensing & Compatibility
The project is licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
While generally stable, some kernels are marked as experimental. Model architectures not covered by the high-level patching APIs may require manual integration via the low-level APIs, as sketched below.
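As one example of manual integration, the fused linear cross entropy module can replace a separate lm_head projection followed by nn.CrossEntropyLoss, chunking the projection and the loss so the full logits tensor is never materialized. A minimal sketch with illustrative tensor shapes; the (weight, input, target) argument order follows the module's forward signature and should be checked against the installed version:

import torch

from liger_kernel.transformers import LigerFusedLinearCrossEntropyLoss

batch, seq_len, hidden, vocab = 4, 2048, 4096, 128256  # illustrative sizes
hidden_states = torch.randn(batch * seq_len, hidden, device="cuda", dtype=torch.bfloat16)
lm_head_weight = torch.randn(vocab, hidden, device="cuda", dtype=torch.bfloat16)
labels = torch.randint(0, vocab, (batch * seq_len,), device="cuda")

# Computes the loss against the lm_head projection without allocating the
# (batch * seq_len, vocab) logits matrix.
loss_fn = LigerFusedLinearCrossEntropyLoss()
loss = loss_fn(lm_head_weight, hidden_states, labels)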