Training toolkit for LLMs & VLMs using Megatron
This repository provides Pai-Megatron-Patch, a toolkit for efficient training of and inference with Large Language Models (LLMs) and Vision-Language Models (VLMs) on the Megatron framework. It targets developers seeking to maximize GPU utilization for large-scale models, offering accelerated training techniques and broad model compatibility.
How It Works
Pai-Megatron-Patch follows a "patch" philosophy, extending Megatron-LM's capabilities without invasive source-code modifications. This approach keeps the toolkit compatible with Megatron-LM upgrades. It includes a model library with implementations of popular LLMs, bidirectional weight converters between Huggingface and Megatron checkpoint formats, and training acceleration via Flash Attention 2.0 and FP8 through Transformer Engine.
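As a rough illustration of the conversion workflow described above, the sketch below shows how a Huggingface checkpoint might be re-sharded into Megatron-style tensor-parallel shards. The function names, sharding rule, and output file layout are illustrative assumptions, not the toolkit's actual API; the real converters also rename tensors per layer and handle pipeline parallelism.

```python
# Minimal sketch of Huggingface -> Megatron-style weight conversion.
# Names and file layout are hypothetical, for illustration only.
import os
import torch
from transformers import AutoModelForCausalLM


def convert_hf_to_megatron(hf_model_dir: str, out_dir: str,
                           tensor_parallel_size: int = 2) -> None:
    """Load a Huggingface checkpoint and split it into per-rank shards."""
    model = AutoModelForCausalLM.from_pretrained(hf_model_dir,
                                                 torch_dtype=torch.bfloat16)
    state_dict = model.state_dict()
    os.makedirs(out_dir, exist_ok=True)

    for rank in range(tensor_parallel_size):
        # Crude sharding rule: split 2-D weights along dim 0, replicate the rest.
        # The real toolkit chooses the split axis per tensor type.
        shard = {
            name: (tensor.chunk(tensor_parallel_size, dim=0)[rank]
                   if tensor.dim() > 1 else tensor)
            for name, tensor in state_dict.items()
        }
        torch.save(shard, os.path.join(out_dir, f"mp_rank_{rank:02d}.pt"))
```

The reverse direction (Megatron to Huggingface) would concatenate the per-rank shards back along the same axes before saving with `save_pretrained`.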
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats