invergent-ai
High-performance AI model training and fine-tuning
Top 43.8% on SourcePulse
Surogate Trainer is a high-performance framework designed for rapid experimentation in training and fine-tuning large language models, targeting developers and enterprises. It offers significant speedups and VRAM efficiency through advanced quantization techniques like FP8 and FP4, alongside a native C++/CUDA engine, aiming to surpass existing training frameworks.
How It Works
Surogate uses a native C++/CUDA engine to target "Speed-Of-Light" (SOL) throughput and supports FP8 and FP4 (NVFP4) training and fine-tuning. Its core design includes smart CPU offloading for weights, gradients, and activations, which the project says gives lower VRAM usage and better performance than methods like QLoRA. The framework supports mixed-precision training and offers a Python DSL with Ahead-of-Time (AOT) auto-differentiation for integrating new model architectures.
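To make the FP4 idea concrete, here is a minimal pure-Python sketch of block-scaled 4-bit quantization. It is an illustration only and not Surogate's implementation or API: the function names are hypothetical, the block size of 16 is an assumption, and the per-block scale is kept as an ordinary Python float rather than a low-precision value. The one fixed fact it relies on is the set of magnitudes an FP4 E2M1 value can represent (0, 0.5, 1, 1.5, 2, 3, 4, 6).

```python
# Illustrative sketch only: simulates block-scaled FP4 (E2M1) quantization.
# All names here are hypothetical; this is not Surogate code.

# Magnitudes representable by a 4-bit E2M1 float (the sign is a separate bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_MAX = 6.0


def quantize_block(block):
    """Map one block to (scale, codes); each code is a signed E2M1 grid value."""
    amax = max((abs(x) for x in block), default=0.0)
    # Choose the scale so the largest magnitude lands on the top grid value.
    scale = amax / E2M1_MAX if amax > 0 else 1.0
    codes = []
    for x in block:
        mag = abs(x) / scale
        # Nearest grid point; ties resolve to the smaller magnitude (a simplification).
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a block's scale and codes."""
    return [c * scale for c in codes]


def fake_quantize(values, block_size=16):
    """Quantize then dequantize, block by block, to expose the rounding error."""
    out = []
    for start in range(0, len(values), block_size):
        scale, codes = quantize_block(values[start:start + block_size])
        out.extend(dequantize_block(scale, codes))
    return out
```

Comparing `fake_quantize(weights)` with the original `weights` shows the rounding error that a per-block scale keeps bounded: each block's largest value is always mapped onto the top grid point, so a single outlier only degrades the precision of its own block.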
Quick Start & Requirements
Install with the install.sh script (auto-detects CUDA) or build from source.
Documentation is available at https://docs.surogate.ai, and examples can be found at https://github.com/invergent-ai/surogate/tree/master/examples.
Highlighted Details
Maintenance & Community
The project appears actively maintained, with support for recent hardware and models. Community interaction is primarily through their Twitter handle @surogate_ai. Contributions are welcomed via GitHub pull requests.
Licensing & Compatibility
The project is licensed under the Apache 2.0 license, which is generally permissive for commercial use and integration into closed-source projects.
Limitations & Caveats
Native FP4 training (NVFP4) is explicitly stated to require Blackwell+ GPUs (SM100+). Docker images are currently limited to the x86_64 architecture. Users requiring support for specific model architectures not yet listed are encouraged to contribute via pull requests.
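The hardware requirement above can be checked before launching a run. The sketch below is a hypothetical helper, not part of Surogate, and it assumes that "SM100+" means a CUDA compute capability of at least 10.0.

```python
# Hypothetical helper, not part of Surogate.
# Assumption: "SM100+" is read as CUDA compute capability >= 10.0.

def supports_native_fp4(major, minor=0):
    """Return True if compute capability (major, minor) is at least 10.0."""
    return (major, minor) >= (10, 0)
```

If PyTorch is installed and a GPU is visible, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple that can be passed in as `supports_native_fp4(*torch.cuda.get_device_capability())`.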