Automodel by NVIDIA-NeMo

PyTorch-native SPMD library for LLM/VLM training

Created 9 months ago

362 stars

Top 77.9% on SourcePulse


2 Experts Love This Project

Jiaming Song

Chief Scientist at Luma AI

Jeff Hammerbacher

Cofounder of Cloudera

Project Summary

NeMo AutoModel is an open-source PyTorch training library designed to streamline and scale the training and fine-tuning of Large Language Models (LLMs) and Vision-Language Models (VLMs). It targets researchers and engineers, enabling rapid experimentation from small-scale setups to massive multi-GPU, multi-node deployments. The library offers flexibility, reproducibility, and high performance with minimal ceremony, featuring seamless integration with Hugging Face models.

How It Works

The core idea is an SPMD (Single Program, Multiple Data) approach that is native to PyTorch Distributed and uses DTensor for parallelism. Under this "one program, any scale" philosophy, a single training script runs across different hardware configurations by adjusting only the distributed mesh. Parallelism strategies (such as tensor, sequence, and data parallelism) are defined in configuration files rather than by rewriting model code, which decouples model logic from parallel execution. This composable, portable design makes it easier to scale up, switch strategies, and reason about failure modes.
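To make the "adjust the mesh, not the model" idea concrete, here is a minimal pure-Python sketch of how a set of parallelism degrees could be turned into a rank layout. It is an illustration only and does not use or reproduce NeMo AutoModel's API: `MeshConfig`, `build_mesh`, and the (pp, dp, cp, tp) dimension ordering are all hypothetical choices made for this example. The data-parallel degree is assumed to be whatever remains of the world size after the other dimensions are accounted for.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MeshConfig:
    """Hypothetical parallelism settings, as they might appear in a config file."""
    tp: int = 1  # tensor-parallel degree
    cp: int = 1  # context-parallel degree
    pp: int = 1  # pipeline-parallel degree


def build_mesh(world_size: int, cfg: MeshConfig) -> dict[int, dict[str, int]]:
    """Map every global rank to its coordinates in a (pp, dp, cp, tp) mesh.

    The data-parallel degree is inferred: it is the part of the world size
    left over once tensor, context, and pipeline parallelism are applied.
    """
    model_parallel = cfg.tp * cfg.cp * cfg.pp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*cp*pp={model_parallel}"
        )
    dims = {
        "pp": cfg.pp,
        "dp": world_size // model_parallel,
        "cp": cfg.cp,
        "tp": cfg.tp,
    }
    # itertools.product varies the last dimension fastest, so neighbouring
    # ranks differ in their tensor-parallel coordinate first.
    coords = product(*(range(size) for size in dims.values()))
    return {rank: dict(zip(dims, coord)) for rank, coord in enumerate(coords)}


if __name__ == "__main__":
    # The same function serves both setups; only the configuration changes.
    for world_size, cfg in [(8, MeshConfig(tp=2)), (16, MeshConfig(tp=2, pp=2))]:
        mesh = build_mesh(world_size, cfg)
        print(world_size, cfg, mesh[world_size - 1])
```

In real PyTorch code the equivalent layout would normally come from a device mesh object rather than a hand-built dictionary; the sketch only shows why changing a few integers in a config is enough to describe a different scale.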

Quick Start & Requirements

Installation: Use uv for environment management. Create an environment with uv venv and sync dependencies with uv sync --frozen --all-extras, then install the package with uv pip install nemo_automodel, or install from source with uv pip install git+https://github.com/NVIDIA-NeMo/Automodel.git.
Prerequisites: Python 3.10+. NVIDIA GPUs and CUDA are not stated explicitly but are implied for distributed training.
Documentation: https://docs.nvidia.com/nemo/automodel/latest/index.html
Examples: https://github.com/NVIDIA-NeMo/Automodel/tree/main/examples

Highlighted Details

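SPMD Parallelism: Configuration-driven parallelism (FSDP2, tensor parallelism (TP), context parallelism (CP), sequence parallelism (SP), pipeline parallelism, and hybrid sharded data parallelism (HSDP)) without model code modification.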
Hugging Face Integration: Native support for a vast array of LLMs and VLMs from the Hugging Face Hub.
Training Capabilities: Supports LLM/VLM pre-training, Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), and Knowledge Distillation.
Performance: Demonstrates high training throughput on NVIDIA GPUs, with optimizations like FP8 support and sequence packing.
Interoperability: Integrates with NeMo RL, Hugging Face, and offers Megatron Bridge conversions.
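Sequence packing, mentioned under Performance above, reduces wasted compute by placing several short samples into one fixed-length slot instead of padding each sample on its own. The sketch below uses a generic greedy first-fit-decreasing heuristic to show the idea; it is not taken from NeMo AutoModel, and `pack_sequences` and `padding_fraction` are hypothetical helper names introduced for this example.

```python
def pack_sequences(lengths: list[int], max_len: int) -> list[list[int]]:
    """Group sequence indices into packs whose total length fits in max_len.

    Uses first-fit decreasing: visit sequences from longest to shortest and
    put each one into the first pack that still has room for it.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    packs: list[list[int]] = []   # sequence indices in each pack
    free: list[int] = []          # remaining token capacity of each pack
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sequence {i} has length {n} > max_len={max_len}")
        for p, room in enumerate(free):
            if n <= room:
                packs[p].append(i)
                free[p] -= n
                break
        else:
            # No existing pack has room, so open a new one.
            packs.append([i])
            free.append(max_len - n)
    return packs


def padding_fraction(lengths: list[int], num_slots: int, max_len: int) -> float:
    """Fraction of token slots that hold padding rather than real tokens."""
    return 1.0 - sum(lengths) / (num_slots * max_len)


if __name__ == "__main__":
    lengths = [5, 3, 8, 2, 6]
    max_len = 8
    packs = pack_sequences(lengths, max_len)
    print("packs:", packs)
    # One slot per sequence (no packing) versus one slot per pack.
    print("padding without packing:", padding_fraction(lengths, len(lengths), max_len))
    print("padding with packing:", padding_fraction(lengths, len(packs), max_len))
```

A real training pipeline would also need to keep attention from crossing sample boundaries inside a pack (for example through per-sample position IDs or attention masks), which this sketch leaves out.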

Maintenance & Community

The project is under active development, with regular updates and a roadmap towards a stable release. Contributions are welcomed via the provided contributing guide. Recent news highlights new model support and technical advancements. No explicit community channels (e.g., Discord, Slack) are listed.

Licensing & Compatibility

Licensed under the Apache License 2.0, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

NeMo AutoModel is actively under development, and users should expect the interface to evolve as the project moves towards a stable release. New features and improvements are continuously being added.

Health Check

Last Commit

21 hours ago

Responsiveness

Inactive

Pull Requests (30d)

253

Issues (30d)

Star History

72 stars in the last 30 days