DALI by NVIDIA

GPU-accelerated library for data pre-processing in deep learning

Created 7 years ago
5,509 stars

Top 9.2% on SourcePulse

Project Summary

NVIDIA DALI is a GPU-accelerated library designed to eliminate data loading and pre-processing bottlenecks in deep learning workflows. It offers a collection of optimized building blocks for image, video, and audio data, enabling users to create portable, high-throughput data pipelines that can be seamlessly integrated with popular frameworks like TensorFlow, PyTorch, and PaddlePaddle.

How It Works

DALI addresses CPU-bound data processing by offloading operations to the GPU. It utilizes a custom execution engine optimized for throughput, featuring prefetching, parallel execution, and batch processing. This GPU-centric approach, combined with a flexible, functional Python API, allows for the creation of complex, multi-stage data augmentation and transformation pipelines that run efficiently, directly feeding data to the GPU for training or inference.
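
As an illustration of that API, the sketch below builds a small image pipeline: read files, decode JPEGs on the GPU, resize, and normalize. The directory path, batch size, and image dimensions are placeholders, not defaults.

    # Minimal DALI pipeline sketch; paths and sizes below are illustrative.
    from nvidia.dali import pipeline_def
    import nvidia.dali.fn as fn
    import nvidia.dali.types as types

    @pipeline_def(batch_size=32, num_threads=4, device_id=0)
    def image_pipeline(data_dir):
        jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True, name="Reader")
        # "mixed" decodes on the GPU where possible; downstream ops then stay on the GPU.
        images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
        images = fn.resize(images, resize_x=224, resize_y=224)
        images = fn.crop_mirror_normalize(
            images,
            dtype=types.FLOAT,
            output_layout="CHW",
            mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
            std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
        )
        return images, labels

    pipe = image_pipeline(data_dir="/path/to/images")
    pipe.build()
    images, labels = pipe.run()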

Quick Start & Requirements

  • Install with pip install nvidia-dali-cuda120, or fetch the latest release from NVIDIA's package index with pip install --extra-index-url https://pypi.nvidia.com --upgrade nvidia-dali-cuda120.
  • Requires an NVIDIA driver that supports the target CUDA version (e.g., CUDA 12.x) and the corresponding CUDA Toolkit.
  • Pre-installed in NGC containers for TensorFlow, PyTorch, and PaddlePaddle; a minimal PyTorch integration sketch follows this list.
  • Official Documentation: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
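
As a minimal framework-integration sketch (reusing the image_pipeline defined above; the output names are illustrative):

    # Drive a PyTorch training loop from the DALI pipeline via the DALI PyTorch plugin.
    from nvidia.dali.plugin.pytorch import DALIGenericIterator

    pipe = image_pipeline(data_dir="/path/to/images")
    # reader_name lets the iterator infer the dataset size from the file reader.
    loader = DALIGenericIterator(pipe, output_map=["data", "label"], reader_name="Reader")

    for batch in loader:
        images = batch[0]["data"]   # GPU tensor, shape (batch, 3, 224, 224)
        labels = batch[0]["label"]
        # ... training step ...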

Highlighted Details

  • Supports numerous data formats, including LMDB, TFRecord, COCO, JPEG, WAV, FLAC, and video codecs (H.264, VP9, HEVC); a GPU video-reader sketch follows this list.
  • Accelerates common workloads like ResNet-50, SSD, and ASR models (Jasper, RNN-T).
  • Enables direct data transfer via GPUDirect Storage and integrates with NVIDIA Triton Inference Server.
  • Offers custom operator extensibility for user-specific needs.
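
As an example of the media support, a hedged sketch of a GPU-decoded video pipeline (file list and sequence length are placeholders):

    # Read fixed-length frame sequences from video files, decoded on the GPU.
    from nvidia.dali import pipeline_def
    import nvidia.dali.fn as fn

    @pipeline_def(batch_size=4, num_threads=2, device_id=0)
    def video_pipeline(file_list):
        frames = fn.readers.video(
            device="gpu",              # video decoding runs on the GPU
            filenames=file_list,
            sequence_length=16,        # frames per returned sequence
            random_shuffle=True,
        )
        return frames

    pipe = video_pipeline(["/path/to/clip0.mp4", "/path/to/clip1.mp4"])
    pipe.build()
    (frames,) = pipe.run()   # one batch of 16-frame sequences, resident on the GPU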

Licensing & Compatibility

  • Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

  • Requires NVIDIA GPU hardware and specific CUDA toolkit versions for installation and operation; building from source requires following the compilation guide.

Health Check

  • Last commit: 16 hours ago
  • Responsiveness: 1 day
  • Pull requests (30d): 32
  • Issues (30d): 5
  • Star history: 31 stars in the last 30 days

Explore Similar Projects

  • oslo by tunib-ai: Framework for large-scale transformer optimization. 309 stars; created 3 years ago, last updated 3 years ago. Starred by Tri Dao (Chief Scientist at Together AI), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 1 more.
  • LitServe by Lightning-AI: AI inference pipeline framework. 4k stars; created 1 year ago, last updated 1 day ago. Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (Cofounder of Lightning AI), and 3 more.
  • fastllm by ztxz16: High-performance C++ LLM inference library. 4k stars; created 2 years ago, last updated 1 week ago. Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Ying Sheng (Coauthor of SGLang).
  • datasets by huggingface: Access and process large AI datasets efficiently. 21k stars; created 5 years ago, last updated 1 day ago. Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 26 more.