DALI by NVIDIA

GPU-accelerated library for data pre-processing in deep learning

created 7 years ago
5,476 stars

Top 9.4% on sourcepulse

Project Summary

NVIDIA DALI is a GPU-accelerated library designed to eliminate data loading and pre-processing bottlenecks in deep learning workflows. It offers a collection of optimized building blocks for image, video, and audio data, enabling users to create portable, high-throughput data pipelines that can be seamlessly integrated with popular frameworks like TensorFlow, PyTorch, and PaddlePaddle.

How It Works

DALI addresses CPU-bound data processing by offloading operations to the GPU. It uses a custom execution engine optimized for throughput, with prefetching, parallel execution, and batch processing. Combined with a flexible, functional Python API, this GPU-centric approach lets users build complex, multi-stage data augmentation and transformation pipelines that run efficiently and feed data directly to the GPU for training or inference.
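
As a rough illustration of that functional API, the sketch below defines a small image pipeline. The data path, batch size, and augmentation parameters are placeholder assumptions for illustration, not values taken from the DALI documentation.

    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def(batch_size=32, num_threads=4, device_id=0)
    def image_pipeline():
        # Read encoded JPEGs from disk (CPU), then decode on the GPU ("mixed" device)
        jpegs, labels = fn.readers.file(file_root="/data/train", random_shuffle=True, name="Reader")
        images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
        # GPU-side augmentation: random resized crop, random mirror, normalize to CHW float
        images = fn.random_resized_crop(images, size=[224, 224])
        images = fn.crop_mirror_normalize(
            images,
            dtype=types.FLOAT,
            output_layout="CHW",
            mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
            std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
            mirror=fn.random.coin_flip(),
        )
        return images, labels

    pipe = image_pipeline()
    pipe.build()
    images, labels = pipe.run()  # one prefetched batch of outputs, as DALI TensorLists

Decoded and augmented images stay on the GPU from the decode step onward, which is what removes the CPU bottleneck described above.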

Quick Start & Requirements

  • Install with pip install nvidia-dali-cuda120 or pip install --extra-index-url https://pypi.nvidia.com --upgrade nvidia-dali-cuda120; a PyTorch integration sketch follows this list.
  • Requires NVIDIA driver supporting the target CUDA version (e.g., CUDA 12.x) and the corresponding CUDA Toolkit.
  • Pre-installed in NGC containers for TensorFlow, PyTorch, and PaddlePaddle.
  • Official Documentation: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
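
Once installed, a pipeline such as the hypothetical image_pipeline sketched under "How It Works" can be consumed from PyTorch through DALI's plugin iterator. The example below is a minimal sketch of that pattern, assuming a single GPU and the reader name "Reader" used in the earlier sketch.

    from nvidia.dali.plugin.pytorch import DALIGenericIterator, LastBatchPolicy

    pipe = image_pipeline()          # pipeline sketched in "How It Works"
    pipe.build()
    loader = DALIGenericIterator(
        [pipe],
        output_map=["images", "labels"],
        reader_name="Reader",        # must match name= passed to fn.readers.file
        last_batch_policy=LastBatchPolicy.PARTIAL,
    )

    for batch in loader:
        images = batch[0]["images"]  # CUDA tensor, already on the GPU
        labels = batch[0]["labels"]
        # ... run the training step here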

Highlighted Details

  • Supports numerous data formats including LMDB, TFRecord, COCO, JPEG, WAV, FLAC, and video codecs (H.264, VP9, HEVC); a COCO reader sketch follows this list.
  • Accelerates common workloads like ResNet-50, SSD, and ASR models (Jasper, RNN-T).
  • Enables direct data transfer via GPUDirect Storage and integrates with NVIDIA Triton Inference Server.
  • Offers custom operator extensibility for user-specific needs.
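
As a concrete example of one of the formats above, a COCO-style detection reader follows the same functional pattern. This is a minimal sketch with placeholder paths and an assumed SSD-style input size, not a verbatim recipe from the DALI docs.

    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def(batch_size=16, num_threads=4, device_id=0)
    def coco_pipeline():
        # Placeholder paths to a standard COCO directory layout
        jpegs, bboxes, labels = fn.readers.coco(
            file_root="/data/coco/train2017",
            annotations_file="/data/coco/annotations/instances_train2017.json",
            ratio=True,              # return bounding boxes as relative coordinates
            random_shuffle=True,
        )
        images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
        images = fn.resize(images, resize_x=300, resize_y=300)   # SSD-style input size
        return images, bboxes, labels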

Maintenance & Community

Licensing & Compatibility

  • Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

  • Requires NVIDIA GPU hardware and specific CUDA toolkit versions for installation and operation. Building from source is more involved and requires following the project's compilation guide.
Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 16
  • Issues (30d): 0

Star History

  • 109 stars in the last 90 days

Explore Similar Projects

InternEvo by InternLM
  • Lightweight training framework for model pre-training
  • 1.0% · 402 stars · created 1 year ago · updated 1 week ago
  • Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake)

TensorRT-LLM by NVIDIA
  • LLM inference optimization SDK for NVIDIA GPUs
  • 0.6% · 11k stars · created 1 year ago · updated 22 hours ago
  • Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Omar Sanseviero (DevRel at Google DeepMind), and 5 more