data-science-stack  by NVIDIA

NVIDIA Data Science Stack: tool for GPU-accelerated data science setup

Created 5 years ago
392 stars

Top 73.4% on SourcePulse

Project Summary

This repository provides the NVIDIA Data Science Stack, a tool designed to simplify the setup and management of GPU-accelerated data science environments on workstations and cloud VMs. It targets data scientists and researchers seeking a streamlined way to deploy and manage their development stacks, offering both containerized and local Conda environment options.

How It Works

The stack uses a single shell script to automate system configuration, including NVIDIA driver installation and SELinux policy setup for containerized GPU access. Users can then build and run Jupyter environments either in Docker containers or in local Conda environments. The script manages dependencies and provides commands for building, running, purging, and upgrading these environments, which removes much of the complexity of manual setup.
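
The one-script, many-subcommands pattern described above can be sketched as a small dispatcher. This is an illustration only, not the repository's code: apart from `setup-system`, every subcommand name here is hypothetical, and each branch only prints what a real step would do.

```shell
#!/usr/bin/env bash
# Illustrative only: a toy dispatcher for the single-script pattern.
# Subcommand names other than setup-system are hypothetical, and the
# branches print a description instead of doing any real work.

dispatch() {
  case "$1" in
    setup-system)    echo "configure system: NVIDIA driver, SELinux policy" ;;
    build-container) echo "build Jupyter container image" ;;
    run-container)   echo "run Jupyter inside container" ;;
    build-conda-env) echo "create local Conda environment" ;;
    purge-*)         echo "remove environment: ${1#purge-}" ;;
    upgrade)         echo "upgrade existing installation" ;;
    *)               echo "unknown command: $1" >&2; return 1 ;;
  esac
}

# Dispatch only when a subcommand was given on the command line.
if [ "$#" -gt 0 ]; then
  dispatch "$@"
fi
```

Keeping every action behind one entry point means users only need to learn one script, and the container and Conda paths can share the same system setup step.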

Quick Start & Requirements

  • Install/Run: Clone the repository and execute ./data-science-stack setup-system.
  • Prerequisites: Ubuntu 18.04/20.04 or RHEL 7.5+/8.x, NVIDIA GPU (Pascal or newer), NVIDIA GPU Driver (>= 460.39). WSL v2 support is alpha and container-only.
  • Setup Time: Not stated. System setup and environment builds involve large downloads and installations, so expect them to take a while.
  • Docs: https://github.com/NVIDIA/data-science-stack
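
The driver prerequisite above can be checked before running `setup-system`. The following is a minimal sketch and not part of the repository: the helper names `version_ge` and `check_driver` are hypothetical, and it assumes GNU `sort -V` is available and that `nvidia-smi` is on the PATH when a driver is installed.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of data-science-stack itself.

MIN_DRIVER="460.39"   # minimum driver version listed in the prerequisites

# Return 0 if version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Report whether a given driver version satisfies the minimum.
check_driver() {
  if version_ge "$1" "$MIN_DRIVER"; then
    echo "driver $1: ok"
  else
    echo "driver $1: too old (need >= $MIN_DRIVER)"
  fi
}

# Query the installed driver only if nvidia-smi is present.
if command -v nvidia-smi >/dev/null 2>&1; then
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  check_driver "$installed"
fi
```

On a machine without a driver the script prints nothing, which is expected on a fresh system where `setup-system` has not yet installed one.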

Highlighted Details

  • Supports both containerized (Docker) and local Conda environments for flexibility.
  • Automates NVIDIA driver installation and system configuration on supported Linux distributions.
  • Includes optional tools like jupyter-repo2docker, NVIDIA GPU Cloud CLI, Kaggle CLI, and AWS CLI.
  • Provides mechanisms for managing multiple users and upgrading existing installations.

Maintenance & Community

The project is maintained by NVIDIA. Issue tracking and release planning are available via GitHub Projects and Issues. Users can subscribe to release notifications by watching the repository.

Licensing & Compatibility

The repository itself is licensed under the Apache 2.0 license. However, it installs and configures NVIDIA drivers and software, which have their own licensing terms. Compatibility with commercial or closed-source applications depends on the underlying NVIDIA software licenses.

Limitations & Caveats

  • WSL v2 support is alpha and container-only.
  • RHEL 7.x driver installation is complex and requires manual intervention.
  • SELinux policy setup is specific to DGX servers and may require customization for other systems.
  • Laptop power management setup is involved and may require manual Xorg configuration.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
