data-science-stack by NVIDIA

NVIDIA Data Science Stack: tool for GPU-accelerated data science setup

created 5 years ago
389 stars

Top 74.9% on sourcepulse

Project Summary

This repository provides the NVIDIA Data Science Stack, a tool designed to simplify the setup and management of GPU-accelerated data science environments on workstations and cloud VMs. It targets data scientists and researchers seeking a streamlined way to deploy and manage their development stacks, offering both containerized and local Conda environment options.

How It Works

The stack uses a single shell script, data-science-stack, to automate system configuration, including NVIDIA driver installation and SELinux policy setup for containerized GPU access. Users then choose between building and running Jupyter environments in Docker containers or in local Conda environments. The same script manages dependencies and provides commands for building, running, purging, and upgrading these environments, so most of the manual setup work is hidden behind a few commands.

Quick Start & Requirements

  • Install/Run: Clone the repository and execute ./data-science-stack setup-system.
  • Prerequisites: Ubuntu 18.04/20.04 or RHEL 7.5+/8.x, NVIDIA GPU (Pascal or newer), NVIDIA GPU Driver (>= 460.39). WSL v2 support is alpha and container-only.
  • Setup Time: Not stated. Expect system setup and environment builds to take a while because of large downloads and installations.
  • Docs: https://github.com/NVIDIA/data-science-stack
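
The driver prerequisite above can be checked before running setup. The following is a minimal sketch, not part of the repository: the helper names (parse_version, driver_meets_minimum, query_driver_version) are hypothetical, and it assumes nvidia-smi is on the PATH when a driver is already installed. The only value taken from the requirements is the 460.39 minimum.

```python
"""Pre-flight check for the driver prerequisite (hypothetical helper, not shipped with the repo)."""
import subprocess
from typing import Optional, Tuple

MIN_DRIVER = "460.39"  # minimum driver version quoted in the prerequisites


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '470.57.02' into (470, 57, 2)."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_meets_minimum(version: str, minimum: str = MIN_DRIVER) -> bool:
    """Compare versions numerically, component by component."""
    return parse_version(version) >= parse_version(minimum)


def query_driver_version() -> Optional[str]:
    """Ask nvidia-smi for the installed driver version; return None if unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # nvidia-smi is missing, not executable, or hung.
        return None
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi prints one line per GPU; the first line is used here.
    return result.stdout.strip().splitlines()[0].strip()


if __name__ == "__main__":
    installed = query_driver_version()
    if installed is None:
        print("No NVIDIA driver detected (nvidia-smi missing or failing).")
    elif driver_meets_minimum(installed):
        print(f"Driver {installed} satisfies >= {MIN_DRIVER}.")
    else:
        print(f"Driver {installed} is older than {MIN_DRIVER}.")
```

Versions are compared as integer tuples rather than strings, so a release such as 460.4 is correctly treated as older than 460.39. On a machine without a driver the script only reports that none was found; installing one is left to ./data-science-stack setup-system.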

Highlighted Details

  • Supports both containerized (Docker) and local Conda environments for flexibility.
  • Automates NVIDIA driver installation and system configuration on supported Linux distributions.
  • Includes optional tools like jupyter-repo2docker, NVIDIA GPU Cloud CLI, Kaggle CLI, and AWS CLI.
  • Provides mechanisms for managing multiple users and upgrading existing installations.

Maintenance & Community

The project is maintained by NVIDIA. Issue tracking and release planning are available via GitHub Projects and Issues. Users can subscribe to release notifications by watching the repository.

Licensing & Compatibility

The repository itself is licensed under the Apache 2.0 license. However, it installs and configures NVIDIA drivers and software, which have their own licensing terms. Compatibility with commercial or closed-source applications depends on the underlying NVIDIA software licenses.

Limitations & Caveats

  • WSL v2 support is alpha and container-only.
  • RHEL 7.x driver installation is complex and requires manual intervention.
  • SELinux policy setup is specific to DGX servers and may require customization for other systems.
  • Laptop power management configuration is involved and may require manual Xorg configuration.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days
