deep-learning-containers  by aws

Streamline deep learning workflows on AWS with pre-built Docker images

Created 6 years ago
1,129 stars

Top 34.0% on SourcePulse

Project Summary

AWS Deep Learning Containers (DLCs) provide pre-built, optimized Docker images for popular deep learning frameworks like TensorFlow, PyTorch, and MXNet. Designed for users deploying machine learning workloads on AWS, these containers simplify the setup and execution of training and inference tasks across services such as Amazon SageMaker, EC2, ECS, and EKS, offering a streamlined path to production.

How It Works

The project offers a set of Docker images pre-configured with deep learning frameworks, NVIDIA CUDA for GPU acceleration, and Intel MKL for CPU optimization. These images are hosted on Amazon ECR, ensuring readily available, tested, and optimized environments. The DLCs serve as the default execution environments for Amazon SageMaker jobs, abstracting away complex dependency management and providing a consistent, high-performance runtime.
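
Images hosted on Amazon ECR are addressed by a registry host followed by a repository name and a tag. As a minimal sketch, the helper below assembles such a URI using the standard ECR hostname pattern for commercial regions. The function name `ecr_image_uri` and the account ID, repository, and tag in the usage line are hypothetical placeholders, not actual DLC coordinates; consult the project's list of available images for real values.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build an image URI in the standard Amazon ECR form.

    Assumes a commercial AWS region; other partitions use a different
    domain suffix and are not handled here.
    """
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Hypothetical values, for illustration only.
print(ecr_image_uri("123456789012", "us-west-2", "example-training", "example-tag"))
```

The resulting string is what would be passed to `docker pull` or referenced as the image in a job definition.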

Quick Start & Requirements

  • Primary Install/Run: Building custom images involves cloning the repository, setting AWS environment variables, logging into ECR, installing requirements (pip install -r src/requirements.txt), and running build scripts (python src/main.py ...).
  • Prerequisites: An AWS account with appropriate IAM permissions (e.g., AmazonEC2ContainerRegistryFullAccess, AmazonSageMakerFullAccess), Docker client, and Python 3.
  • Setup Time: Initial image builds can be time-consuming due to downloading base layers; subsequent builds are faster.
  • Documentation: The repository links to the list of available images and to SageMaker integration details.
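
The build steps listed above can be sketched as a small planner that only prints the commands and runs nothing. This is an illustration under stated assumptions, not the project's own tooling: the clone URL is inferred from the project name, the ECR login uses the AWS CLI (which the prerequisites above do not list explicitly), and the arguments to `src/main.py` are left to the caller because this summary elides them. The environment variable names are not given here either, so the account ID and region are passed as plain parameters. `missing_prerequisites` and `build_plan` are hypothetical helper names.

```python
import shutil
from typing import Callable, List, Optional

# Assumption: inferred from the project name "deep-learning-containers by aws".
REPO_URL = "https://github.com/aws/deep-learning-containers.git"


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the command-line tools this sketch needs that are not on PATH."""
    return [tool for tool in ("git", "docker", "aws") if which(tool) is None]


def build_plan(account_id: str, region: str, build_args: List[str]) -> List[str]:
    """Return the quick-start steps as shell command strings (nothing is executed)."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return [
        f"git clone {REPO_URL}",
        "cd deep-learning-containers",
        # Log Docker in to the account's ECR registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        "pip install -r src/requirements.txt",
        # build_args stands in for the arguments this summary elides.
        " ".join(["python", "src/main.py", *build_args]),
    ]


# Hypothetical account ID and region, for illustration only.
for step in build_plan("123456789012", "us-west-2", build_args=[]):
    print(step)
```

Keeping the plan as plain strings makes it easy to review before pasting the steps into a shell, and `missing_prerequisites()` can be called first to catch a missing Docker client or AWS CLI early.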

Highlighted Details

  • Optimized environments featuring TensorFlow, PyTorch, MXNet, CUDA, and Intel MKL.
  • Seamless integration with Amazon SageMaker for training, inference, and batch transforms.
  • Tested compatibility across Amazon EC2, ECS, and EKS.
  • Extensive local testing framework using pytest for various AWS deployment scenarios (EC2, ECS, EKS, SageMaker local/remote).
  • Supports customization of Dockerfiles, adding artifacts to the build context, and incorporating custom packages.

Licensing & Compatibility

The core project is licensed under the Apache-2.0 License. However, specific components like smdistributed.dataparallel and smdistributed.modelparallel are released under the AWS Customer Agreement. The Apache-2.0 license is generally permissive for commercial use.

Limitations & Caveats

Amazon SageMaker does not support tensorflow_inference py2 images. First-time image builds must download base layers, which can be slow. Setting up the environment requires broad AWS IAM permissions, such as the full-access policies listed under Prerequisites.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 28
  • Issues (30d): 4
  • Star History: 10 stars in the last 30 days
