efficient-dl-systems  by mryab

Course materials for efficient deep learning systems

created 3 years ago
869 stars

Top 42.2% on sourcepulse

Project Summary

This repository provides course materials for "Efficient Deep Learning Systems," aimed at students and practitioners who want to optimize deep learning workflows. Topics range from GPU architecture and CUDA through distributed training to LLM inference and deployment, with practical code examples for improving performance and efficiency.

How It Works

The course material is structured around weekly lectures and seminars, delving into core concepts and practical applications. It emphasizes hands-on experience with tools like PyTorch, DVC, Weights & Biases, and Triton, demonstrating techniques such as mixed-precision training, data parallelism, gradient checkpointing, and advanced inference optimizations like KV caching and speculative decoding.
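To make one of the inference optimizations named above concrete, here is a minimal sketch of KV caching for a single attention head, written in plain Python so it runs without PyTorch. It is not taken from the course materials: the helper names (`project`, `attend`, `KVCache`, `decode_with_cache`, `decode_without_cache`) and the toy weight matrices are assumptions made for illustration only.

```python
import math


def project(x, matrix):
    """Multiply vector x by a matrix stored as a list of rows."""
    return [sum(xi * row[j] for xi, row in zip(x, matrix))
            for j in range(len(matrix[0]))]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over all cached positions."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Hypothetical helper: stores keys/values of all previous positions."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, x, w_q, w_k, w_v):
        # Only the newest token is projected; older entries are reused.
        self.keys.append(project(x, w_k))
        self.values.append(project(x, w_v))
        return attend(project(x, w_q), self.keys, self.values)


def decode_without_cache(tokens, w_q, w_k, w_v):
    """Baseline: recompute keys and values for the whole prefix each step."""
    outputs = []
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        keys = [project(x, w_k) for x in prefix]
        values = [project(x, w_v) for x in prefix]
        outputs.append(attend(project(prefix[-1], w_q), keys, values))
    return outputs


def decode_with_cache(tokens, w_q, w_k, w_v):
    """Same outputs as the baseline, with one K/V projection per token."""
    cache = KVCache()
    return [cache.step(x, w_q, w_k, w_v) for x in tokens]
```

Both functions return the same attention outputs. For T tokens the baseline performs T(T+1)/2 key projections, while the cached version performs T, at the cost of keeping every key and value in memory.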

Quick Start & Requirements

  • Installation: Clone the repository and set up a Python environment. Instructions specific to the seminar code are provided within each week's materials.
  • Prerequisites: Python 3.x and PyTorch. Some exercises may also need libraries such as DVC and Weights & Biases, plus a CUDA-enabled GPU.
  • Resources: Requires a development environment with Python and standard data science libraries. GPU acceleration is highly recommended for many seminar exercises.
  • Links: Past versions are available for historical context.
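
Before starting the seminars, the prerequisites listed above can be checked with a short script. This is a sketch under assumptions rather than an official setup step: the function names `check_environment` and `cuda_available` are hypothetical, and the import names `torch`, `dvc`, and `wandb` are the usual package names for PyTorch, DVC, and Weights & Biases, not a list taken from the repository.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "dvc", "wandb")):
    """Report whether Python 3 is running and which packages are importable.

    The default package tuple is an assumption based on the tools named
    in the summary, not a requirements list from the repository.
    """
    report = {"python": sys.version_info >= (3, 0)}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the script also works without PyTorch
    return torch.cuda.is_available()
```

Calling `check_environment()` returns a dictionary of booleans keyed by package name, and `cuda_available()` indicates whether the GPU-dependent exercises are likely to run on the current machine.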

Highlighted Details

  • Covers a broad spectrum of efficiency techniques, from low-level CUDA operations to high-level distributed training strategies.
  • Includes practical seminars on experiment tracking, model versioning, testing, and profiling tools.
  • Features in-depth modules on LLM inference optimizations and efficient model deployment.
  • Addresses both training and inference efficiency, providing a holistic view of DL system optimization.

Maintenance & Community

The repository is associated with HSE University and Yandex School of Data Analysis, with contributions from multiple staff members. The 2025 branch indicates ongoing development and updates.

Licensing & Compatibility

The repository content is typically licensed under permissive terms, but specific licensing for code snippets or datasets should be verified within the respective directories. Compatibility is generally with standard Python environments and deep learning frameworks.

Limitations & Caveats

The materials are designed for a structured course, and self-study might require additional context or instructor guidance. Specific seminar code may have evolving dependencies or require specific hardware configurations (e.g., GPUs) for optimal execution.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 56 stars in the last 90 days
