efficient-dl-systems by mryab

Course materials for efficient deep learning systems

Created 4 years ago

933 stars

Top 39.2% on SourcePulse

2 Experts Love This Project

Vincent Weisser

Cofounder of Prime Intellect

Alexander Borzunov

Research Scientist at OpenAI

Project Summary

This repository provides comprehensive course materials for "Efficient Deep Learning Systems," targeting students and practitioners interested in optimizing deep learning workflows. It covers essential topics from GPU architecture and CUDA to distributed training, LLM inference, and deployment, offering practical insights and code examples for enhancing performance and efficiency.

How It Works

The course material is structured around weekly lectures and seminars, delving into core concepts and practical applications. It emphasizes hands-on experience with tools like PyTorch, DVC, Weights & Biases, and Triton, demonstrating techniques such as mixed-precision training, data parallelism, gradient checkpointing, and advanced inference optimizations like KV caching and speculative decoding.
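As an illustration of one technique named above, the following is a minimal sketch of KV caching in autoregressive decoding. It is not taken from the course materials: `ToyProjection`, `decode_naive`, and `decode_cached` are hypothetical names, the "projection" is a stand-in for a real layer's key/value projections, and each token's embedding doubles as its query to keep the example short. The point it shows is that caching keys and values yields the same attention outputs while projecting each token only once.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


class ToyProjection:
    """Hypothetical stand-in for key/value projections; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, token):
        self.calls += 1
        key = [2.0 * x for x in token]
        value = [x + 1.0 for x in token]
        return key, value


def decode_naive(tokens, project):
    """Re-projects the whole prefix at every decoding step."""
    outputs = []
    for t in range(len(tokens)):
        pairs = [project(tok) for tok in tokens[: t + 1]]
        keys = [k for k, _ in pairs]
        values = [v for _, v in pairs]
        outputs.append(attend(tokens[t], keys, values))
    return outputs


def decode_cached(tokens, project):
    """Projects only the newest token and reuses cached keys and values."""
    keys, values, outputs = [], [], []
    for tok in tokens:
        k, v = project(tok)
        keys.append(k)
        values.append(v)
        outputs.append(attend(tok, keys, values))
    return outputs


tokens = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
naive_proj, cached_proj = ToyProjection(), ToyProjection()
naive_out = decode_naive(tokens, naive_proj)
cached_out = decode_cached(tokens, cached_proj)
print(naive_proj.calls, cached_proj.calls)  # prints "10 4"
```

For a sequence of n tokens, the naive loop runs the projection n(n+1)/2 times while the cached loop runs it n times; the price is the memory needed to keep the keys and values around.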

Quick Start & Requirements

Installation: Primarily involves cloning the repository and setting up a Python environment. Specific instructions for seminar code will be provided within each week's materials.
Prerequisites: Python 3.x and PyTorch, potentially other libraries such as DVC and Weights & Biases, and a CUDA-enabled GPU for practical exercises.
Resources: Requires a development environment with Python and standard data science libraries. GPU acceleration is highly recommended for many seminar exercises.
Links: Past versions are available for historical context.
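One way to check the prerequisites listed above before starting a seminar is a small script like the sketch below. It is an assumption-laden helper, not part of the repository: `check_environment` is a hypothetical name, and the import names `torch`, `dvc`, and `wandb` are assumed to correspond to PyTorch, DVC, and Weights & Biases. It only checks whether the packages can be found, without importing them.

```python
import importlib.util
import sys

# Assumed import names for the tools mentioned in the prerequisites.
PACKAGES = {"PyTorch": "torch", "DVC": "dvc", "Weights & Biases": "wandb"}


def check_environment(packages=PACKAGES):
    """Return a dict mapping each requirement to True if it appears available."""
    report = {"Python 3": sys.version_info.major == 3}
    for label, module in packages.items():
        # find_spec returns None when a top-level module is not installed.
        report[label] = importlib.util.find_spec(module) is not None
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

This does not detect a GPU. Once PyTorch is installed, `torch.cuda.is_available()` reports whether a CUDA device is visible.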

Highlighted Details

Covers a broad spectrum of efficiency techniques, from low-level CUDA operations to high-level distributed training strategies.
Includes practical seminars on experiment tracking, model versioning, testing, and profiling tools.
Features in-depth modules on LLM inference optimizations and efficient model deployment.
Addresses both training and inference efficiency, providing a holistic view of DL system optimization.

Maintenance & Community

The repository is associated with HSE University and Yandex School of Data Analysis, with contributions from multiple staff members. The 2025 branch indicates ongoing development and updates.

Licensing & Compatibility

The repository content is typically licensed under permissive terms, but specific licensing for code snippets or datasets should be verified within the respective directories. Compatibility is generally with standard Python environments and deep learning frameworks.

Limitations & Caveats

The materials are designed for a structured course, and self-study might require additional context or instructor guidance. Specific seminar code may have evolving dependencies or require specific hardware configurations (e.g., GPUs) for optimal execution.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive

Star History

10 stars in the last 30 days