ray-educational-materials by ray-project

Educational materials for scaling Python and ML workloads with Ray

Created 3 years ago

452 stars

Top 66.6% on SourcePulse

2 Experts Love This Project

Robert Nishihara

Cofounder of Anyscale; Author of Ray

Philipp Moritz

Cofounder of Anyscale

Project Summary

This repository provides hands-on educational materials for learning and applying the Ray distributed computing framework to scale Python and machine learning workloads. It targets developers and researchers looking to efficiently handle tasks like computer vision, NLP, and time-series forecasting on distributed systems.

How It Works

The materials are structured into modules that start with core Ray concepts: remote functions (tasks), remote objects, and stateful actors. They then progress to practical applications such as scaling batch inference and model training, with specific examples for computer vision and LLMs. Throughout, the emphasis is on hands-on implementation and on understanding Ray's distributed primitives for building scalable ML applications.
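
As a rough illustration of the three core primitives named above, the following minimal sketch wraps a plain function and a plain class with `ray.remote`. It is not taken from the repository's notebooks; the names `square`, `total`, `Counter`, and `run_demo` are hypothetical, and the Ray import is placed inside `run_demo` (an assumption made for this sketch) so that the plain Python pieces can be read and tried without Ray installed.

```python
from typing import Dict, List


def square(x: int) -> int:
    """Plain function; becomes a Ray task once wrapped with ray.remote."""
    return x * x


def total(values: List[int]) -> int:
    """Plain function that receives the contents of a remote object."""
    return sum(values)


class Counter:
    """Plain class; becomes a stateful Ray actor once wrapped with ray.remote."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self, amount: int = 1) -> int:
        self.count += amount
        return self.count


def run_demo() -> Dict[str, object]:
    """Run the three primitives on a local Ray instance (hypothetical helper)."""
    import ray  # third-party dependency; assumed to be installed

    ray.init(ignore_reinit_error=True)
    try:
        # Remote functions (tasks): each .remote() call returns an object reference.
        remote_square = ray.remote(square)
        squares = ray.get([remote_square.remote(i) for i in range(4)])

        # Remote objects: ray.put stores a value once; tasks receive it by reference.
        data_ref = ray.put([1, 2, 3])
        remote_total = ray.remote(total)
        summed = ray.get(remote_total.remote(data_ref))

        # Stateful actors: method calls run on one worker that keeps its state.
        remote_counter_cls = ray.remote(Counter)
        counter = remote_counter_cls.remote()
        counter.increment.remote()
        final_count = ray.get(counter.increment.remote(2))

        return {"squares": squares, "summed": summed, "final_count": final_count}
    finally:
        ray.shutdown()
```

On a machine with Ray installed, calling `run_demo()` starts a local Ray instance, runs the tasks and the actor, and returns the collected results. In notebook-style code the same wrapping is more commonly written with the `@ray.remote` decorator.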

Quick Start & Requirements

Installation: Follow instructions within individual module notebooks.
Prerequisites: Python, Ray library. Specific modules may require additional ML libraries (e.g., Hugging Face Transformers, PyTorch).
Resources: Requires a local machine or cluster environment where Ray can be installed and executed.
Links: Ray Documentation, Official Ray Site
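
Because installation is handled per notebook, it can help to confirm which prerequisites are already importable before opening a module. The sketch below uses only the Python standard library; the helper name `check_packages` is hypothetical, and the import names `ray`, `torch`, and `transformers` are assumed to correspond to the libraries listed above.

```python
import importlib.util
from typing import Dict, Iterable


def check_packages(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level package name to whether it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def missing_packages(names: Iterable[str]) -> list:
    """Return the package names that are not importable in this environment."""
    return [name for name, found in check_packages(names).items() if not found]
```

For example, `missing_packages(["ray", "torch", "transformers"])` returns the subset of those names that still needs to be installed in the current environment.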

Highlighted Details

Comprehensive coverage from Ray Core fundamentals to advanced use cases.
Includes specific examples for scaling CV, NLP (LLMs), and time-series forecasting.
Modules on LLM fine-tuning, distributed hyperparameter tuning, and serving with Ray Serve.
Covers Ray observability features like the Ray State API and Dashboard.
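
To give a sense of what scaling batch inference with Ray can look like, here is a minimal sketch that splits inputs into batches and distributes them over a small pool of actors using `ray.util.ActorPool`. It is an assumption-laden illustration, not the approach used in the repository's modules: `make_batches`, `Predictor`, and `run_batch_inference` are hypothetical names, and the "model" is a stand-in that returns the length of each input string.

```python
from typing import List, Sequence


def make_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


class Predictor:
    """Stand-in for a real model; a real actor would load weights in __init__."""

    def predict(self, batch: List[str]) -> List[int]:
        # Placeholder "inference": the length of each input string.
        return [len(text) for text in batch]


def run_batch_inference(
    items: Sequence[str], batch_size: int = 2, num_workers: int = 2
) -> List[int]:
    """Distribute batches across actors and flatten the ordered results."""
    import ray  # third-party dependency; assumed to be installed
    from ray.util import ActorPool

    ray.init(ignore_reinit_error=True)
    try:
        remote_predictor_cls = ray.remote(Predictor)
        pool = ActorPool([remote_predictor_cls.remote() for _ in range(num_workers)])
        # ActorPool.map yields results in the same order as the input batches.
        results = pool.map(
            lambda actor, batch: actor.predict.remote(batch),
            make_batches(items, batch_size),
        )
        return [prediction for batch_result in results for prediction in batch_result]
    finally:
        ray.shutdown()
```

Keeping the model inside an actor means it is constructed once per worker rather than once per batch, which is the main reason to prefer actors over stateless tasks for this pattern.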

Maintenance & Community

Developed by Anyscale Inc.
Active community engagement through Slack, discussion boards, and meetups.
Contributions welcomed via GitHub issues for feature requests and bug reports.
Links: Ray Community Slack, GitHub Issues

Licensing & Compatibility

License: Apache License 2.0.
Compatibility: Permissive license allows for commercial use and integration with closed-source applications.

Limitations & Caveats

The materials are focused on demonstrating Ray's capabilities and assume a foundational understanding of Python and machine learning concepts. Some advanced modules might require significant computational resources for practical execution.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

6 stars in the last 30 days