This repository provides open-source code and resources for the O'Reilly book "Practical Machine Learning for Computer Vision." It offers practical, end-to-end examples for image classification, detection, segmentation, and generation using Google Cloud's Vertex AI, targeting engineers and researchers looking to apply ML to visual data.
How It Works
The project guides users through building and deploying machine learning models for image understanding. It leverages Google Cloud Vertex AI for managed notebook environments, model training (including transfer learning with EfficientNet), deployment as REST services, and MLOps pipelines using Kubeflow. The emphasis is on an end-to-end workflow, from data preparation through training to invoking predictions against a deployed endpoint.
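The transfer-learning step described above can be sketched in a few lines of Keras. This is a minimal illustration, assuming TensorFlow 2.x; the class count, input size, and hyperparameters are placeholders, and the book's notebooks may structure the model differently:

```python
import tensorflow as tf  # assumes TensorFlow 2.x (Keras bundled)

def build_transfer_model(num_classes: int, weights: str = "imagenet") -> tf.keras.Model:
    """Frozen EfficientNetB0 feature extractor with a fresh softmax head."""
    base = tf.keras.applications.EfficientNetB0(
        include_top=False,        # drop the 1000-class ImageNet head
        weights=weights,
        pooling="avg",            # global average pooling -> 1280-d feature vector
        input_shape=(224, 224, 3),
    )
    base.trainable = False        # transfer learning: train only the new head
    head = tf.keras.layers.Dense(num_classes, activation="softmax")(base.output)
    model = tf.keras.Model(inputs=base.input, outputs=head)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With the base frozen, only the Dense head's weights are trained, so a few epochs on a modest dataset suffice; a common follow-up is unfreezing the top blocks for fine-tuning at a lower learning rate.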
Quick Start & Requirements
- Install/Run: Clone the repository and run the provided Jupyter notebooks within a Google Cloud Vertex AI Workbench instance.
- Prerequisites: Google Cloud Project, Vertex AI Workbench instance (GPU recommended, e.g., Tesla T4), Cloud Storage bucket, Kubeflow Pipelines on GKE for MLOps.
- Setup: Initial setup involves creating a Vertex AI Workbench instance (approx. 10 mins), cloning the repo, and, for the MLOps notebooks, setting up Kubeflow Pipelines (approx. 5 mins).
- Links: Full Tour, Quick Tour
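The instance-creation step can also be scripted. The snippet below assembles a plausible `gcloud` invocation as a Python command list; the instance name, zone, machine type, and accelerator flags are all assumptions, so check `gcloud workbench instances create --help` for the flags your gcloud version actually supports:

```python
import shlex

def workbench_create_cmd(name: str = "mlvision-book",
                         location: str = "us-central1-a") -> str:
    """Assemble a `gcloud workbench instances create` call as a shell string.

    Flag names and values are illustrative assumptions; verify against
    your installed gcloud CLI before running.
    """
    args = [
        "gcloud", "workbench", "instances", "create", name,
        f"--location={location}",
        "--machine-type=n1-standard-8",
        "--accelerator-type=NVIDIA_TESLA_T4",   # GPU recommended by the repo
        "--accelerator-core-count=1",
    ]
    return " ".join(shlex.quote(a) for a in args)
```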
Highlighted Details
- End-to-end workflow covering data preparation (TFRecords), transfer learning, model deployment, and MLOps pipelines.
- Demonstrates image classification, object detection, segmentation, and generation tasks.
- Utilizes Google Cloud services like Vertex AI, Dataflow, and Kubeflow Pipelines.
- Includes specific guidance for using TPUs for certain notebooks.
Maintenance & Community
- Maintained by Google Cloud Platform.
- Feedback is encouraged via GitHub Issues.
Licensing & Compatibility
- The README does not explicitly state a license for the code. Google Cloud sample repositories are commonly Apache 2.0, but verify before reuse. The book's text itself is copyrighted.
Limitations & Caveats
- Some notebooks may require significant computational resources (TPUs or multiple GPUs) or large datasets (12GB for RetinaNet).
- Specific notebooks are noted as broken (e.g., 6h TF Transform) or have version dependencies (e.g., TensorFlow 2.7+ for Unet segmentation).
- Out-of-memory errors are common unless unused notebook kernels are shut down to release GPU memory between notebooks.