energy-forecasting by iusztinpaul

MLOps course for building production-ready ML batch systems

Created 2 years ago
940 stars

Top 38.9% on SourcePulse

Project Summary

This repository provides a free, 7-step MLOps course on building and deploying an end-to-end ML batch system for energy consumption forecasting. It targets intermediate-to-advanced machine learning engineers, as well as software engineers moving into ML engineering, and offers hands-on experience with production-ready ML systems.

How It Works

The course guides users through designing, building, training, serving, and monitoring a batch ML system. It emphasizes MLOps best practices, integrating tools like Hopsworks (feature store), Weights & Biases (experiment tracking), Docker, Airflow (orchestration), and GitHub Actions (CI/CD). The approach is modular, covering feature engineering, training pipelines with hyperparameter tuning, batch prediction, data validation with Great Expectations, and deployment to GCP.
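As a rough illustration of three of the stages named above (feature engineering, training with hyperparameter tuning, and batch prediction), here is a minimal standard-library sketch. It is not the course's code: the function names, the toy "average of lagged values" model, and the synthetic `hourly_load` series are all assumptions made for this example, and it deliberately leaves out the feature store, experiment tracker, and orchestrator.

```python
from statistics import mean


def make_lag_features(series, lags):
    """Feature step: pair each value with the values observed `lag` steps earlier."""
    start = max(lags)
    return [
        ([series[t - lag] for lag in lags], series[t])
        for t in range(start, len(series))
    ]


def predict_one(features):
    """Toy model (an assumption for this sketch): average of the lagged values."""
    return mean(features)


def mean_absolute_error(rows):
    """Average absolute gap between the toy prediction and the observed value."""
    return mean(abs(predict_one(x) - y) for x, y in rows)


def tune_lags(train, candidates):
    """Tuning step: grid-search the lag set with the lowest error on `train`."""
    scored = [
        (mean_absolute_error(make_lag_features(train, lags)), lags)
        for lags in candidates
    ]
    return min(scored)[1]


def batch_predict(history, lags, horizon):
    """Batch step: roll the forecast forward `horizon` steps in one run."""
    extended = list(history)
    for _ in range(horizon):
        extended.append(predict_one([extended[-lag] for lag in lags]))
    return extended[len(history):]


# Synthetic series with a period of 4, standing in for hourly consumption.
hourly_load = [10, 20, 30, 20] * 6
best_lags = tune_lags(hourly_load, [(1,), (4,), (1, 2)])
forecast = batch_predict(hourly_load, best_lags, horizon=4)
```

In a production batch system each of these functions would typically be its own scheduled pipeline with persisted inputs and outputs, which is the kind of separation the course's modular structure describes.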

Quick Start & Requirements

  • Install: Primarily uses Docker Compose for setup.
  • Prerequisites: Python 3.9 (Poetry 1.4.2 recommended), Docker, GCP account, Hopsworks account, Weights & Biases account.
  • Setup: Requires configuring .env files with API keys and credentials for various services. Estimated GCP deployment cost is ~$20.
  • Resources: 2.5 hours of reading and video material on Medium.
  • Links: Medium Lessons
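Since setup depends on `.env` files holding credentials for several services, a small pre-flight check can catch a missing key before any container starts. The sketch below is one possible way to do that with the standard library only. The key names (`FEATURE_STORE_API_KEY`, `EXPERIMENT_TRACKER_API_KEY`, `CLOUD_PROJECT_ID`) are placeholders invented for this example, not the variable names the repository actually expects.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not values.get(key)]


# Hypothetical file content; real key names come from the repository's docs.
sample_env = """
# credentials (placeholder values)
FEATURE_STORE_API_KEY="abc123"
EXPERIMENT_TRACKER_API_KEY=
CLOUD_PROJECT_ID=demo-project
"""

REQUIRED = ["FEATURE_STORE_API_KEY", "EXPERIMENT_TRACKER_API_KEY", "CLOUD_PROJECT_ID"]
parsed = parse_env_text(sample_env)
missing = missing_keys(parsed, REQUIRED)
```

Here `missing` flags the key that was left empty, so a setup script could stop early with a clear message instead of failing later inside a pipeline.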

Highlighted Details

  • End-to-end MLOps system design and implementation.
  • Integration of feature store, experiment tracking, model registry, and orchestration.
  • Deployment to Google Cloud Platform (GCP) with CI/CD via GitHub Actions.
  • Data validation using Great Expectations and web app visualization with FastAPI/Streamlit.
  • Use of Poetry for package management and a private PyPI server.
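To make the data validation item above more concrete, the following sketch shows the general idea of checking a batch of records before it enters a pipeline. It is written in plain Python and does not use the Great Expectations API. The column names and the two rules (required fields present, consumption non-negative) are assumptions chosen for illustration, not the course's actual expectation suite.

```python
def validate_records(records, required_columns, non_negative_column):
    """Return a list of failure messages; an empty list means the batch passes."""
    failures = []
    for index, row in enumerate(records):
        # Rule 1: every required column must be present and not None.
        missing = [col for col in required_columns if row.get(col) is None]
        if missing:
            failures.append(f"row {index}: missing {missing}")
            continue
        # Rule 2: the measured value must be a number that is zero or greater.
        value = row.get(non_negative_column)
        if not isinstance(value, (int, float)) or value < 0:
            failures.append(
                f"row {index}: {non_negative_column} must be a non-negative number"
            )
    return failures


# Hypothetical batch; the column names are placeholders for this example.
batch = [
    {"datetime_utc": "2023-01-01T00:00", "area": 1, "energy_consumption": 42.5},
    {"datetime_utc": "2023-01-01T01:00", "area": 1, "energy_consumption": -3.0},
    {"datetime_utc": None, "area": 2, "energy_consumption": 10.0},
]
problems = validate_records(
    batch,
    required_columns=["datetime_utc", "area", "energy_consumption"],
    non_negative_column="energy_consumption",
)
```

A dedicated validation library adds reporting, reusable rule suites, and integration with the feature store on top of this basic pattern.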

Maintenance & Community

The project is maintained by the author, with opportunities for community contributions via GitHub Issues or Pull Requests. Direct contact is available via LinkedIn.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License permits commercial use and inclusion in closed-source software, provided the copyright notice and license text are retained in copies of the code.

Limitations & Caveats

The underlying energy data API is becoming obsolete, but a static dataset covering 2020-2023 is provided. The course material is hosted on Medium, which may put it behind a paywall. macOS M1/M2 users may encounter Poetry environment issues; a script is provided to mitigate them.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 4 stars in the last 30 days
