mlops-for-devops by techiescamp

Hands-on MLOps guide for DevOps engineers

Created 1 month ago
262 stars

Top 97.0% on SourcePulse

View on GitHub
Project Summary

MLOps for DevOps Engineers provides a hands-on, project-based guide to Machine Learning Operations tailored for DevOps, Platform, and SRE engineers, requiring no prior ML background. Concepts are explained via familiar DevOps analogies, enabling effective operation of ML workloads in production by bridging the gap between ML and traditional infrastructure practices.

How It Works

This project flips the typical MLOps resource by focusing on infrastructure and operations for ML, not ML theory. It uses a project-based approach with a real-world employee attrition prediction use case to illustrate concepts. All components run on Kubernetes and Docker, leveraging familiar DevOps tooling. The core approach emphasizes building ML foundations locally, then transitioning to production-grade orchestration, model serving, and monitoring.

Quick Start & Requirements

Prerequisites include intermediate proficiency in the Linux CLI, Docker, Kubernetes, and Git, plus basic-to-intermediate AWS and basic Python (reading and running scripts) skills. No ML expertise is required, as the material teaches those concepts. The project is structured into phases and steps with detailed guides; setup assumes a Kubernetes/Docker environment.

Highlighted Details

  • The project covers three main tracks: Traditional ML (training, serving, automating, monitoring models on Kubernetes), Foundational Models (serving LLMs using vLLM, TGI, Ollama), and LLM-Powered DevOps (Kubernetes monitoring, RAG pipelines, agents).
  • Phase 1 focuses on local ML development and data pipelines, building a complete ML foundation from raw data to a trained, tested model.
  • Phase 2 addresses enterprise orchestration, aiming to replace manual workflows with production-grade systems for data versioning (DVC, S3), automated pipelines (Airflow on Kubernetes), and experiment tracking (MLflow).
  • The tech stack spans Python (Pandas, scikit-learn, XGBoost), FastAPI, KServe, MLflow, Kubeflow Pipelines, Prometheus, Grafana, Evidently AI, Kubernetes, Helm, GitHub Actions, and LLM serving tools like vLLM, TGI, and Ollama.
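To make the "local ML foundation" idea in Phase 1 concrete, here is a minimal sketch of training and evaluating an attrition classifier with the repo's stack (Pandas, scikit-learn). The column names and data are illustrative stand-ins, not taken from the project's actual dataset.

```python
# Minimal sketch: train and evaluate an attrition classifier locally,
# before moving to orchestrated pipelines (Airflow, MLflow) in Phase 2.
# All feature names and values below are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the employee attrition dataset
df = pd.DataFrame({
    "tenure_years": [1, 5, 2, 8, 1, 10, 3, 7, 2, 6],
    "satisfaction": [2, 4, 3, 5, 1, 5, 2, 4, 1, 5],
    "attrition":    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

X_train, X_test, y_train, y_test = train_test_split(
    df[["tenure_years", "satisfaction"]], df["attrition"],
    test_size=0.3, random_state=42, stratify=df["attrition"],
)

model = LogisticRegression().fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"holdout accuracy: {acc:.2f}")
```

In the guide's workflow, a model like this would next be wrapped in a FastAPI service and served on Kubernetes (e.g., via KServe); the sketch above covers only the local training step.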

Maintenance & Community

No specific details on active contributors, sponsorships, or community channels (e.g., Discord/Slack) are provided in the README.

Licensing & Compatibility

The project employs a dual licensing model: Apache 2.0 for code (scripts, configs, manifests) and All Rights Reserved for content (README, guides, docs). Commercial use of content requires contacting contact@devopscube.com.

Limitations & Caveats

Several key tracks and phases are marked as 'In Progress' (🔄) or 'Planned' (🔜), including Enterprise Orchestration, Monitoring & Observation, Foundational Models, LLM Serving & Scaling, and LLM-Powered DevOps, indicating ongoing development. The 'All Rights Reserved' content license may impose restrictions on commercial redistribution or use of documentation.

Health Check
Last Commit

3 days ago

Responsiveness

Inactive

Pull Requests (30d)
1
Issues (30d)
0
Star History
39 stars in the last 30 days

Explore Similar Projects

Starred by Chris Lattner (author of LLVM, Clang, Swift, Mojo, MLIR; cofounder of Modular), Tobi Lutke (cofounder of Shopify), and 13 more.

modular by modular

0.1%
26k
AI toolchain unifying fragmented AI deployment workflows
Created 2 years ago
Updated 1 day ago