fraudfinder by GoogleCloudPlatform

End-to-end MLOps for real-time fraud detection

Created 3 years ago
250 stars

Top 100.0% on SourcePulse

Project Summary

This repository provides a comprehensive lab series, "Fraudfinder," designed to guide users through building a real-time fraud detection system on Google Cloud. It targets engineers and researchers interested in end-to-end MLOps pipelines, offering a practical, hands-on approach to a critical use case. The primary benefit is a structured learning experience covering the entire Data to AI lifecycle within the Google Cloud ecosystem.

How It Works

Fraudfinder uses a series of Jupyter notebooks to demonstrate an end-to-end architecture. The process begins with ingesting historical payment transactions from a data warehouse and live transactions streamed via Pub/Sub. It then covers exploratory data analysis (EDA), batch and streaming feature engineering, and feature ingestion into a feature store. Models are trained on these features, registered in a model registry, evaluated, and deployed to an endpoint for real-time inference. The series concludes with model monitoring, completing the MLOps workflow. Each stage runs on a Google Cloud service such as BigQuery, Pub/Sub, or Vertex AI.
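To make the streaming feature engineering step more concrete, here is a minimal sketch of one common kind of fraud feature: a per-customer transaction count and average amount over a trailing time window. It is an illustration only, not code from the Fraudfinder notebooks. The `Transaction` fields, the `SlidingWindowFeatures` class, and the feature names are all hypothetical, and the sketch assumes events for each customer arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical event shape; the real schema may differ."""
    customer_id: str
    amount: float
    timestamp: float  # seconds since epoch


class SlidingWindowFeatures:
    """Per-customer count and mean amount over a trailing time window."""

    def __init__(self, window_seconds: float) -> None:
        if window_seconds < 0:
            raise ValueError("window_seconds must be non-negative")
        self.window_seconds = window_seconds
        # customer_id -> deque of (timestamp, amount), oldest first
        self._events = defaultdict(deque)

    def update(self, tx: Transaction) -> dict:
        """Add one transaction and return that customer's current features."""
        events = self._events[tx.customer_id]
        events.append((tx.timestamp, tx.amount))

        # Evict events older than the window that ends at this transaction.
        cutoff = tx.timestamp - self.window_seconds
        while events[0][0] < cutoff:
            events.popleft()

        total = sum(amount for _, amount in events)
        return {
            "customer_id": tx.customer_id,
            "tx_count": len(events),
            "tx_avg_amount": total / len(events),
        }
```

In a real pipeline the returned dictionary would be written to a feature store rather than returned to the caller, and a stream processor would handle late or out-of-order events, which this sketch ignores.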

Quick Start & Requirements

Setup involves creating a Google Cloud project and enabling the necessary APIs:

  • notebooks.googleapis.com
  • aiplatform.googleapis.com
  • pubsub.googleapis.com
  • run.googleapis.com
  • cloudbuild.googleapis.com
  • dataflow.googleapis.com
  • bigquery.googleapis.com
  • artifactregistry.googleapis.com
  • iam.googleapis.com
  • cloudresourcemanager.googleapis.com

Pub/Sub subscriptions are configured via Cloud Shell. A User-Managed Notebook instance on Vertex AI Workbench (Python 3) is required to run the provided notebooks. The repository is cloned into the JupyterLab environment. Specific IAM roles must be granted to default service accounts.
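As a rough illustration of the API-enabling step, the sketch below assembles a single `gcloud services enable` invocation covering the APIs named above. It only builds and prints the command; it does not run it. The helper name `build_enable_command` and the project ID `my-project-id` are placeholders introduced here, and the repository's own setup instructions may enable the APIs differently.

```python
import shlex

# The APIs named in the setup requirements.
REQUIRED_APIS = [
    "notebooks.googleapis.com",
    "aiplatform.googleapis.com",
    "pubsub.googleapis.com",
    "run.googleapis.com",
    "cloudbuild.googleapis.com",
    "dataflow.googleapis.com",
    "bigquery.googleapis.com",
    "artifactregistry.googleapis.com",
    "iam.googleapis.com",
    "cloudresourcemanager.googleapis.com",
]


def build_enable_command(project_id: str, apis=REQUIRED_APIS) -> list:
    """Return the argv list for enabling the given APIs in one project."""
    if not project_id:
        raise ValueError("project_id must be a non-empty string")
    return ["gcloud", "services", "enable", *apis, "--project", project_id]


if __name__ == "__main__":
    # "my-project-id" is a placeholder; substitute your own project ID.
    print(shlex.join(build_enable_command("my-project-id")))
```

The printed line can be pasted into Cloud Shell, or the argv list can be passed to `subprocess.run` on a machine where the gcloud CLI is installed and authenticated.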

Highlighted Details

  • Demonstrates an end-to-end Data to AI journey specifically for real-time fraud detection.
  • Covers the full MLOps lifecycle, from data ingestion to model monitoring.
  • Utilizes key Google Cloud services including Vertex AI, Pub/Sub, BigQuery, and Feature Store.
  • Includes both batch and streaming feature engineering pipelines.

Maintenance & Community

No specific details regarding maintenance, community channels (e.g., Discord, Slack), or roadmap are provided in the README.

Licensing & Compatibility

The README does not specify a software license. Compatibility for commercial use or integration with closed-source systems is not detailed.

Limitations & Caveats

The setup requires active use of Google Cloud services, which may incur costs. The README does not mention any specific technical limitations, alpha status, or known bugs.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
