ml_privacy_meter by privacytrustlab

Privacy auditing library for assessing data privacy risks in ML models

Created 5 years ago
675 stars

Top 50.0% on SourcePulse

View on GitHub
Project Summary

Privacy Meter is an open-source library designed to audit data privacy in statistical and machine learning algorithms, targeting researchers and practitioners in sensitive domains like healthcare and finance. It provides quantitative assessments of privacy risks using state-of-the-art membership inference attacks, helping organizations comply with data protection regulations like GDPR.
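
Privacy Meter's audits are built on membership inference. As a rough illustration of the core idea (this is a generic loss-threshold attack sketch, not the library's API), an attacker predicts "member" when a sample's loss is suspiciously low:

```python
# Minimal sketch of a loss-threshold membership inference attack.
# This illustrates the general technique only; it is NOT Privacy
# Meter's implementation or API.
import numpy as np

def loss_threshold_attack(member_losses, nonmember_losses, threshold):
    """Predict 'member' when a sample's loss falls below the threshold.

    Returns the true-positive rate (on real members) and false-positive
    rate (on non-members) at this threshold; sweeping the threshold
    traces out an ROC curve.
    """
    tpr = float(np.mean(np.asarray(member_losses) < threshold))
    fpr = float(np.mean(np.asarray(nonmember_losses) < threshold))
    return tpr, fpr

# Toy data: training members tend to have lower loss than non-members,
# which is exactly the signal membership inference exploits.
rng = np.random.default_rng(0)
member_losses = rng.normal(loc=0.5, scale=0.2, size=1000)
nonmember_losses = rng.normal(loc=1.0, scale=0.3, size=1000)
tpr, fpr = loss_threshold_attack(member_losses, nonmember_losses, 0.7)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```

The gap between TPR and FPR is a quantitative measure of leakage: a model that memorizes its training data is easy to attack, while a well-generalizing or DP-trained model pushes the two rates together.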

How It Works

Privacy Meter employs a configuration-driven approach, using YAML files to specify models, datasets, and privacy games. It supports multiple auditing methodologies, including membership inference, range membership inference, and dataset usage cardinality inference, to detect leakage of individual training points, of points in the vicinity of training data, and of the fraction of a dataset used in training. The library can also audit lower bounds on a model's differential privacy (DP) guarantees.
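
As a rough illustration of that configuration-driven flow, an audit config might look like the sketch below. Every key name here is hypothetical; consult the repository's sample configurations for the actual schema.

```yaml
# Hypothetical audit configuration -- field names are invented for
# illustration and do NOT reflect Privacy Meter's real schema.
run:
  random_seed: 1234
  output_dir: results/demo
data:
  dataset: cifar10
model:
  architecture: cnn
audit:
  game: membership_inference   # which privacy game to play
  num_reference_models: 4
```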

Quick Start & Requirements

  • Install via pip install -r requirements.txt or conda env create -f env.yaml.
  • Supports various datasets (CIFAR10, AG News, etc.) and models (CNN, MLP, GPT-2, etc.).
  • Integration with HuggingFace datasets and transformers is supported via custom dataset and model files.
  • Custom training scripts can be integrated; an example using a fast training library that reaches high accuracy quickly is included.
  • For auditing pre-trained models, a specific directory structure and models_metadata.json file are required.
  • Official documentation and sample configurations are available.
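
For the pre-trained-model workflow, the required metadata file presumably maps each saved model to its training details. A purely hypothetical sketch follows; the field names are invented, so refer to the official documentation for the real format of models_metadata.json.

```json
{
  "model_metadata": {
    "0": {
      "model_path": "models/model_0.pt",
      "dataset": "cifar10",
      "num_train_samples": 25000
    }
  }
}
```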

Highlighted Details

  • Audits a wide range of ML algorithms including classification, regression, computer vision, and NLP.
  • Implements advanced auditing strategies beyond basic membership inference.
  • Integrates a fast training library achieving state-of-the-art training speed and accuracy.
  • Audit results include detailed attack outcomes, ROC curves, and timing logs.

Maintenance & Community

  • Developed at NUS Data Privacy and Trustworthy Machine Learning Lab.
  • Welcomes community contributions.
  • Discussion channel available via Slack.
  • Key research papers underpinning the library are cited.

Licensing & Compatibility

  • The README does not explicitly state the license.

Limitations & Caveats

  • The library's license is not specified in the README, which may impact commercial use or closed-source integration.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 9 stars in the last 30 days
