interpretable_machine_learning_with_python by jphall663

Jupyter notebooks for interpretable ML model training, explanation, and debugging

created 7 years ago
681 stars

Top 50.8% on sourcepulse

Project Summary

This repository provides a comprehensive set of Jupyter notebooks demonstrating techniques for building, explaining, and debugging interpretable machine learning models. It targets data scientists and analysts seeking to enhance transparency, accountability, and trustworthiness in AI systems, offering practical examples for regulatory compliance and stakeholder communication.

How It Works

The notebooks showcase a range of methods, including monotonic XGBoost models, partial dependence (PDP) and individual conditional expectation (ICE) plots for model introspection, and Shapley explanations for generating reason codes. They also cover decision tree surrogates, disparate impact analysis for fairness, LIME for local explanations, and various sensitivity and residual analyses for model debugging and validation. This multi-faceted approach aims to demystify complex models, helping users understand model behavior and validate and improve accuracy, fairness, and security.
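
As a rough sketch of two of these techniques (not the repository's own code; the synthetic data, feature indices, and constraint signs are invented for illustration), a monotonically constrained XGBoost model can be trained and its Shapley values ranked into reason codes roughly as follows:

    import numpy as np
    import shap
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    # Synthetic target that rises with feature 0 and falls with feature 1.
    y = (X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=500) > 0).astype(int)

    # monotone_constraints: +1 forces a non-decreasing relationship between a
    # feature and the prediction, -1 non-increasing, 0 unconstrained.
    model = xgb.XGBClassifier(
        n_estimators=50,
        max_depth=3,
        monotone_constraints=(1, -1, 0),
    )
    model.fit(X, y)

    # Shapley explanations: each prediction decomposes into per-feature
    # contributions, which can be ranked into plain-language reason codes.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    row = 0
    for i in np.argsort(-np.abs(shap_values[row])):
        print(f"feature_{i}: contribution {shap_values[row][i]:+.3f}")

Constraints like these let a model's learned relationships match domain expectations (e.g., predicted risk should not decrease as delinquencies increase), which is often what regulators ask to see.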

Quick Start & Requirements

  • Recommended: H2O Aquarium (free educational environment) at https://aquarium.h2o.ai.
  • Virtualenv: Requires Git, Anaconda Python 5.1.0+, and Python 3.6; install dependencies with pip install -r requirements.txt.
  • Docker: Requires Docker. Build the image from the provided Dockerfile, then run it with docker run -i -t -p 8888:8888 iml:latest.
  • Manual: Requires Anaconda Python 5.1.0+, Java, the H2O Python package, Git, XGBoost, GraphViz, Seaborn, and shap, all available on the system path.

Highlighted Details

  • Demonstrates monotonic constraints in XGBoost for regulatory compliance.
  • Provides methods for generating "reason codes" from Shapley values.
  • Includes disparate impact analysis for fairness testing (see the sketch after this list).
  • Covers sensitivity and residual analysis for model debugging.
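
A minimal sketch of such a disparate impact check (the column names, toy data, and threshold below are illustrative, not the repository's exact implementation) computes the adverse impact ratio between a protected group and a reference group:

    import pandas as pd

    # Toy model outcomes: 1 = favorable decision (e.g., loan approved).
    scores = pd.DataFrame({
        "group":    ["a", "a", "a", "b", "b", "b", "b", "b"],
        "approved": [1,   0,   0,   1,   1,   0,   1,   1],
    })

    # Adverse impact ratio (AIR): the protected group's favorable-outcome
    # rate divided by the reference group's rate.
    rates = scores.groupby("group")["approved"].mean()
    air = rates["a"] / rates["b"]
    print(f"approval rates:\n{rates}\nadverse impact ratio: {air:.2f}")
    if air < 0.8:  # the common "four-fifths" rule of thumb
        print("potential disparate impact -- investigate further")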

Maintenance & Community

The repository is maintained by jphall663. A further-reading section links to several relevant academic papers and articles on responsible AI and interpretability.

Licensing & Compatibility

The repository does not explicitly state a license. Its dependencies, such as XGBoost, H2O, and shap, carry their own licenses, so commercial use should be verified against those licenses and against any restrictions not detailed in the README.

Limitations & Caveats

The README explicitly states that the notebooks and associated materials should not be taken as legal compliance advice. Some installation methods (Virtualenv, Docker, Manual) are marked as "Advanced." Anaconda Python 5.1.0 is specified, which is an older version.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 1 more.

lit by PAIR-code: Interactive ML model analysis tool for understanding model behavior. 4k stars; created 5 years ago, updated 5 days ago.