mli-resources by h2oai

MLI resources for practicing data scientists

created 7 years ago
489 stars

Top 64.0% on sourcepulse

Project Summary

This repository provides practical examples and resources for Machine Learning Interpretability (MLI), targeting data scientists who need to explain complex models to stakeholders or regulators. It offers hands-on demonstrations of techniques like LIME, LOCO, and partial dependence plots, aiming to bridge the gap between model accuracy and explainability.

How It Works

The project showcases MLI techniques through Jupyter notebooks that apply them with popular libraries such as H2O and XGBoost. It emphasizes practical implementation and ships a Dockerfile that provides an isolated, reproducible environment, which simplifies setup and dependency management.
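
To give a flavor of the kind of computation such a notebook performs, here is a minimal sketch of a LOCO-style local importance score. It is an illustration under stated assumptions, not the repository's code: `toy_model` is a hypothetical stand-in for a trained H2O or XGBoost model, `loco_style_importance` is a made-up helper name, and the feature is "left out" by replacing it with its column mean rather than by retraining the model, which is a simplification of LOCO as usually defined.

```python
from statistics import mean


def toy_model(row):
    """Hypothetical stand-in for a trained model's predict function."""
    x1, x2, x3 = row
    return 3.0 * x1 + 0.5 * x2 * x2  # x3 is deliberately ignored


def loco_style_importance(predict, data, row):
    """Score each feature of `row` by how much the prediction moves when
    that feature is neutralized (assumption: replaced by its column mean)."""
    baseline = predict(row)
    scores = []
    for j in range(len(row)):
        column_mean = mean(r[j] for r in data)
        perturbed = list(row)
        perturbed[j] = column_mean  # "leave out" feature j for this row
        scores.append(abs(baseline - predict(perturbed)))
    return scores


data = [
    [0.0, 1.0, 5.0],
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 1.0],
]
scores = loco_style_importance(toy_model, data, data[2])
print(scores)
```

For the last row, the first two features receive nonzero scores and the third, which the toy model ignores, scores zero. A retraining-based variant would refit the model without each feature instead of substituting a mean.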

Quick Start & Requirements

  • Installation: Via Docker (recommended) or manual installation.
  • Prerequisites: Docker, Anaconda Python 5.1.0, Java, H2O Python package, Git, GraphViz.
  • Setup: Dockerfile provided for a self-contained environment. Manual setup requires adding dependencies to the system path.
  • Resources: Links to notebooks, data acquisition instructions, and additional code examples are available.

Highlighted Details

  • Covers practical MLI techniques including Decision Tree Surrogates, LIME, LOCO, Partial Dependence, ICE, and Sensitivity Analysis.
  • Includes a dedicated section on testing explanation accuracy using simulated data.
  • Offers a wealth of supplementary materials: webinars, videos, booklets, conference presentations, and academic references.
  • Provides a Dockerfile for easy setup and reproducible execution of examples.
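
Two of the techniques listed above, partial dependence and ICE, can be sketched in a few lines. This is a minimal illustration on simulated data with a known signal, in the spirit of the repository's simulated-data tests but not taken from it. `toy_model`, `ice_curves`, and `partial_dependence` are hypothetical names, and the model is a hand-written function rather than a fitted H2O or XGBoost model.

```python
def toy_model(row):
    """Hypothetical model with an x1*x2 interaction, so ICE curves differ."""
    x1, x2 = row
    return 2.0 * x1 + x1 * x2


def ice_curves(predict, data, feature, grid):
    """One curve per row: predictions as `feature` sweeps over `grid`
    while the row's other values stay fixed."""
    curves = []
    for row in data:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, data, feature, grid):
    """Partial dependence is the pointwise average of the ICE curves."""
    curves = ice_curves(predict, data, feature, grid)
    n = len(curves)
    return [sum(curve[k] for curve in curves) / n for k in range(len(grid))]


data = [[0.0, -1.0], [1.0, 0.0], [2.0, 1.0]]  # simulated rows
grid = [0.0, 1.0, 2.0]                        # values to sweep for feature 0

ice = ice_curves(toy_model, data, 0, grid)
pd_curve = partial_dependence(toy_model, data, 0, grid)
print(ice)
print(pd_curve)
```

Here the three ICE curves have slopes 1, 2, and 3 because of the interaction with the second feature, while the partial dependence curve averages them to a slope of 2. Diverging ICE curves are a common sign that the average alone hides an interaction.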

Maintenance & Community

The repository is maintained by H2O.ai's Machine Learning Interpretability team. Contributions are welcomed via pull requests.

Licensing & Compatibility

Content may be reused with attribution to H2O.ai or the original author(s). No formal license is stated for the code or content beyond this usage permission.

Limitations & Caveats

Some examples, like the Diabetes dataset use case, may have separate repositories with their own Dockerfiles. O'Reilly Media interactive notebooks require a Safari membership.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
