mli-resources by h2oai

MLI resources for practicing data scientists

Created 8 years ago
489 stars

Top 63.1% on SourcePulse

Project Summary

This repository provides practical examples and resources for Machine Learning Interpretability (MLI), aimed at data scientists who need to explain complex models to stakeholders or regulators. It offers hands-on demonstrations of techniques such as LIME (local interpretable model-agnostic explanations), LOCO (leave-one-covariate-out), and partial dependence plots, with the goal of bridging the gap between model accuracy and explainability.
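To make the partial dependence idea concrete, here is a minimal sketch in plain Python. It is not taken from the repository's notebooks: `model_predict` is a hypothetical stand-in for a trained model, and the helper names `ice_curves` and `partial_dependence` are made up for illustration. The sketch sweeps one feature over a grid while holding the other inputs fixed, which gives one individual conditional expectation (ICE) curve per row; averaging those curves gives the partial dependence curve.

```python
from statistics import mean


def model_predict(row):
    """Hypothetical stand-in for a trained model: nonlinear in x0, linear in x1."""
    x0, x1 = row
    return x0 ** 2 + 2.0 * x1


def ice_curves(predict, rows, feature, grid):
    """Return one ICE curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)       # copy so the original row is untouched
            modified[feature] = value  # overwrite only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, rows, feature, grid):
    """Partial dependence is the pointwise average of the ICE curves."""
    curves = ice_curves(predict, rows, feature, grid)
    return [mean(column) for column in zip(*curves)]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]  # toy data, assumed for illustration
grid = [0.0, 1.0, 2.0]

ice = ice_curves(model_predict, rows, 0, grid)
pd_curve = partial_dependence(model_predict, rows, 0, grid)
print(pd_curve)
```

In this toy setup every ICE curve has the same shape because the stand-in model has no interaction between the two features. With a real model, ICE curves that diverge from the averaged curve are a hint that interactions are present.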

How It Works

The project showcases MLI techniques through Jupyter notebooks, demonstrating their application with popular libraries like H2O and XGBoost. It emphasizes practical implementation and provides a Dockerfile for an isolated, reproducible environment, simplifying setup and dependency management for users.

Quick Start & Requirements

  • Installation: Via Docker (recommended) or manual installation.
  • Prerequisites: Docker, Anaconda Python 5.1.0, Java, H2O Python package, Git, GraphViz.
  • Setup: Dockerfile provided for a self-contained environment. Manual setup requires adding dependencies to the system path.
  • Resources: Links to notebooks, data acquisition instructions, and additional code examples are available.

Highlighted Details

  • Covers practical MLI techniques including Decision Tree Surrogates, LIME, LOCO, Partial Dependence, Individual Conditional Expectation (ICE), and Sensitivity Analysis.
  • Includes a dedicated section on testing explanation accuracy using simulated data.
  • Offers a wealth of supplementary materials: webinars, videos, booklets, conference presentations, and academic references.
  • Provides a Dockerfile for easy setup and reproducible execution of examples.
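As a rough illustration of two items above, decision tree surrogates and checking explanations against simulated data, the following plain-Python sketch fits a depth-1 regression tree (a stump) to the outputs of a model whose true driver is known by construction. Everything here is an assumption made for the example rather than the repository's code: `black_box` is a hypothetical model in which only feature 0 matters, and `fit_stump` is a deliberately simple exhaustive search, not a library routine.

```python
import random


def black_box(row):
    """Hypothetical complex model; by construction only feature 0 matters."""
    return 3.0 if row[0] > 0.5 else -1.0


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(rows, targets):
    """Depth-1 regression tree: pick the (feature, threshold) with the lowest SSE."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
            if not left or not right:
                continue  # skip splits that leave one side empty
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


# Simulated data: three uniform random features, seeded for reproducibility.
rng = random.Random(0)
rows = [[rng.random() for _ in range(3)] for _ in range(200)]

# A surrogate is trained on the model's predictions, not on the original labels.
predictions = [black_box(row) for row in rows]

score, feature, threshold = fit_stump(rows, predictions)
print(feature, threshold, score)
```

Because the simulated signal depends only on feature 0, a faithful surrogate should split on that feature with a threshold just below 0.5. If the surrogate chose a different feature, that would indicate the explanation does not reflect the model's actual behavior.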

Maintenance & Community

The repository is maintained by H2O.ai's Machine Learning Interpretability team. Contributions are welcomed via pull requests.

Licensing & Compatibility

Content may be used provided that H2O.ai or the original author(s) are cited. No explicit license for the code or content is stated beyond these usage permissions.

Limitations & Caveats

Some examples, like the Diabetes dataset use case, may have separate repositories with their own Dockerfiles. O'Reilly Media interactive notebooks require a Safari membership.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
