what-if-tool by PAIR-code

Interactive tool for ML model understanding and debugging

Created 7 years ago
981 stars

Top 37.8% on SourcePulse

1 Expert Loves This Project
Project Summary

The What-If Tool (WIT) addresses the challenge of understanding complex, "black-box" machine learning models by providing an interactive, no-code visual interface. It empowers ML researchers, developers, and even non-technical stakeholders to explore model behavior, performance, and fairness across datasets. WIT enables users to gain intuitive insights into model predictions and identify potential biases or unexpected outcomes without writing any code.

How It Works

WIT operates as a plugin for TensorBoard or an extension for Jupyter and Colab notebooks. Users load their trained ML models (TensorFlow Estimators, AI Platform models, or custom prediction functions) and datasets (TFRecord or CSV). The tool then facilitates interactive exploration through visualizations like Facets Dive and Overview, allowing users to slice, dice, and color data points by model predictions, performance metrics, or feature values. Users can directly edit individual data points, re-run inference, and observe the immediate impact on predictions and associated metrics, including feature attributions.
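As a minimal sketch of the custom-prediction-function path described above, the function below has the shape WIT expects for notebook use: a callable that takes a list of examples and returns one list of class scores per example. The toy scoring rule and the `age` feature are hypothetical, and the examples are assumed to be JSON-style dicts rather than tf.Example protos.

```python
# Hedged sketch: a custom prediction function for WIT's notebook mode.
# The scoring rule and the "age" feature below are illustrative only.
def custom_predict(examples):
    results = []
    for ex in examples:
        # Hypothetical rule: score class 1 by a single normalized feature.
        score = min(max(ex.get("age", 0) / 100.0, 0.0), 1.0)
        results.append([1.0 - score, score])  # [P(class 0), P(class 1)]
    return results

# In a notebook this would be wired up roughly as (names from the witwidget
# package, shown here as an assumption, not a verified snippet):
#   from witwidget.notebook.visualization import WitConfigBuilder, WitWidget
#   config = WitConfigBuilder(examples).set_custom_predict_fn(custom_predict)
#   WitWidget(config, height=800)

preds = custom_predict([{"age": 30}, {"age": 70}])
```

Editing a data point in the tool and re-running inference amounts to calling this function again on the modified example and re-rendering the metrics.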

Quick Start & Requirements

  • Installation: For Jupyter/Colab: pip install witwidget followed by jupyter nbextension install --py --symlink --sys-prefix witwidget and jupyter nbextension enable --py --sys-prefix witwidget. For JupyterLab: pip install witwidget and jupyter labextension install wit-widget.
  • Prerequisites: TensorFlow Serving is required for TensorBoard integration. Models can be TensorFlow Estimators, AI Platform models, or custom Python functions. Datasets can be TFRecord or CSV files.
  • Demos: A comprehensive set of web and Colab demos is available on the What-If Tool website, offering the quickest way to experience its capabilities. Links to specific demo setup commands using bazel run are provided for various datasets (e.g., UCI Census, CelebA, Iris).

Highlighted Details

  • Interactive Editing & Counterfactuals: Edit individual data points, re-run inference, and explore counterfactual examples by finding the most similar data points with different predictions.
  • Fairness & Performance Analysis: Investigate model fairness and performance across data subsets, adjust classification thresholds, and visualize ROC curves and confusion matrices.
  • Feature Attribution: Integrates with methods like SHAP and Integrated Gradients to visualize feature importance, enabling analysis and slicing based on attribution strength.
  • Model Comparison: Directly compare the predictions and performance of two different models side-by-side on the same dataset.
  • Rich Data Visualization: Leverages Facets Dive and Facets Overview for in-depth dataset exploration and visualization of inference results.
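The counterfactual feature in the first bullet can be illustrated with a small sketch: given a selected point, scan the dataset for the most similar point (here, by Euclidean distance) whose predicted label differs. This is an assumption-laden simplification; WIT supports its own distance measures, and the data below is made up.

```python
import math

def nearest_counterfactual(selected, selected_label, points, labels):
    """Return the point closest to `selected` whose predicted label differs.

    A toy stand-in for WIT's counterfactual search: Euclidean distance
    over raw feature tuples, ignoring feature scaling.
    """
    best, best_dist = None, float("inf")
    for point, label in zip(points, labels):
        if label == selected_label:
            continue  # only points with a *different* prediction qualify
        dist = math.dist(selected, point)
        if dist < best_dist:
            best, best_dist = point, dist
    return best

# Hypothetical 2-feature dataset with binary model predictions.
points = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.1), (0.9, 0.8)]
labels = [0, 1, 0, 1]
print(nearest_counterfactual((0.0, 0.0), 0, points, labels))  # → (0.9, 0.8)
```

Comparing the selected point against its counterfactual shows which feature changes would flip the model's decision, which is the intuition the tool surfaces visually.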

Maintenance & Community

The project provides links to development guides and release notes, indicating ongoing maintenance. Specific community channels (e.g., Slack, Discord) or major contributors are not detailed in the provided text.

Licensing & Compatibility

License information is not specified in the provided README content.

Limitations & Caveats

Using custom prediction functions with the --whatif-use-unsafe-custom-prediction flag is explicitly marked as "unsafe" due to the lack of sandboxing. When analyzing CSV files directly in TensorBoard without an associated model, data points become non-editable as there is no mechanism for re-inference. Compatibility with specific JupyterLab versions may require consulting external package manager details for jupyterlab-manager.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days

Explore Similar Projects

Starred by Shengjia Zhao (Chief Scientist at Meta Superintelligence Lab), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 14 more.

BIG-bench by google

Top 0.1% · 3k stars
Collaborative benchmark for probing and extrapolating LLM capabilities
Created 4 years ago · Updated 1 year ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

text-to-text-transfer-transformer by google-research

Top 0.1% · 6k stars
Unified text-to-text transformer for NLP research
Created 6 years ago · Updated 6 months ago