interpret by interpretml

ML interpretability Python package for glassbox models and blackbox explanations

created 6 years ago
6,627 stars

Top 7.8% on sourcepulse

Project Summary

InterpretML is an open-source Python package designed to provide a unified framework for machine learning interpretability. It enables users to train inherently interpretable "glassbox" models and explain complex "blackbox" models, addressing needs in model debugging, feature engineering, fairness assessment, and regulatory compliance. The primary audience includes data scientists and researchers working with high-risk applications where understanding model behavior is critical.

How It Works

The core of InterpretML is the Explainable Boosting Machine (EBM), a novel approach that combines modern machine learning techniques like bagging and gradient boosting with traditional Generalized Additive Models (GAMs). This hybrid methodology allows EBMs to achieve accuracy comparable to state-of-the-art blackbox models (e.g., Random Forests, XGBoost) while providing exact, human-editable explanations. InterpretML also supports other interpretable models and blackbox explanation techniques like SHAP and LIME.

Quick Start & Requirements

  • Installation: pip install interpret or conda install -c conda-forge interpret
  • Prerequisites: Python 3.7+
  • Supported Platforms: Linux, macOS, Windows
  • Documentation: https://interpret.ml/docs/python/

Highlighted Details

  • EBMs offer accuracy on par with gradient boosted trees and random forests, with the added benefit of interpretability.
  • Supports native handling of string data types within pandas DataFrames and NumPy arrays.
  • Includes options for differentially private EBMs for enhanced data privacy.
  • Can scale to datasets with 100 million samples, with distributed options available on Azure SynapseML.

Maintenance & Community

  • Developed by a team including Samuel Jenkins, Harsha Nori, Paul Koch, and Rich Caruana.
  • Built upon numerous open-source packages including plotly, dash, scikit-learn, lime, and shap.
  • Contact: interpret@microsoft.com or GitHub issues.

Licensing & Compatibility

  • The project appears to be primarily licensed under the MIT License, facilitating commercial use and integration with closed-source projects.

Limitations & Caveats

  • While EBMs handle pairwise interactions by default, exploring higher-order interactions requires specific configuration.
  • For very large-scale workloads, distributed EBMs on Azure SynapseML are recommended.
Health Check

  • Last commit: 6 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 2
  • Issues (30d): 5

Star History

  • 175 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Elie Bursztein (cybersecurity lead at Google DeepMind), and 1 more.

alibi by SeldonIO

  • Python library for ML model inspection and interpretation
  • 0.1%
  • 3k stars
  • created 6 years ago, updated 1 month ago