testing-ml by eugeneyan

ML testing examples

Created 5 years ago

265 stars

Top 96.5% on SourcePulse

1 Expert Loves This Project

Eugene Yan

AI Scientist at AWS

Project Summary

This repository provides minimal, practical examples for testing machine learning code, focusing on implementation correctness, learned behavior, and performance. It targets ML engineers and researchers who want to add testing practices to their workflows.

How It Works

The project applies standard software engineering testing practices to ML code. It uses pytest for unit tests, Coverage.py for code coverage, pylint for linting, and mypy for type checking. Tests fall into three categories: pre-train tests (verifying core calculations such as Gini impurity and gain), post-train tests (checking learned behavior such as invariance, directional expectations, and overfitting), and evaluation tests (measuring accuracy, AUC ROC, training time, and serving latency). Together, these layers cover implementation correctness, learned behavior, and performance.
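To make the pre-train category concrete, here is a minimal sketch of what a Gini impurity test could look like under pytest. The functions gini_impurity and gini_gain below are hypothetical stand-ins written for this illustration, not the repository's own implementations, and the test names are assumptions as well.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity of a set of class labels: 1 - sum(p_k ** 2).

    Hypothetical helper for illustration; not the repository's code.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_gain(parent: Sequence[int], left: Sequence[int], right: Sequence[int]) -> float:
    """Reduction in impurity from splitting parent into left and right."""
    n = len(parent)
    if n == 0:
        raise ValueError("parent must be non-empty")
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Pre-train tests: they check the calculation itself, with no model fitting.
def test_gini_impurity_pure_node():
    # A node with a single class has zero impurity.
    assert gini_impurity([1, 1, 1, 1]) == 0.0


def test_gini_impurity_balanced_binary():
    # A 50/50 binary node has impurity 1 - (0.25 + 0.25) = 0.5.
    assert abs(gini_impurity([0, 1, 0, 1]) - 0.5) < 1e-12


def test_gini_gain_perfect_split():
    # Splitting a balanced node into two pure children recovers all impurity.
    assert abs(gini_gain([0, 0, 1, 1], [0, 0], [1, 1]) - 0.5) < 1e-12
```

Tests of this kind use small hand-computed cases, so a failure points at the calculation rather than at the data or the training run.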

Quick Start & Requirements

Installation: Clone the repository (git clone https://github.com/eugeneyan/testing-ml.git), navigate into the directory (cd testing-ml), and run make setup to establish the environment.
Execution: The test suite can be run using make check.
Prerequisites: Requires a standard Python environment. Dependencies are managed via the Makefile.
Resources: Links to accompanying articles on ML testing and project setup are provided within the README.

Highlighted Details

Includes tests for core decision-tree calculations such as Gini impurity and gain.
Validates model output shape and range, and checks for data leakage.
Demonstrates tests for overfitting, invariance, and directional expectations.
Features performance benchmarks: 95th percentile training time < 1.0 sec, 99th percentile serving latency < 0.004 sec.
Sets target performance metrics: test accuracy > 0.82 and AUC ROC > 0.84.
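A percentile-based latency threshold like the ones listed above can be sketched as follows. This is an illustrative example only: percentile, measure_latencies, and toy_predict are hypothetical names, the nearest-rank percentile definition is an assumption (the repository may compute percentiles differently), and toy_predict stands in for a real model's predict call.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]. Hypothetical helper."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least q% of the data at or below it.
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(fn: Callable[[], object], n_runs: int = 200) -> List[float]:
    """Call fn n_runs times and return the wall-clock duration of each call in seconds."""
    latencies = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def toy_predict() -> float:
    # Hypothetical stand-in for a single-row model prediction.
    weights = (0.2, 0.5, 0.3)
    features = (1.0, 0.0, 1.0)
    return sum(w * x for w, x in zip(weights, features))


def test_serving_latency_p99():
    # Evaluation test: the 99th percentile of per-call latency must stay
    # under the 0.004 sec threshold quoted in the summary.
    latencies = measure_latencies(toy_predict)
    assert percentile(latencies, 99) < 0.004
```

Asserting on a high percentile rather than the mean keeps occasional slow calls visible, though timing tests of this kind depend on the machine running them and may need a looser threshold in shared CI environments.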

Maintenance & Community

The provided README does not contain specific details regarding maintainers, community channels (e.g., Discord, Slack), or project roadmaps.

Licensing & Compatibility

The README snippet does not specify a software license, so compatibility with commercial use or closed-source integration is unclear.

Limitations & Caveats

The examples are tailored to the specific DecisionTree and RandomForest implementations within this repository and are demonstrated using the dummy_titanic dataset. The repository serves as an illustrative guide rather than a general-purpose ML testing library.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days