testing-ml by eugeneyan

ML testing examples

Created 5 years ago
263 stars

Top 97.0% on SourcePulse

Project Summary

This repository provides minimal, practical examples for testing machine learning code, covering three concerns: implementation correctness, learned behavior, and performance. It is aimed at ML engineers and researchers who want to bring standard testing practices into their workflows.

How It Works

The project applies standard software engineering testing practices to ML code. It uses pytest for unit tests, Coverage.py for code coverage, pylint for linting, and mypy for type checking. Tests fall into three groups: pre-train tests verify core computations such as Gini impurity and gain; post-train tests check learned behavior such as invariance, directional expectations, and overfitting; and evaluation tests measure performance, including accuracy, AUC ROC, training time, and serving latency.
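As a rough illustration of the pre-train category, the sketch below checks a Gini impurity and Gini gain computation against values worked out by hand. The functions gini_impurity and gini_gain are hypothetical stand-ins written for this example; they are not the repository's implementation, and its function names and signatures may differ.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity: 1 - sum(p_k ** 2) over the class proportions p_k.

    Hypothetical helper for illustration only.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(parent: Sequence[int], left: Sequence[int], right: Sequence[int]) -> float:
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


def test_gini_impurity_known_values() -> None:
    # A pure node has zero impurity.
    assert gini_impurity([1, 1, 1, 1]) == 0.0
    # A balanced binary node: 1 - (0.5**2 + 0.5**2) = 0.5.
    assert abs(gini_impurity([0, 1, 0, 1]) - 0.5) < 1e-12
    # A 3:1 node: 1 - (0.75**2 + 0.25**2) = 0.375.
    assert abs(gini_impurity([0, 0, 0, 1]) - 0.375) < 1e-12


def test_gini_gain_split_quality() -> None:
    parent = [0, 0, 1, 1]
    # A perfect split removes all of the parent's impurity (0.5).
    assert abs(gini_gain(parent, [0, 0], [1, 1]) - 0.5) < 1e-12
    # A split that leaves both children as mixed as the parent gains nothing.
    assert abs(gini_gain(parent, [0, 1], [0, 1])) < 1e-12
```

Tests of this kind need no trained model, so they run quickly and can catch arithmetic mistakes before any training happens. Saved in a file whose name starts with test_, they would be collected by pytest automatically.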

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/eugeneyan/testing-ml.git), change into the directory (cd testing-ml), and run make setup to set up the environment.
  • Execution: The test suite can be run using make check.
  • Prerequisites: Requires a standard Python environment. Dependencies are managed via the Makefile.
  • Resources: Links to accompanying articles on ML testing and project setup are provided within the README.

Highlighted Details

  • Includes tests for core computations such as Gini impurity and Gini gain.
  • Validates model output shape and range, and checks for data leakage.
  • Demonstrates tests for overfitting, invariance, and directional expectations.
  • Includes timing benchmarks: 95th-percentile training time < 1.0 sec and 99th-percentile serving latency < 0.004 sec.
  • Sets target performance metrics: test accuracy > 0.82 and AUC ROC > 0.84.
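The post-train and latency checks listed above might look roughly like the sketch below. Everything model-related here is an assumption made for illustration: predict_survival is a hand-written rule standing in for a trained classifier, and the choice of name as an invariant feature and passenger class as a directional feature is hypothetical rather than taken from the repository. Only the 0.004 sec threshold for 99th-percentile latency comes from the summary above.

```python
import math
import time
from typing import Any, Callable, Dict, List

Passenger = Dict[str, Any]


def predict_survival(p: Passenger) -> float:
    """Hypothetical stand-in for a trained model; NOT the repository's classifier."""
    score = 0.5
    score += 0.3 if p["sex"] == "female" else -0.2
    score -= 0.1 * (int(p["pclass"]) - 1)
    return min(1.0, max(0.0, score))


def test_invariance_to_name() -> None:
    # Changing a feature that should not matter must leave the prediction unchanged.
    base = {"name": "Mr. A", "sex": "male", "pclass": 3}
    renamed = {**base, "name": "Mr. B"}
    assert predict_survival(base) == predict_survival(renamed)


def test_directional_expectation_for_class() -> None:
    # Assumed direction: moving from third to first class should raise the score.
    third = {"name": "Mr. A", "sex": "male", "pclass": 3}
    first = {**third, "pclass": 1}
    assert predict_survival(first) > predict_survival(third)


def percentile(samples: List[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(fn: Callable[[], object], runs: int = 200) -> List[float]:
    """Time repeated calls to fn and return the per-call durations in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def test_serving_latency_p99() -> None:
    passenger = {"name": "Mr. A", "sex": "male", "pclass": 3}
    latencies = measure_latencies(lambda: predict_survival(passenger))
    # Threshold from the summary: 99th-percentile serving latency < 0.004 sec.
    assert percentile(latencies, 99) < 0.004
```

Checking a high percentile rather than the mean keeps occasional slow calls visible. Timing thresholds depend on hardware, so a value tuned on one machine may need adjusting on another.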

Maintenance & Community

The README does not name maintainers, community channels (e.g., Discord, Slack), or a project roadmap.

Licensing & Compatibility

The README snippet does not specify a software license, so it is unclear whether commercial use or closed-source integration is permitted.

Limitations & Caveats

The examples are tailored to the specific DecisionTree and RandomForest implementations within this repository and are demonstrated using the dummy_titanic dataset. The repository serves as an illustrative guide rather than a general-purpose ML testing library.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
