TSB-AD by thedatumorg

Benchmark and evaluate time-series anomaly detection

Created 2 years ago
270 stars

Top 95.1% on SourcePulse

Summary

TSB-AD is a benchmark for time-series anomaly detection (TSAD) aimed at researchers and practitioners. It combines a curated, large-scale dataset collection, the VUS-PR evaluation measure, and a suite of 40 algorithms run under unified setups, so that methods can be compared fairly and reproducibly.

How It Works

The project addresses three aspects of TSAD evaluation: (i) Dataset Integrity: 1070 high-quality time series drawn from 40 datasets, curated using human perception and model interpretation; (ii) Measure Reliability: VUS-PR is identified as a more reliable evaluation measure that mitigates evaluation biases; and (iii) Comprehensive Benchmarking: 40 algorithms (statistical, neural network, and foundation model) are evaluated with unified setups and hyperparameter tuning. Together these support objective algorithm assessment.
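To make the "Measure Reliability" point more concrete, here is a minimal sketch of the general idea behind buffer-tolerant, threshold-free measures such as VUS-PR: compute the area under the precision-recall curve for several tolerance windows around the labeled anomalies, then average the results. This is a deliberately simplified stand-in, not the TSB-AD implementation, and it will not reproduce the benchmark's numbers. The function names (`dilate`, `average_precision`, `buffered_pr_auc`), the hard label dilation, and the `max_buffer` parameter are all assumptions made for illustration.

```python
def dilate(labels, half_width):
    """Mark every point within `half_width` steps of an anomalous point as anomalous."""
    n = len(labels)
    out = [0] * n
    for i, y in enumerate(labels):
        if y:
            lo = max(0, i - half_width)
            hi = min(n, i + half_width + 1)
            for j in range(lo, hi):
                out[j] = 1
    return out


def average_precision(scores, labels):
    """Area under the precision-recall curve, computed as average precision.

    Points are ranked by descending score; tied scores keep their input order.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("labels contain no anomalies")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            ap += hits / rank  # precision at each recalled anomaly
    return ap / total_pos


def buffered_pr_auc(scores, labels, max_buffer=4):
    """Average the PR area over tolerance windows 0..max_buffer (illustrative only)."""
    areas = [
        average_precision(scores, dilate(labels, b))
        for b in range(max_buffer + 1)
    ]
    return sum(areas) / len(areas)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 0, 0]
    early_scores = [0.1, 0.9, 0.8, 0.2, 0.05, 0.0]  # detector fires one step early
    print(average_precision(early_scores, labels))
    print(buffered_pr_auc(early_scores, labels, max_buffer=1))
```

In this toy case the detector fires one step early, so the point-wise area penalizes it while the buffered average is more forgiving. That tolerance to small misalignments is the kind of bias the benchmark's choice of measure is meant to address.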

Quick Start & Requirements

  • Primary Install: pip install TSB-AD
  • Prerequisites: Python 3.8-3.12. A specific PyTorch build may be needed (e.g., conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=12.1 -c pytorch -c nvidia). Dependencies for the foundation models vary by model.
  • Datasets: Download TSB-AD-U (https://www.thedatum.org/datasets/TSB-AD-U.zip) and TSB-AD-M (https://www.thedatum.org/datasets/TSB-AD-M.zip).
  • Docs/Demo: Tutorial notebooks are available. Project homepage: https://www.thedatum.org/.
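As a rough illustration of the dataset step, the sketch below downloads one of the archives listed above and reads a single series into plain Python lists. It uses only the standard library and does not call the TSB-AD package. The helper names (`fetch_dataset`, `load_series`), the `Datasets` target directory, and the CSV layout (a header row, numeric channels, and a 0/1 column named `Label`) are assumptions; check the tutorial notebooks for the actual format and loading code.

```python
import csv
import urllib.request
import zipfile
from pathlib import Path

# Archive URLs as listed in the Quick Start section.
DATASET_URLS = {
    "TSB-AD-U": "https://www.thedatum.org/datasets/TSB-AD-U.zip",
    "TSB-AD-M": "https://www.thedatum.org/datasets/TSB-AD-M.zip",
}


def fetch_dataset(name, dest="Datasets"):
    """Download and unpack one archive (hypothetical helper; needs network access)."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.zip"
    urllib.request.urlretrieve(DATASET_URLS[name], archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
    return dest_dir


def load_series(csv_path, label_column="Label"):
    """Read one CSV into (values, labels).

    ASSUMPTION: the file has a header row, every column except `label_column`
    is a numeric channel, and labels are 0/1.
    """
    values, labels = [], []
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            labels.append(int(float(row[label_column])))
            values.append([float(v) for k, v in row.items() if k != label_column])
    return values, labels
```

A typical call would be `fetch_dataset("TSB-AD-U")` followed by `load_series(...)` on one of the extracted CSV files. The resulting lists can then be passed to whichever detector and evaluation routine the tutorial notebooks demonstrate.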

Highlighted Details

  • Comprises 1070 high-quality time series across 40 diverse datasets, significantly expanding existing benchmarks.
  • Benchmarks 40 anomaly detection algorithms: statistical, neural network, and foundation model architectures.
  • Promotes VUS-PR as a reliable and accurate evaluation measure for time-series anomaly detection.
  • Features an active leaderboard supporting community submissions and model comparisons.

Maintenance & Community

The maintainers (Qinghua Liu, John Paparrizos) can be reached by email or through GitHub Issues. Community contributions are encouraged, especially new algorithms submitted as pull requests. Recent updates indicate ongoing development.

Licensing & Compatibility

The dataset preprocessing and curation steps are licensed under Apache 2.0. The license of the core TSB-AD package is not explicitly stated, so it should be clarified before commercial use or closed-source integration.

Limitations & Caveats

The README does not list known limitations or bugs, and does not state a maturity level (e.g., alpha). Installing the foundation model dependencies may be complex. Licenses for the individual datasets are referenced separately.

Health Check

  • Last Commit: 4 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 16 stars in the last 30 days
