Python package for time series feature extraction
tsfresh is a Python package designed for automatic feature extraction from time series data, targeting data scientists and researchers. It aims to reduce the time spent on manual feature engineering by extracting hundreds of descriptive features and then filtering out irrelevant ones using a statistically sound hypothesis testing framework, enabling more efficient model building and analysis.
How It Works
The package systematically extracts a wide array of features from time series, spanning statistical, signal processing, and nonlinear dynamics measures. Its core innovation is a built-in feature selection mechanism based on hypothesis testing: each extracted feature is tested for association with the regression or classification target, and only features that pass are retained. A multiple-testing correction controls the false discovery rate, that is, the expected share of irrelevant features among those selected.
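The filtering step can be illustrated with a small standard-library sketch. It assumes each feature already has a p-value from a per-feature significance test and applies the Benjamini-Yekutieli step-up procedure, which the tsfresh documentation names as its multiple-testing correction. The function name, the `fdr_level` argument as used here, and the toy p-values are illustrative assumptions, not the library's own code.

```python
def benjamini_yekutieli(p_values, fdr_level=0.05):
    """Return one boolean per p-value: True means the feature is kept.

    Illustrative sketch of the Benjamini-Yekutieli step-up procedure,
    not tsfresh's implementation.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction factor, valid under arbitrary dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is below its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * fdr_level:
            k_max = rank
    # Keep every feature ranked at or below k_max.
    keep = [False] * m
    for idx in order[:k_max]:
        keep[idx] = True
    return keep


# Hypothetical p-values for four extracted features.
p_by_feature = {"mean": 0.001, "variance": 0.2, "maximum": 0.008, "median": 0.6}
flags = benjamini_yekutieli(list(p_by_feature.values()), fdr_level=0.05)
selected = [name for name, kept in zip(p_by_feature, flags) if kept]
print(selected)
```

Lowering `fdr_level` makes the thresholds stricter, so fewer features survive; the thresholds also tighten as the number of tested features grows.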
Quick Start & Requirements
pip install tsfresh
conda create --name tsfresh__py_3.8 python=3.8 && conda activate tsfresh__py_3.8 && pip install tsfresh[matrixprofile]
docker pull nbraun/tsfresh
Highlighted Details
Maintenance & Community
The project has received funding from the German Federal Ministry of Education and Research. Contribution guidelines are available for those interested in expanding the library.
Licensing & Compatibility
The README does not explicitly state the license. The repository ships an MIT license file; MIT is a permissive license and is compatible with commercial use.
Limitations & Caveats
Reproducing features computed with older matrixprofile calculators requires a specific Python 3.8 environment. The README implies a focus on supervised learning tasks for the filtering mechanism, though an unsupervised anomaly detection paper is cited.