Tree-boosting combined with Gaussian process and mixed effects models
GPBoost combines tree-boosting with Gaussian process and mixed effects models to improve prediction accuracy and model flexibility. It targets data scientists and researchers working with tabular data who need to incorporate non-linearities, complex interactions, and spatial or grouped dependencies, offering a unified framework that bridges the gap between traditional boosting and latent Gaussian models.
How It Works
GPBoost models the response variable as the sum of a non-linear function (an ensemble of trees) and latent Gaussian effects (Gaussian processes or grouped random effects). This approach leverages the high predictive power of tree-boosting for the fixed effects while incorporating the dependency structure and uncertainty quantification of Gaussian processes and mixed effects models. The GPBoost algorithm iteratively learns the covariance parameters of the latent Gaussian model and updates the tree ensemble with gradient and Newton boosting steps.
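The following is a minimal sketch of this workflow using the gpboost Python package; the simulated data, grouping variable, and hyperparameter values below are illustrative only, not taken from the project's own examples.

import numpy as np
import gpboost as gpb

# Simulate tabular data with a grouped dependency structure
rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(size=(n, 5))                      # features for the tree ensemble
group = rng.integers(0, 50, size=n)               # grouping variable (50 groups)
b = rng.normal(scale=0.5, size=50)                # latent group-level effects
y = 2.0 * X[:, 0] + b[group] + rng.normal(scale=0.1, size=n)

# Latent Gaussian part: grouped random effects
# (for spatial data, pass gp_coords and cov_function instead of group_data)
gp_model = gpb.GPModel(group_data=group, likelihood="gaussian")

# Tree-boosting part: the fixed-effects function of X
data_train = gpb.Dataset(data=X, label=y)
params = {"learning_rate": 0.05, "max_depth": 3, "verbose": 0}

# Covariance parameters and the tree ensemble are learned jointly
bst = gpb.train(params=params, train_set=data_train,
                gp_model=gp_model, num_boost_round=100)

# Predictions combine the tree ensemble and the predicted random effects
pred = bst.predict(data=X, group_data_pred=group, predict_var=True)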
Quick Start & Requirements
Python: pip install gpboost
R (from source): R CMD INSTALL gpboost_0.1.0.tar.gz
Highlighted Details
Maintenance & Community
The project is primarily developed by Fabio Sigrist. Companion articles were published in JMLR and TPAMI in October 2022. Open issues on GitHub include requests for ONNX conversion, multivariate models, areal models, multiclass classification, sample weights, and GPU support for GPs.
Licensing & Compatibility
Licensed under the Apache License 2.0. This permissive license allows for commercial use and integration into closed-source projects.
Limitations & Caveats
The project is under active development; several features remain open issues or planned methodological improvements, including GPU support for Gaussian processes, multivariate models, and areal spatial models such as CAR/SAR.