mljar
Automated ML pipelines for tabular data
Top 14.6% on SourcePulse
mljar-supervised is an Automated Machine Learning (AutoML) package for tabular data. It aims to reduce the time data scientists spend on repetitive tasks such as data preprocessing, model selection, hyperparameter tuning, and report generation. The framework abstracts these steps so that users can build, understand, and deploy ML models more efficiently.
How It Works
This Python package offers four distinct modes (Explain, Perform, Compete, Optuna), each tailored to a different user need. It integrates a wide range of algorithms, including tree-based models (Random Forest, LightGBM, XGBoost), linear models, and neural networks. Core functionality includes feature engineering (e.g., Golden Features, text and time transformations), hyperparameter optimization via random search with hill climbing or via the Optuna framework, and ensembling through greedy selection and stacking. A key differentiator is its focus on model explainability: decision tree visualizations, SHAP values, and permutation importance are all compiled automatically into Markdown reports.
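To make the greedy ensembling idea concrete, here is a minimal sketch of greedy forward selection over validation predictions. It is not taken from mljar-supervised; the function names (`mse`, `greedy_ensemble`), the model names, the toy data, and the choice of mean squared error are all assumptions made for illustration.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def greedy_ensemble(predictions, y_true, max_rounds=10):
    """Greedy forward selection with replacement (hypothetical helper).

    predictions: dict mapping a model name to its validation predictions.
    Returns the list of selected model names (repeats act as weights)
    and the validation loss of the averaged ensemble.
    """
    selected = []
    running_sum = [0.0] * len(y_true)
    best_loss = float("inf")
    for _ in range(max_rounds):
        round_name, round_loss = None, best_loss
        k = len(selected) + 1  # ensemble size if one more model is added
        for name, preds in predictions.items():
            candidate = [(s + p) / k for s, p in zip(running_sum, preds)]
            loss = mse(y_true, candidate)
            if loss < round_loss:  # accept only strict improvements
                round_name, round_loss = name, loss
        if round_name is None:  # no model improves the ensemble: stop
            break
        selected.append(round_name)
        running_sum = [s + p for s, p in zip(running_sum, predictions[round_name])]
        best_loss = round_loss
    return selected, best_loss


# Toy validation data: two models with opposite bias and one weak baseline.
y_val = [1.0, 2.0, 3.0, 4.0]
val_preds = {
    "model_a": [1.5, 2.5, 3.5, 4.5],
    "model_b": [0.5, 1.5, 2.5, 3.5],
    "model_c": [3.0, 3.0, 3.0, 3.0],
}
selected, loss = greedy_ensemble(val_preds, y_val)
print(selected, loss)
```

In this toy case the two oppositely biased models cancel each other out, so the procedure picks both and leaves the weak baseline out. A model selected more than once would simply carry more weight in the average.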
Quick Start & Requirements
pip install mljar-supervised
Highlighted Details
Maintenance & Community
The project is developed by MLJAR. Specific details regarding active contributors, community channels (like Discord/Slack), or sponsorships are not explicitly detailed in the provided README.
Licensing & Compatibility
Limitations & Caveats
The README does not explicitly list known limitations or alpha status. For the Optuna mode, it's noted that only the best model is saved after tuning, not intermediate models explored during the search.