AutoML tool for generating ML/DL models and Python code from CSV data
This project provides an AutoML solution for tabular data, enabling users to generate high-performing machine learning or deep learning models with native Python code. It targets citizen data scientists and engineers, offering a zero-code interface to create optimized data transformation and prediction pipelines, abstracting complex preprocessing and modeling techniques.
How It Works
automl-gs generates raw Python code from Jinja templates and trains each candidate model in a subprocess, iterating through different hyperparameters. It infers the data type of each column, applies framework-specific ETL strategies (e.g., datetime encoding, text embeddings/vectorization), and constructs models with the specified framework, such as TensorFlow/Keras or XGBoost. The best-performing model and its associated pipeline code are saved, so they can be integrated and used for prediction without any further dependency on the tool.
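The generate-then-train loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not automl-gs's actual implementation: the standard library's string.Template stands in for Jinja so the sketch needs no third-party packages, the generated script computes a toy score instead of training a model, and the names SCRIPT_TEMPLATE, run_trial, and search are hypothetical.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path
from string import Template

# Hypothetical template for a generated training script. A real generated
# script would train a model; this stand-in computes a toy score so the
# sketch runs without any ML framework installed.
SCRIPT_TEMPLATE = Template('''\
import json

learning_rate = $learning_rate
max_depth = $max_depth

# Toy objective standing in for a validation metric.
score = 1.0 - abs(learning_rate - 0.1) - 0.01 * abs(max_depth - 4)
print(json.dumps({"score": score}))
''')


def run_trial(params, workdir):
    """Render the script for one hyperparameter set and run it in a subprocess."""
    script_path = Path(workdir) / "model.py"
    script_path.write_text(SCRIPT_TEMPLATE.substitute(params))
    result = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True, text=True, check=True,
    )
    # The generated script reports its metric as one JSON object on stdout.
    return json.loads(result.stdout)["score"]


def search(param_grid):
    """Try every hyperparameter set and return the best (params, score) pair."""
    best_params, best_score = None, float("-inf")
    with tempfile.TemporaryDirectory() as workdir:
        for params in param_grid:
            score = run_trial(params, workdir)
            if score > best_score:
                best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    grid = [
        {"learning_rate": lr, "max_depth": depth}
        for lr in (0.01, 0.1, 0.3)
        for depth in (2, 4, 8)
    ]
    print(search(grid))
```

Running each trial in its own process keeps the generated script independent of the search loop, which fits the stated goal that the saved code can run without the tool.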
Quick Start & Requirements
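Install: pip3 install automl_gs
Prerequisites: the ML/DL framework you plan to use, installed separately (e.g., tensorflow, xgboost).
CLI usage: automl_gs <csv_path> <target_field>
CLI example: automl_gs titanic.csv Survived --framework xgboost --num_trials 1000
Python usage: from automl_gs import automl_grid_search; automl_grid_search('titanic.csv', 'Survived')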
Highlighted Details
Generated output includes model.py, pipeline.py, requirements.txt, serialized encoders (JSON), and detailed metrics.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
5 years ago
Inactive