sqlflow by sql-machine-learning

Compiler for AI workflows using SQL

Created 7 years ago
5,180 stars

Top 9.6% on SourcePulse

Project Summary

SQLFlow is a compiler that translates an extended SQL syntax into distributed machine learning workflows that run on Kubernetes. Its goal is to let anyone who knows SQL build AI applications: data management and machine learning are handled in one language, reducing the need for specialized programming languages and complex toolchains.

How It Works

SQLFlow extends standard SQL with clauses for AI jobs such as training, prediction, and evaluation. It compiles each extended SQL statement into an Argo workflow, which is then executed on a Kubernetes cluster. Models built with ML toolkits such as TensorFlow and XGBoost can therefore be defined and managed in familiar SQL syntax, while the compiler hides the details of distributed orchestration and ML framework integration.
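The compile step described above can be pictured with a small Python sketch. It is a toy illustration only, not SQLFlow's actual compiler: the statement follows the general `TO TRAIN ... WITH ... LABEL ... INTO` shape of SQLFlow's extended syntax, but the table and model names are made up, and the helper functions `split_extended_sql` and `to_workflow_steps` are hypothetical. The output is a plain list of steps that only loosely resembles an Argo workflow.

```python
import re

# Hypothetical statement; table, column, and model names are illustrative only.
STATEMENT = """
SELECT * FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
LABEL class
INTO sqlflow_models.my_dnn_model;
"""


def split_extended_sql(statement: str) -> dict:
    """Split a statement into its standard SELECT part and its AI clause."""
    # Collapse whitespace and drop the trailing semicolon.
    text = " ".join(statement.split()).rstrip(";")
    match = re.search(r"\bTO\s+(TRAIN|PREDICT|EVALUATE)\b", text, flags=re.IGNORECASE)
    if match is None:
        # Plain SQL: there is no AI job to schedule.
        return {"select": text, "job": None, "clause": ""}
    return {
        "select": text[:match.start()].strip(),
        "job": match.group(1).upper(),
        "clause": text[match.end():].strip(),
    }


def to_workflow_steps(parts: dict) -> list:
    """Build a toy, Argo-like list of named steps (not real Argo YAML)."""
    steps = [{"name": "fetch-data", "query": parts["select"]}]
    if parts["job"] is not None:
        steps.append({"name": parts["job"].lower(), "args": parts["clause"]})
    return steps


parts = split_extended_sql(STATEMENT)
steps = to_workflow_steps(parts)
for step in steps:
    print(step)
```

The point of the sketch is the separation of concerns: the standard `SELECT` part stays ordinary SQL for the database, while the AI clause becomes a separate step that a workflow engine could schedule.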

Quick Start & Requirements

Highlighted Details

  • Supports multiple data sources (MySQL, TiDB, Hive, MaxCompute) and ML toolkits (TensorFlow, Keras, XGBoost).
  • Enables advanced ML configurations directly within SQL, such as specifying model architectures and hyperparameters.
  • Compiles SQL to Argo workflows for distributed execution on Kubernetes.
  • Offers interactive examples for DNN classification and XGBoost regression.
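To make the "ML configuration within SQL" point concrete, the following Python sketch reads a `WITH` attribute list into a dictionary. It is an assumption-laden illustration, not SQLFlow's parser: the attribute names (`model.n_classes`, `model.hidden_units`, `train.epoch`, `model.optimizer`) are used here as examples only, and `parse_with_attributes` is a hypothetical helper.

```python
import ast
import re

# Matches `dotted.key = value`, where value is a bracketed list or a single token.
ATTRIBUTE = re.compile(r"(\w+(?:\.\w+)*)\s*=\s*(\[[^\]]*\]|[^,\s]+)")


def parse_with_attributes(clause: str) -> dict:
    """Turn 'a.b = 1, c.d = [2, 3]' into {'a.b': 1, 'c.d': [2, 3]}."""
    attributes = {}
    for key, raw in ATTRIBUTE.findall(clause):
        try:
            # Numbers, lists, and quoted strings become Python values.
            attributes[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Bare identifiers (for example an optimizer name) stay as strings.
            attributes[key] = raw
    return attributes


# Illustrative attribute list; the names are assumptions, not a verified schema.
config = parse_with_attributes(
    "model.n_classes = 3, model.hidden_units = [10, 20], "
    "train.epoch = 10, model.optimizer = Adam"
)
print(config)
```

Attributes prefixed with `model.` would describe the architecture, and those prefixed with `train.` the training run, which is how a single SQL statement can carry both kinds of settings.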

Maintenance & Community

  • Community contributions are encouraged, although the Health Check section lists the last commit as 1 year ago.
  • Roadmap available for future plans and feature requests.
  • Feedback channel via GitHub Issues.

Licensing & Compatibility

  • Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

SQLFlow compiles SQL to Kubernetes workflows, so running it requires Kubernetes infrastructure. Although it supports multiple ML frameworks, the depth of integration and the specific versions supported for each may require further investigation.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 9 stars in the last 30 days
