sqlflow by sql-machine-learning

Compiler for AI workflows using SQL

created 6 years ago
5,169 stars

Top 9.8% on sourcepulse

Project Summary

SQLFlow is a compiler that translates extended SQL syntax into distributed machine learning workflows for Kubernetes. It aims to empower users with SQL skills to develop advanced AI applications by unifying data management and machine learning, reducing the need for specialized programming languages and complex toolchains.

How It Works

SQLFlow extends standard SQL with clauses for AI jobs such as training, prediction, and evaluation. It compiles these extended SQL statements into Argo workflows, which are then executed on a Kubernetes cluster. Models built with toolkits such as TensorFlow and XGBoost can therefore be defined and managed in familiar SQL syntax, while the compiler hides the details of distributed orchestration and ML framework integration.
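To make the idea of "standard SQL plus an AI clause" concrete, here is a minimal Python sketch that splits one extended statement into its ordinary query part and its AI extension. It is a toy illustration, not SQLFlow's parser: the function `split_statement`, the `TO TRAIN` / `TO PREDICT` / `TO EVALUATE` keyword pattern, and the table and model names in the sample statement are all assumptions made for this example and are not confirmed by this summary.

```python
import re

# Hypothetical extended statement; keywords and names are illustrative assumptions.
STATEMENT = """
SELECT * FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
LABEL class
INTO sqlflow_models.my_dnn_model;
"""

# Toy rule: the first "TO TRAIN|PREDICT|EVALUATE" keyword marks where the
# standard SQL ends and the AI extension begins.
_EXTENSION = re.compile(r"\bTO\s+(TRAIN|PREDICT|EVALUATE)\b", re.IGNORECASE)


def split_statement(statement: str) -> dict:
    """Split a statement into its standard SELECT part and its AI clause."""
    text = statement.strip().rstrip(";").strip()
    match = _EXTENSION.search(text)
    if match is None:
        # Plain SQL: there is no AI job attached.
        return {"select": text, "job": None, "extension": ""}
    return {
        # Collapse whitespace so each part prints on one line.
        "select": " ".join(text[:match.start()].split()),
        "job": match.group(1).lower(),
        "extension": " ".join(text[match.end():].split()),
    }


parts = split_statement(STATEMENT)
print(parts["select"])
print(parts["job"])
print(parts["extension"])
```

In this sketch the query part could be handed unchanged to the database, while the extension part carries everything the ML job needs (model type, hyperparameters, label column, output location).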

Quick Start & Requirements

Highlighted Details

  • Supports multiple data sources (MySQL, TiDB, Hive, MaxCompute) and ML toolkits (TensorFlow, Keras, XGBoost).
  • Enables advanced ML configurations directly within SQL, such as specifying model architectures and hyperparameters.
  • Compiles SQL to Argo workflows for distributed execution on Kubernetes.
  • Offers interactive examples for DNN classification and XGBoost regression.
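As a rough picture of what "compiling SQL to an Argo workflow" could produce, the following Python sketch builds a workflow manifest with one sequential step per SQL statement. The manifest layout (`apiVersion`, `kind`, `entrypoint`, `templates`, `steps`) follows the general Argo Workflows format, but the function `build_workflow`, the container image, the `run-statement` command, and the template names are hypothetical. The manifests SQLFlow actually generates may look quite different.

```python
import json


def build_workflow(statements, image="example/sqlflow-step:latest"):
    """Return an Argo Workflow manifest (as a dict) that runs one step per statement."""
    steps = []
    for index, statement in enumerate(statements):
        # In Argo's `steps` syntax, each inner list is one sequential stage.
        steps.append([{
            "name": f"step-{index}",
            "template": "run-statement",
            "arguments": {"parameters": [{"name": "sql", "value": statement}]},
        }])
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "sqlflow-sketch-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {"name": "main", "steps": steps},
                {
                    "name": "run-statement",
                    "inputs": {"parameters": [{"name": "sql"}]},
                    "container": {
                        "image": image,                # hypothetical image name
                        "command": ["run-statement"],  # hypothetical entrypoint
                        "args": ["{{inputs.parameters.sql}}"],
                    },
                },
            ],
        },
    }


workflow = build_workflow([
    "SELECT * FROM iris.train TO TRAIN DNNClassifier LABEL class INTO my_model;",
    "SELECT * FROM iris.test TO PREDICT iris.predict.class USING my_model;",
])
print(json.dumps(workflow, indent=2))
```

Printing the dictionary as JSON keeps the sketch free of third-party dependencies; Kubernetes tooling generally accepts JSON manifests as well as YAML.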

Maintenance & Community

  • Community contributions are encouraged; see the Health Check section for recent activity.
  • Roadmap available for future plans and feature requests.
  • Feedback channel via GitHub Issues.

Licensing & Compatibility

  • Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

SQLFlow compiles to workflows that run on Kubernetes, so a Kubernetes cluster is required to use it. It supports several ML frameworks, but this summary does not state how deep each integration goes or which framework versions are compatible; check both before adopting it.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 31 stars in the last 90 days

Explore Similar Projects

  • mindsdb by mindsdb: AI query engine for federated data sources. 0.5%, 35k; created 7 years ago, updated 1 day ago. Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 7 more.
  • Made-With-ML by GokuMohandas: ML course for production-grade applications. 0.4%, 41k; created 6 years ago, updated 11 months ago. Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ben Firshman (Cofounder of Replicate), and 6 more.