aws-step-functions-data-science-sdk-python  by aws

SDK for building ML workflows/pipelines on AWS using Step Functions & SageMaker

created 5 years ago
292 stars


Project Summary

This SDK enables data scientists to build and orchestrate machine learning workflows on AWS using Python, integrating with Amazon SageMaker and AWS Step Functions. It simplifies the creation of complex ML pipelines by abstracting away the underlying AWS service configurations, allowing users to focus on the ML logic.

How It Works

The SDK provides a Pythonic interface for defining an ML workflow as a sequence of steps. Each step represents a task such as data processing, model training, or deployment, and is backed by an AWS service. Workflows are constructed locally in Python, translated into the Amazon States Language, and then deployed to and executed on AWS Step Functions. The resulting pipelines are serverless, scalable, and observable.
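
To make the translation step concrete, here is a minimal sketch of how an ordered list of steps can be turned into an Amazon States Language definition. It uses only the Python standard library and is not the SDK's own API: the helper build_definition and the step names Preprocess, Train, and Deploy are hypothetical, and Pass states stand in for the real SageMaker tasks a workflow would run.

```python
import json


def build_definition(steps):
    """Chain (name, state) pairs into an Amazon States Language definition.

    Hypothetical helper for illustration; the SDK exposes its own classes.
    """
    if not steps:
        raise ValueError("a workflow needs at least one step")
    states = {}
    for index, (name, state) in enumerate(steps):
        state = dict(state)  # copy so the caller's dict is not mutated
        if index + 1 < len(steps):
            # Every state except the last points at its successor.
            state["Next"] = steps[index + 1][0]
        else:
            # The final state terminates the execution.
            state["End"] = True
        states[name] = state
    return {"StartAt": steps[0][0], "States": states}


# Placeholder Pass states; a real pipeline would use Task states instead.
steps = [
    ("Preprocess", {"Type": "Pass", "Result": {"stage": "preprocess"}}),
    ("Train", {"Type": "Pass", "Result": {"stage": "train"}}),
    ("Deploy", {"Type": "Pass", "Result": {"stage": "deploy"}}),
]

definition = build_definition(steps)
print(json.dumps(definition, indent=2))
```

The printed JSON has the shape Step Functions expects: a StartAt entry naming the first state, and a States map in which each state either names its Next state or sets End to true.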

Quick Start & Requirements

  • Install via pip: pip install stepfunctions
  • Supported OS: Unix/Linux, Mac
  • Supported Python: 3.6+
  • Requires AWS account and appropriate IAM permissions for Step Functions and SageMaker.
  • Example notebooks are available in SageMaker or can be run locally after downloading from GitHub.
  • Official documentation: https://aws-step-functions-data-science-sdk.readthedocs.io/en/latest/

Highlighted Details

  • Python API for defining workflow steps and orchestrating AWS services.
  • Supports common ML tasks like training pipelines and model deployment.
  • Generates graphical representations and Amazon States Language definitions of workflows.
  • Can create and execute workflows directly from Python code and Jupyter notebooks.
  • Option to export workflows as AWS CloudFormation templates.
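
As a rough illustration of the CloudFormation export option, the sketch below wraps a state machine definition in a template containing one AWS::StepFunctions::StateMachine resource. It is a standard-library approximation, not the SDK's export code: the function to_cloudformation, the logical ID MLWorkflow, and the role ARN are hypothetical placeholders, and the template the SDK actually emits may differ in format and detail.

```python
import json


def to_cloudformation(definition, role_arn, logical_id="MLWorkflow"):
    """Wrap an Amazon States Language definition in a CloudFormation template.

    Hypothetical helper for illustration only.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            logical_id: {
                "Type": "AWS::StepFunctions::StateMachine",
                "Properties": {
                    # CloudFormation takes the definition as a JSON string.
                    "DefinitionString": json.dumps(definition),
                    "RoleArn": role_arn,
                },
            }
        },
    }


# A one-state workflow, kept small so the template is easy to read.
definition = {
    "StartAt": "Train",
    "States": {"Train": {"Type": "Pass", "End": True}},
}

# Placeholder ARN; substitute an execution role from your own account.
template = to_cloudformation(
    definition,
    role_arn="arn:aws:iam::123456789012:role/ExampleStepFunctionsRole",
)
print(json.dumps(template, indent=2))
```

Keeping the definition as an embedded string means the same workflow can be reviewed, versioned, and deployed through an ordinary CloudFormation stack rather than from a notebook session.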

Maintenance & Community

  • Developed by AWS.
  • Community contributions are welcomed via pull requests. See CONTRIBUTING.md for details.
  • Documentation available on Read the Docs.

Licensing & Compatibility

  • Licensed under the Apache 2.0 License.
  • Compatible with commercial use and closed-source applications.

Limitations & Caveats

  • Workflow visualization is currently limited to Jupyter notebooks and not available in JupyterLab.
  • Verifying the signature of a release is documented in detail, but it requires setting up GPG manually.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1+ week
  • Pull requests (30 days): 1
  • Issues (30 days): 0
  • Star history: 2 stars in the last 90 days
