sagemaker-inference-toolkit by aws

SDK for serving ML models in Docker containers via SageMaker

Created 6 years ago

410 stars

Top 71.2% on SourcePulse

1 Expert Loves This Project

Philipp Schmid

DevRel at Google DeepMind

Project Summary

This toolkit provides a standardized way to serve machine learning models within Docker containers for deployment on Amazon SageMaker. It targets ML engineers and data scientists who need to package custom inference logic for SageMaker endpoints, simplifying the deployment process and ensuring consistent runtime environments.

How It Works

The toolkit uses Multi Model Server (MMS) as its serving stack. Users define a custom inference handler with four functions: model_fn loads the model, input_fn preprocesses the request data, predict_fn runs the prediction, and output_fn postprocesses the result. The handler is wrapped in a HandlerService, and an entrypoint script starts the MMS server with that service, which keeps model serving flexible and framework-agnostic.
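The four-function flow described above can be sketched in plain Python. This is a minimal illustration, not the toolkit's implementation: it imports nothing from sagemaker-inference, the linear "model" is a made-up stand-in, and handle_request is a hypothetical helper that only shows the order in which the functions are chained.

```python
import json


def model_fn(model_dir):
    """Load the model. Stand-in: returns fixed weights and never reads model_dir."""
    return {"weights": [2.0, -1.0], "bias": 0.5}


def input_fn(input_data, content_type):
    """Preprocess: turn the raw request body into Python data."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(input_data)


def predict_fn(data, model):
    """Predict: apply the stand-in linear model to each input row."""
    weights, bias = model["weights"], model["bias"]
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in data]


def output_fn(prediction, accept):
    """Postprocess: serialize the prediction for the response."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction)


def handle_request(body, content_type, accept, model):
    """Hypothetical helper: chain input_fn -> predict_fn -> output_fn."""
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)


# The path is illustrative only; the stand-in model_fn ignores it.
model = model_fn("/opt/ml/model")
response = handle_request(
    "[[1.0, 2.0], [3.0, 0.0]]", "application/json", "application/json", model
)
print(response)
```

Loading the model once and reusing it across requests mirrors the split between model_fn and the per-request functions; in a real container the serving stack, not a helper like this, would drive the calls.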

Quick Start & Requirements

Install both packages with pip inside the Dockerfile: pip install multi-model-server sagemaker-inference
Requires Docker to build and run the container image.
The project documentation includes an example Dockerfile and example inference handler implementations.

Highlighted Details

Built on Multi Model Server (MMS) for robust model serving.
Supports custom inference logic for various ML frameworks.
Enables deployment to SageMaker endpoints with custom containers.
Provides default handlers for common data formats (JSON, CSV, NPZ).
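To make the idea of format-specific handling concrete, here is a small standard-library sketch of content-type dispatch for JSON and CSV. It is an assumption-laden illustration rather than the toolkit's code: the decode and encode functions below are hypothetical stand-ins, and NPZ is left out because it would require NumPy.

```python
import csv
import io
import json

JSON = "application/json"
CSV = "text/csv"


def decode(payload, content_type):
    """Hypothetical stand-in: parse a request body according to its content type."""
    if content_type == JSON:
        return json.loads(payload)
    if content_type == CSV:
        reader = csv.reader(io.StringIO(payload))
        # Skip empty lines and convert every cell to float.
        return [[float(cell) for cell in row] for row in reader if row]
    raise ValueError(f"Unsupported content type: {content_type}")


def encode(rows, accept):
    """Hypothetical stand-in: serialize rows according to the accept type."""
    if accept == JSON:
        return json.dumps(rows)
    if accept == CSV:
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"Unsupported accept type: {accept}")


rows = decode("1,2\n3,4\n", CSV)
print(rows)
print(encode(rows, JSON))
```

Keeping the dispatch in one place means a custom input_fn or output_fn only has to pass the content type through, and unsupported types fail with a clear error.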

Maintenance & Community

Developed by Amazon Web Services (AWS).
Contribution guidelines are available for community involvement.

Licensing & Compatibility

Licensed under the Apache 2.0 License.
Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The provided PyTorch example handler explicitly raises NotImplementedError for model_fn, requiring users to implement custom model loading logic for PyTorch models.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days