SDK for serving ML models in Docker containers via SageMaker
This toolkit provides a standardized way to serve machine learning models within Docker containers for deployment on Amazon SageMaker. It targets ML engineers and data scientists who need to package custom inference logic for SageMaker endpoints, simplifying the deployment process and ensuring consistent runtime environments.
How It Works
The toolkit leverages the Multi Model Server (MMS) as its core serving stack. Users define custom inference handlers (implementing `model_fn`, `input_fn`, `predict_fn`, and `output_fn`) to manage model loading, data preprocessing, prediction, and postprocessing. These handlers are then integrated into a `HandlerService` and an entrypoint script that starts the MMS server, enabling flexible and framework-agnostic model serving.
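As a rough sketch of that wiring (the class names, the raised errors, and the assumption that the model object is callable are illustrative, not mandated by the toolkit), a handler subclasses `DefaultInferenceHandler` and is bound to MMS through a `HandlerService`:

```python
from sagemaker_inference import decoder, default_inference_handler, encoder
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference.transformer import Transformer


class InferenceHandler(default_inference_handler.DefaultInferenceHandler):
    """Illustrative handler; the model format and framework are assumptions."""

    def default_model_fn(self, model_dir):
        # Load and return the model artifact from model_dir (framework-specific).
        raise NotImplementedError("load your model here")

    def default_input_fn(self, input_data, content_type):
        # Deserialize the request payload into a prediction-ready object.
        return decoder.decode(input_data, content_type)

    def default_predict_fn(self, data, model):
        # Run inference; assumes the loaded model object is callable.
        return model(data)

    def default_output_fn(self, prediction, accept):
        # Serialize the prediction into the requested response content type.
        return encoder.encode(prediction, accept)


class HandlerService(DefaultHandlerService):
    """Binds the handler to MMS via the toolkit's Transformer."""

    def __init__(self):
        transformer = Transformer(default_inference_handler=InferenceHandler())
        super(HandlerService, self).__init__(transformer=transformer)
```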
Quick Start & Requirements
Install the Multi Model Server and the toolkit with `pip install multi-model-server sagemaker-inference` within a Dockerfile.
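A minimal entrypoint script then starts the server; assuming the `HandlerService` above lives in a module named `my_container.handler_service` (the module path is illustrative):

```python
# serve.py -- container entrypoint script
from sagemaker_inference import model_server

# handler_service is the dotted path to the module defining HandlerService.
model_server.start_model_server(handler_service="my_container.handler_service")
```

The Dockerfile would copy this script into the image and invoke it as the container's entrypoint, so that SageMaker's `serve` command starts MMS with the custom handlers.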
Limitations & Caveats
The provided PyTorch example handler explicitly raises `NotImplementedError` for `model_fn`, requiring users to implement custom model loading logic for PyTorch models.
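One hedged way to fill that gap, assuming the model artifact is a TorchScript file named `model.pt` inside `model_dir` (both the filename and the format are assumptions about how the model was packaged), is:

```python
import os

import torch


def model_fn(model_dir):
    # Assumes a TorchScript artifact named model.pt; adjust to your packaging.
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model
```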