sagemaker-inference-toolkit  by aws

SDK for serving ML models in Docker containers via SageMaker

Created 6 years ago
407 stars

Top 71.6% on SourcePulse

Project Summary

This toolkit provides a standardized way to serve machine learning models within Docker containers for deployment on Amazon SageMaker. It targets ML engineers and data scientists who need to package custom inference logic for SageMaker endpoints, simplifying the deployment process and ensuring consistent runtime environments.

How It Works

The toolkit uses Multi Model Server (MMS) as its serving stack. Users define custom inference handlers that implement model_fn, input_fn, predict_fn, and output_fn, which cover model loading, data preprocessing, prediction, and postprocessing respectively. The handlers are wrapped in a HandlerService, and an entrypoint script starts the MMS server with that service, so the same serving setup works regardless of ML framework.
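As a rough illustration of how the four handler functions chain together, the sketch below mimics the request flow in plain Python without importing the toolkit. The function names come from the text above; the signatures, the handle_request helper, and the toy model that sums its inputs are assumptions made for illustration, not the toolkit's actual API.

```python
import json


def model_fn(model_dir):
    """Load the model. Here: a toy 'model' that sums each input row (assumption)."""
    return lambda rows: [sum(row) for row in rows]


def input_fn(input_data, content_type):
    """Deserialize the request body into Python objects."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(input_data)


def predict_fn(data, model):
    """Run the model on the deserialized input."""
    return model(data)


def output_fn(prediction, accept):
    """Serialize the prediction for the response."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps(prediction)


def handle_request(body, content_type, accept, model_dir="/opt/ml/model"):
    """Hypothetical helper that chains the four steps in order."""
    model = model_fn(model_dir)  # a real server would load the model once at startup
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)


response = handle_request("[[1, 2], [3, 4]]", "application/json", "application/json")
```

Keeping each stage as a separate function means a different framework usually only requires changing model_fn and predict_fn, while the serialization steps stay the same.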

Quick Start & Requirements

  • Install inside a Dockerfile with pip install multi-model-server sagemaker-inference.
  • Requires a Docker environment.
  • Example Dockerfile and inference handler implementations are provided.

Highlighted Details

  • Built on Multi Model Server (MMS) for robust model serving.
  • Supports custom inference logic for various ML frameworks.
  • Enables deployment to SageMaker endpoints with custom containers.
  • Provides default handlers for common data formats (JSON, CSV, NPZ).
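To make the idea of format-specific default handlers concrete, here is a minimal sketch of content-type dispatch using only the Python standard library. The decode function and the DECODERS table are hypothetical names, not the toolkit's own decoder module, and NPZ is left out because it would require NumPy.

```python
import csv
import io
import json


def decode_json(body):
    """Parse a JSON request body."""
    return json.loads(body)


def decode_csv(body):
    """Parse a CSV request body into rows of floats, skipping empty lines."""
    reader = csv.reader(io.StringIO(body))
    return [[float(cell) for cell in row] for row in reader if row]


# Hypothetical lookup table mapping content types to decoder functions.
DECODERS = {
    "application/json": decode_json,
    "text/csv": decode_csv,
}


def decode(body, content_type):
    """Pick a decoder by content type, rejecting unknown types."""
    try:
        decoder = DECODERS[content_type]
    except KeyError:
        raise ValueError(f"Unsupported content type: {content_type}") from None
    return decoder(body)


rows_from_json = decode("[[1.0, 2.0]]", "application/json")
rows_from_csv = decode("1.0,2.0\n3.5,4.5\n", "text/csv")
```

A lookup table like this keeps the supported formats in one place, so adding a format means registering one more function.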

Maintenance & Community

  • Developed by Amazon Web Services (AWS).
  • Contribution guidelines are available for community involvement.

Licensing & Compatibility

  • Licensed under the Apache 2.0 License.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The provided PyTorch example handler explicitly raises NotImplementedError for model_fn, requiring users to implement custom model loading logic for PyTorch models.
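The pattern described here can be sketched without PyTorch: a base handler that refuses to load a model, and a subclass that supplies the loading logic. The class names and the weights.json artifact are hypothetical and chosen only to keep the example runnable with the standard library.

```python
import json
import os
import tempfile


class ExampleHandler:
    """Stand-in for an example handler that leaves model loading to the user."""

    def model_fn(self, model_dir):
        raise NotImplementedError("Implement model_fn to load your model.")


class MyHandler(ExampleHandler):
    """Hypothetical subclass that supplies the missing loading logic."""

    def model_fn(self, model_dir):
        # Assumption: the model artifact is a JSON file named weights.json.
        with open(os.path.join(model_dir, "weights.json")) as f:
            return json.load(f)


# Write a toy artifact to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "weights.json"), "w") as f:
        json.dump({"bias": 0.5}, f)
    model = MyHandler().model_fn(model_dir)
```

For an actual PyTorch model, the body of model_fn would instead call a PyTorch loader such as torch.jit.load on the saved artifact; the file name and format depend on how the model was exported.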

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
