multi-model-server by awslabs

CLI tool for serving deep learning models from any ML/DL framework

Created 8 years ago
1,016 stars

Top 36.8% on SourcePulse

Project Summary

Multi Model Server (MMS) is a tool for serving deep learning models trained with any framework. It exposes HTTP endpoints for inference requests and is aimed at ML engineers and researchers who need a flexible, easy-to-use inference server that simplifies deployment and scaling.

How It Works

MMS uses a worker-based architecture in which each worker handles model inference. It scales the number of workers automatically to match the available CPU or GPU resources. Models are packaged as .mar archives that bundle the model artifacts with the inference logic, so a model can be distributed and deployed as a single file.
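
As a rough illustration of the "inference logic" half of an archive, the sketch below shows a handler module shaped like the custom service template described in the project's documentation: a module-level handle(data, context) entry point that lazily initializes a service object and returns one result per request in the batch. The class name EchoLengthService and its behavior (reporting payload size instead of running a model) are hypothetical, and the "body"/"data" request keys are an assumption to check against the project docs.

```python
"""Minimal custom-service sketch for a .mar archive (illustrative only)."""


class EchoLengthService:
    """Hypothetical handler: reports the payload size of each request."""

    def __init__(self):
        self.initialized = False

    def initialize(self, context):
        # A real handler would load the model artifacts here, typically from
        # the directory into which the server unpacked the archive.
        self.initialized = True

    def handle(self, batch, context):
        # The server is assumed to pass a batch (list) of request dicts;
        # return one result per request, in the same order.
        results = []
        for request in batch:
            payload = request.get("body") or request.get("data") or b""
            results.append({"num_bytes": len(payload)})
        return results


_service = EchoLengthService()


def handle(data, context):
    """Module-level entry point a worker would call for each batch."""
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)
```

Keeping model loading in initialize means each worker pays the loading cost once rather than on every request.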

Quick Start & Requirements

  • Install: pip install multi-model-server
  • Prerequisites: Ubuntu, CentOS, or macOS; Python; pip; Java 8. MXNet is not bundled and must be installed separately (CPU: mxnet-mkl, GPU: mxnet-cu92mkl).
  • Example: multi-model-server --start --models squeezenet=https://s3.amazonaws.com/model-server/model_archive_1.0/squeezenet_v1.1.mar
  • Docs: https://github.com/awslabs/multi-model-server/tree/master/docs
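
Once the server from the example above is running, a client can send inference requests over HTTP. The following is a minimal Python sketch, assuming the inference API listens on http://127.0.0.1:8080 and exposes POST /predictions/{model_name}, as described in the project's documentation. The function names, the application/octet-stream content type, and the assumption that the response body is JSON are illustrative choices, not part of MMS.

```python
import json
import urllib.request

# Assumed default inference address; adjust if the server is configured differently.
DEFAULT_INFERENCE_URL = "http://127.0.0.1:8080"


def build_prediction_request(model_name, payload, base_url=DEFAULT_INFERENCE_URL):
    """Build (but do not send) a POST request to the model's prediction endpoint."""
    url = f"{base_url}/predictions/{model_name}"
    # Send the raw bytes as an opaque body rather than as form data.
    headers = {"Content-Type": "application/octet-stream"}
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def predict(model_name, payload, base_url=DEFAULT_INFERENCE_URL, timeout=10.0):
    """Send the payload and decode the response as JSON (needs a running server)."""
    request = build_prediction_request(model_name, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the squeezenet model loaded, reading an image file in binary mode and passing its bytes to predict("squeezenet", image_bytes) would send that image for classification.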

Highlighted Details

  • Supports models from any ML/DL framework.
  • Automatic scaling of workers to match CPU/GPU resources.
  • Model packaging into .mar archives for easy deployment.
  • Includes Dockerfiles for production deployments.
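
Alongside automatic scaling, the worker count for a model can also be adjusted by hand. The sketch below builds such a request in Python, assuming the management API listens on http://127.0.0.1:8081 and accepts PUT /models/{model_name} with a min_worker query parameter; both details are recalled from the project's management API documentation and should be verified there. The function name build_scale_request is hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed default management address; verify against the server configuration.
DEFAULT_MANAGEMENT_URL = "http://127.0.0.1:8081"


def build_scale_request(model_name, min_worker, base_url=DEFAULT_MANAGEMENT_URL):
    """Build (but do not send) a PUT request asking for at least min_worker workers."""
    if min_worker < 0:
        raise ValueError("min_worker must be non-negative")
    query = urllib.parse.urlencode({"min_worker": min_worker})
    url = f"{base_url}/models/{urllib.parse.quote(model_name)}?{query}"
    return urllib.request.Request(url, method="PUT")
```

Passing the returned object to urllib.request.urlopen would send the request to a running server.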

Maintenance & Community

  • A Slack channel is available for community discussion.
  • Contributions via GitHub issues and pull requests are welcome.

Licensing & Compatibility

  • License: Apache License 2.0.
  • Compatibility: Suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

MMS does not provide built-in authentication or throttling, and SSL is not enabled by default, so production deployments need external measures such as an authentication proxy and a firewall. By default the server accepts connections from localhost only. Windows support is experimental.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
