mosec by mosecorg

ML model serving framework for efficient cloud deployment

Created 4 years ago
859 stars

Top 41.7% on SourcePulse

View on GitHub
Project Summary

Mosec is a high-performance, flexible model serving framework designed for building ML-enabled backend services. It targets ML engineers and researchers who need to efficiently deploy trained models as APIs, offering dynamic batching and CPU/GPU pipeline support to maximize hardware utilization.

How It Works

Mosec leverages Rust for its web layer and task coordination, ensuring high performance and efficient CPU utilization via async I/O. It supports dynamic batching to aggregate requests for batched inference and allows for pipelined stages using multiple processes to handle mixed CPU/GPU/IO workloads. The framework is cloud-friendly, featuring model warmup, graceful shutdown, and Prometheus monitoring metrics, making it easily manageable by container orchestration systems like Kubernetes.
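The dynamic batching described above can be sketched in plain Python (a simplified illustration, not mosec's actual Rust implementation): requests accumulate in a queue, and a batch is flushed either when `max_batch_size` is reached or when `max_wait_time` has elapsed since the first request arrived.

```python
import queue
import time

def collect_batch(req_queue, max_batch_size, max_wait_time):
    """Drain requests into one batch: flush when the batch is full
    or when max_wait_time (seconds) has passed since the first request."""
    batch = [req_queue.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_wait_time
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # waited long enough; serve a partial batch
        try:
            batch.append(req_queue.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

# Example: 5 queued requests with the batch size capped at 4.
q = queue.Queue()
for i in range(5):
    q.put(i)
print(collect_batch(q, max_batch_size=4, max_wait_time=0.01))  # → [0, 1, 2, 3]
```

The trade-off is visible in the two parameters: a larger `max_wait_time` yields fuller batches (better GPU utilization) at the cost of per-request latency.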

Quick Start & Requirements

  • Install via pip: pip install -U mosec or conda install conda-forge::mosec.
  • Requires Python 3.7+.
  • Building from source requires Rust.
  • See examples for detailed usage: https://mosecorg.github.io/mosec/
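Per the project docs, a service is defined by subclassing a worker class and implementing a `forward` method that receives one item, or a list of items when batching is enabled. The sketch below imitates that shape with a hypothetical stand-in base class (not mosec's real classes) so it runs without mosec installed; see the linked examples for the actual API.

```python
# Hypothetical stand-in for mosec's worker base class, used only to
# illustrate the forward() contract described in the docs.
class Worker:
    def forward(self, data):
        raise NotImplementedError

class Echo(Worker):
    def forward(self, data):
        # With batching enabled, data is a list; otherwise a single item.
        if isinstance(data, list):
            return [{"echo": d} for d in data]
        return {"echo": data}

worker = Echo()
print(worker.forward([{"x": 1}, {"x": 2}]))  # → [{'echo': {'x': 1}}, {'echo': {'x': 2}}]
```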

Highlighted Details

  • Dynamic batching with configurable max_batch_size and max_wait_time.
  • Supports multiple serialization formats (JSON, Msgpack) and custom mixins.
  • Enables multi-stage pipelines for complex workflows.
  • Includes Prometheus metrics for monitoring service health and performance.
  • Offers GPU offloading and customized GPU allocation.
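The serialization mixins mentioned above can be pictured as small classes that override how request bytes are decoded and response bytes are encoded. A simplified, stdlib-only sketch of that pattern (JSON only, since msgpack is a third-party package; method names are illustrative, not necessarily mosec's):

```python
import json

class JSONSerdeMixin:
    """Illustrative mixin: decode request bytes, encode response bytes."""
    def deserialize(self, data: bytes):
        return json.loads(data)

    def serialize(self, data) -> bytes:
        return json.dumps(data).encode("utf-8")

class Predictor(JSONSerdeMixin):
    def forward(self, data):
        # Trivial "model": score is the length of the input text.
        return {"score": len(data.get("text", ""))}

p = Predictor()
req = p.deserialize(b'{"text": "hello"}')
resp = p.serialize(p.forward(req))
print(resp)  # → b'{"score": 5}'
```

Swapping the mixin swaps the wire format without touching the inference logic.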

Maintenance & Community

  • Active development with contributions from multiple authors.
  • Community support available via Discord.
  • Used by companies like TencentCloud, Modelz, and TensorChord.

Licensing & Compatibility

  • The license is not explicitly stated in the README.

Limitations & Caveats

  • The README does not specify the license, which could be a blocker for commercial adoption.
  • For multi-stage services, passing extremely large data between stages via default serialization might slow down the pipeline.
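The inter-stage cost in the last point is easy to make concrete with stdlib pickle: a large payload must be fully serialized and copied at every process boundary (illustration only; mosec's actual inter-process format may differ).

```python
import pickle

# A large inter-stage payload, e.g. a million raw float features.
payload = [float(i) for i in range(1_000_000)]
blob = pickle.dumps(payload)
print(f"{len(blob) / 1e6:.1f} MB crosses each stage boundary")
```

Passing compact references (file paths, shared-memory handles) between stages instead of raw tensors is the usual mitigation.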
Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 5
  • Issues (30d): 2
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Carol Willing (Core Contributor to CPython, Jupyter), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 9 more.

dynamo by ai-dynamo

Top 1.0% · 5k stars
Inference framework for distributed generative AI model serving
Created 6 months ago · Updated 15 hours ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

serve by pytorch

Top 0.1% · 4k stars
Serve, optimize, and scale PyTorch models in production
Created 6 years ago · Updated 1 month ago