ML model serving framework for efficient cloud deployment
Mosec is a high-performance, flexible model serving framework designed for building ML-enabled backend services. It targets ML engineers and researchers who need to efficiently deploy trained models as APIs, offering dynamic batching and CPU/GPU pipeline support to maximize hardware utilization.
How It Works
Mosec leverages Rust for its web layer and task coordination, ensuring high performance and efficient CPU utilization via async I/O. It supports dynamic batching to aggregate requests for batched inference and allows for pipelined stages using multiple processes to handle mixed CPU/GPU/IO workloads. The framework is cloud-friendly, featuring model warmup, graceful shutdown, and Prometheus monitoring metrics, making it easily manageable by container orchestration systems like Kubernetes.
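To make the dynamic batching idea concrete, here is a minimal standard-library sketch of the general technique. It is not Mosec's implementation, which lives in its Rust layer. The names collect_batch, batched_square, demo, and max_wait_s are hypothetical and exist only for this illustration, and the wait window here is assumed to be in seconds.

```python
import queue
import time
from typing import List


def collect_batch(requests: "queue.Queue[int]", max_batch_size: int, max_wait_s: float) -> List[int]:
    """Block for the first request, then gather more until the batch is
    full or the wait window (in seconds) has elapsed."""
    batch = [requests.get()]  # wait indefinitely for the first request
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # wait window expired: send a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no further request arrived in time
    return batch


def batched_square(batch: List[int]) -> List[int]:
    """Stand-in for a model's batched forward pass (hypothetical)."""
    return [x * x for x in batch]


def demo():
    """Enqueue five requests, then serve them in batches of at most four."""
    requests: "queue.Queue[int]" = queue.Queue()
    for i in range(5):
        requests.put(i)
    first = collect_batch(requests, max_batch_size=4, max_wait_s=0.05)
    second = collect_batch(requests, max_batch_size=4, max_wait_s=0.05)
    return batched_square(first), batched_square(second)


if __name__ == "__main__":
    print(demo())
```

The first batch fills to its size limit immediately, while the second is released once the wait window expires with only one request pending. A larger batch size favors throughput, and a shorter wait window favors latency.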
Quick Start & Requirements
Install with pip install -U mosec
or with conda install conda-forge::mosec
Highlighted Details
Dynamic batching is tuned with max_batch_size and max_wait_time.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats