Framework for serving AI apps and models
Top 6.7% on sourcepulse
BentoML is a Python framework for building and serving AI applications, designed to simplify the creation of REST APIs for any machine learning model. It targets AI/ML engineers and developers who need to deploy models efficiently, offering features like automatic Docker containerization, dependency management, and optimized inference serving.
How It Works
BentoML abstracts the complexities of model serving by letting users define inference logic in Python classes and methods decorated with @bentoml.service and @bentoml.api. It automatically handles dependency packaging, environment replication, and API server generation. Key optimizations include dynamic batching, model parallelism, and multi-model orchestration, aiming to maximize hardware utilization for high-performance inference.
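Conceptually, the decorators turn an ordinary class into a routable service by tagging which methods should become endpoints. The toy stand-in below illustrates that registration pattern only; it is not BentoML's actual implementation, and the Echo/predict names are invented for the example.

```python
# Toy sketch of decorator-driven service registration (illustrative only,
# not BentoML internals): @api tags a method as an endpoint, @service
# collects all tagged methods into a route table on the class.
from typing import Callable, Dict

def api(fn: Callable) -> Callable:
    fn._is_api = True  # mark this method as an exposed endpoint
    return fn

def service(cls):
    # Build a route table from every method tagged by @api.
    cls._routes: Dict[str, Callable] = {
        name: fn for name, fn in vars(cls).items()
        if getattr(fn, "_is_api", False)
    }
    return cls

@service
class Echo:
    @api
    def predict(self, text: str) -> str:
        return text.upper()

svc = Echo()
print(sorted(svc._routes))                # ['predict']
print(svc._routes["predict"](svc, "hi"))  # HI
```

In the real framework, a generated API server would dispatch incoming HTTP requests to the registered methods instead of calling them directly.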
Quick Start & Requirements
- Install: pip install -U bentoml
- Model dependencies (e.g. torch, transformers) are specified per service.
- Serve locally with bentoml serve.
- Deploy by running bentoml build, then bentoml containerize and docker run.

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The framework collects anonymous usage data by default, which users can opt out of. While it supports many frameworks, specific model or runtime integrations might require custom configurations or additional dependencies.