inference  by xorbitsai

Model serving library for language, speech, and multimodal models

created 2 years ago
8,311 stars

Top 6.3% on sourcepulse

GitHubView on GitHub
Project Summary

Xorbits Inference (Xinference) is a versatile library for serving large language, speech recognition, and multimodal models, enabling developers to easily integrate various open-source AI models into their applications. It aims to simplify model deployment and inference, offering flexibility for researchers, developers, and data scientists to utilize cutting-edge AI without vendor lock-in.

How It Works

Xinference provides a unified serving layer that supports multiple inference engines (like vLLM, GGML, TensorRT, MLX) and heterogeneous hardware utilization (CPU, GPU, Apple Silicon). It exposes an OpenAI-compatible RESTful API, along with RPC, CLI, and WebUI interfaces, facilitating seamless integration with other tools and frameworks like LangChain and LlamaIndex. Its distributed deployment capabilities allow models to run across multiple devices or machines.

Quick Start & Requirements

Highlighted Details

  • Supports a wide range of model types: LLMs, speech recognition, multimodal, and text embedding.
  • Offers OpenAI-compatible API with Function Calling support.
  • Features distributed deployment across multiple nodes and heterogeneous hardware utilization.
  • Integrates with popular AI frameworks and platforms like LangChain, LlamaIndex, Dify, and Chatbox.

Maintenance & Community

The project is actively maintained with recent updates and contributions. Community engagement is encouraged via Discord and Twitter.

Licensing & Compatibility

The project is licensed under the Apache License 2.0, which permits commercial use and integration with closed-source applications.

Limitations & Caveats

While Xinference supports numerous backends and platforms, specific engine performance and compatibility may vary. The project is under active development, and some features might be experimental.

Health Check
Last commit

1 day ago

Responsiveness

1 day

Pull Requests (30d)
46
Issues (30d)
147
Star History
650 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.