tensorchord/modelz-llm: Inference server for open-source LLMs, offering an OpenAI-compatible API
Modelz LLM provides an OpenAI-compatible API for serving various open-source large language models, including LLaMA, Vicuna, and ChatGLM. It targets developers and researchers who want to easily deploy and interact with these models in local or cloud environments using familiar tools like the OpenAI Python SDK or LangChain.
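Because the API mirrors OpenAI's, existing clients can simply point at the local server. Below is a minimal sketch using the legacy LangChain OpenAI wrapper; the base URL (http://localhost:8000) and the dummy API key are assumptions, and the model name must match whatever the server was launched with:

```python
from langchain.llms import OpenAI  # legacy LangChain wrapper (pre-1.0 style)

# Assumed local endpoint; Modelz LLM does not require a real OpenAI key.
llm = OpenAI(
    openai_api_base="http://localhost:8000",
    openai_api_key="not-used",
    model_name="bigscience/bloomz-560m",
)

print(llm("Explain what an inference server does, in one sentence."))
```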
How It Works
Modelz LLM acts as an inference server, abstracting the complexities of loading and running different LLMs. It leverages the Mosec inference engine and FastChat for prompt generation, offering a unified interface to diverse models. This approach allows users to switch between models seamlessly without altering their application code, benefiting from a consistent API.
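The model-swap claim is easiest to see from the client side: the request shape never changes, only the model being served. A minimal sketch with the legacy openai Python SDK (pre-1.0); the server address and placeholder key are assumptions:

```python
import openai

# Point the legacy OpenAI SDK (pre-1.0) at the local server (assumed address);
# the API key is a placeholder that the server does not validate.
openai.api_base = "http://localhost:8000"
openai.api_key = "placeholder"

def ask(prompt: str, model: str) -> str:
    # The identical call works regardless of which open-source model is hosted.
    resp = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

# Relaunching the server with a different model (e.g. a Vicuna checkpoint)
# requires no client change beyond the model name passed here.
print(ask("What is an inference server?", "bigscience/bloomz-560m"))
```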
Quick Start & Requirements
Install from PyPI, or from source with GPU support, then launch the server:

```bash
pip install modelz-llm
# or, for GPU support:
pip install git+https://github.com/tensorchord/modelz-llm.git[gpu]

modelz-llm -m bigscience/bloomz-560m --device cpu
```
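Once the server is running, a quick smoke test against the embeddings endpoint confirms it is reachable. A hedged sketch, again assuming the legacy openai SDK and a local server on port 8000:

```python
import openai

openai.api_base = "http://localhost:8000"  # assumed default address
openai.api_key = "placeholder"             # not validated by the server

# Smoke test: request an embedding and print its dimensionality.
emb = openai.Embedding.create(
    model="bigscience/bloomz-560m",
    input="hello world",
)
print(len(emb["data"][0]["embedding"]))
```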
Highlighted Details

Exposes the OpenAI-compatible endpoints /completions, /chat/completions, and /embeddings.

Maintenance & Community

The last recorded activity was about two years ago, and the project is inactive.
Licensing & Compatibility

The specific license is not detailed in the repository, which may impact commercial use.
Limitations & Caveats
The license is unspecified (see Licensing & Compatibility above). The README lists recommended GPUs for specific models, implying that performance, or even basic functionality, may be constrained on less powerful hardware.