Open-source API server for text completion
Basaran provides an open-source, OpenAI-compatible API for serving Hugging Face Transformers text generation models, so developers can replace proprietary LLM services with self-hosted open-source alternatives without modifying their code. It targets developers and researchers who want to use the latest open-source LLMs in their applications through a familiar API interface with streaming support.
How It Works
Basaran acts as middleware, translating OpenAI API requests into Hugging Face Transformers model calls. It supports various decoding strategies, handles both decoder-only and encoder-decoder architectures, and includes a robust detokenizer. Its key advantages are OpenAI API compatibility, which allows existing tools and libraries to be used unchanged, and support for multi-GPU deployment and quantization for performance optimization.
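To make the request translation concrete, here is a minimal sketch of a client call in Python using only the standard library. It assumes the server exposes completions at the OpenAI-style path /v1/completions and returns the usual OpenAI response shape with a choices list; the base URL, the helper names build_completion_request and complete, and the parameter values are illustrative assumptions, not part of Basaran itself.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, max_tokens=32):
    """Build (but do not send) an OpenAI-style completion request.

    Assumption: the server accepts POST requests at /v1/completions,
    mirroring the OpenAI completions API.
    """
    payload = {
        # OpenAI clients require "model"; per the limitations noted in this
        # document, the server does not use it, so any string will do.
        "model": "placeholder",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the first generated text.

    Assumption: the response follows the OpenAI completion schema,
    i.e. {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running locally, a call such as complete("http://localhost:80", "Once upon a time") would return the generated continuation. Splitting request construction from sending keeps the payload easy to inspect without a live server.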
Quick Start & Requirements
docker run -p 80:80 -e MODEL=user/repo hyperonym/basaran:X.Y.Z
(replace X.Y.Z with the latest version).
Highlighted Details
Maintenance & Community
The project is open-source, with contributions welcomed via issues. Further details on contributing are available in CONTRIBUTING.md.
Licensing & Compatibility
Limitations & Caveats
Basaran currently does not support the model parameter in completions requests. OpenAI clients still require it, but any string will work. The project notes that a chat API is difficult to unify because chat history formats vary by model, and recommends pre-formatting prompts for the completion API instead.
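The pre-formatting approach can be sketched as follows. This is only an illustration: the "System:/User:/Assistant:" template, the role labels, and the helper names format_chat_prompt and chat_as_completion_payload are hypothetical, and a real deployment should use whatever chat format the chosen model was trained on.

```python
def format_chat_prompt(messages, assistant_label="Assistant"):
    """Flatten a chat history into one prompt string.

    Assumption: a plain "Role: text" template, one turn per line.
    Each model family defines its own format, so adjust as needed.
    """
    labels = {"system": "System", "user": "User", "assistant": assistant_label}
    lines = [f"{labels[m['role']]}: {m['content']}" for m in messages]
    # End with an open assistant turn so the model continues from there.
    lines.append(f"{assistant_label}:")
    return "\n".join(lines)


def chat_as_completion_payload(messages, max_tokens=64):
    """Wrap the flattened chat in a completion-style request body."""
    return {
        "model": "unused",  # required by OpenAI clients; any string works
        "prompt": format_chat_prompt(messages),
        "max_tokens": max_tokens,
    }
```

The resulting dictionary can be sent as the JSON body of an ordinary completion request, which keeps the model-specific formatting on the client side.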