api-for-open-llm by xusenlinzy

OpenAI-compatible API for open LLMs

Created 2 years ago

2,461 stars

Top 18.5% on SourcePulse

View on GitHub

2 Experts Love This Project

Junyang Lin

Core Maintainer at Alibaba Qwen

Binyuan Hui

Research Scientist at Alibaba Qwen

Project Summary

This project provides a unified backend API for various open-source large language models (LLMs), mimicking the OpenAI ChatGPT API. It targets developers and users who want to integrate diverse LLMs into their applications seamlessly, offering a ChatGPT-like experience with features like streaming responses and text embeddings.

How It Works

The project acts as a compatibility layer, translating requests to an OpenAI-compatible format and forwarding them to different LLM backends. It supports multiple popular LLMs (LLaMA, LLaMA-2, BLOOM, Falcon, Qwen, ChatGLM, etc.) and can be configured via environment variables to switch between models. It also integrates with vLLM for accelerated inference and concurrent request handling, and supports loading custom LoRA models.

Quick Start & Requirements

Install/Run: Typically run via Docker or direct Python execution. Example: docker run -d -p 3000:3000 -e OPENAI_API_KEY="sk-xxxx" -e BASE_URL="http://192.168.0.xx:80" yidadaa/chatgpt-next-web (for a frontend example).
Prerequisites: Python, potentially CUDA for GPU acceleration (vLLM support implies GPU usage). Specific model requirements vary.
Links: Streamlit Demo, ChatGPT-Next-Web, Dify.

Highlighted Details

Supports a wide array of LLMs including LLaMA-3, Qwen2, GLM-4V, and Code Qwen.
Offers text embedding model support (e.g., bge-large-zh, m3e-large).
Compatible with LangChain for LLM application development.
Enables seamless integration with existing ChatGPT-compatible frontends by modifying the OPENAI_API_BASE environment variable.

Maintenance & Community

The project shows active development with frequent updates for new models and features. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

Licensed under Apache 2.0, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

While supporting many models, specific performance and compatibility may vary. The README does not detail hardware requirements for all supported models or provide explicit benchmarks.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

2 stars in the last 30 days