api-for-open-llm  by xusenlinzy

OpenAI-compatible API for open LLMs

created 2 years ago
2,452 stars

Top 19.3% on sourcepulse

GitHubView on GitHub
Project Summary

This project provides a unified backend API for various open-source large language models (LLMs), mimicking the OpenAI ChatGPT API. It targets developers and users who want to integrate diverse LLMs into their applications seamlessly, offering a ChatGPT-like experience with features like streaming responses and text embeddings.

How It Works

The project acts as a compatibility layer, translating requests to an OpenAI-compatible format and forwarding them to different LLM backends. It supports multiple popular LLMs (LLaMA, LLaMA-2, BLOOM, Falcon, Qwen, ChatGLM, etc.) and can be configured via environment variables to switch between models. It also integrates with vLLM for accelerated inference and concurrent request handling, and supports loading custom LoRA models.

Quick Start & Requirements

  • Install/Run: Typically run via Docker or direct Python execution. Example: docker run -d -p 3000:3000 -e OPENAI_API_KEY="sk-xxxx" -e BASE_URL="http://192.168.0.xx:80" yidadaa/chatgpt-next-web (for a frontend example).
  • Prerequisites: Python, potentially CUDA for GPU acceleration (vLLM support implies GPU usage). Specific model requirements vary.
  • Links: Streamlit Demo, ChatGPT-Next-Web, Dify.

Highlighted Details

  • Supports a wide array of LLMs including LLaMA-3, Qwen2, GLM-4V, and Code Qwen.
  • Offers text embedding model support (e.g., bge-large-zh, m3e-large).
  • Compatible with LangChain for LLM application development.
  • Enables seamless integration with existing ChatGPT-compatible frontends by modifying the OPENAI_API_BASE environment variable.

Maintenance & Community

The project shows active development with frequent updates for new models and features. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

Licensed under Apache 2.0, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

While supporting many models, specific performance and compatibility may vary. The README does not detail hardware requirements for all supported models or provide explicit benchmarks.

Health Check
Last commit

10 months ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
28 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.