gpt-load by tbphp

Proxy server for OpenAI-compatible APIs

Created 7 months ago

5,728 stars

Top 8.7% on SourcePulse

Project Summary

This project provides a high-performance, Go-based proxy server for OpenAI-compatible APIs, designed to manage multiple API keys and distribute requests across them. It targets developers and teams needing to scale their AI model usage, offering features like automatic key rotation, intelligent blacklisting, and load balancing to ensure continuous availability and efficient resource utilization.

How It Works

GPT-Load acts as an intermediary, receiving requests and forwarding them to upstream API endpoints. It employs a round-robin strategy for distributing requests across configured API keys and multiple target URLs. The proxy intelligently blacklists keys that return persistent errors, distinguishing them from temporary issues to optimize key usage. It utilizes Go's concurrency primitives, zero-copy streaming, and atomic operations for high throughput and memory efficiency.
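The key rotation and blacklisting described above can be sketched in Go roughly as follows. This is a minimal illustration, not GPT-Load's actual code: the names `KeyPool`, `Next`, `Report`, and `isPermanent` are hypothetical, and the rule that HTTP 401 and 403 count as permanent failures (while 429 and 5xx are treated as temporary) is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// ErrNoKeys is returned when every key is blacklisted or none are configured.
var ErrNoKeys = errors.New("no usable API keys")

// KeyPool is a hypothetical type for illustration; it is not gpt-load's API.
type KeyPool struct {
	keys      []string
	next      atomic.Uint64 // round-robin cursor, advanced without a lock
	threshold int           // permanent failures allowed before blacklisting
	mu        sync.RWMutex
	failures  map[string]int
}

func NewKeyPool(keys []string, threshold int) *KeyPool {
	return &KeyPool{keys: keys, threshold: threshold, failures: make(map[string]int)}
}

// isPermanent encodes an assumed policy: auth errors will not fix themselves,
// while rate limits and server errors are expected to clear.
func isPermanent(status int) bool {
	return status == 401 || status == 403
}

// Next returns the next non-blacklisted key in round-robin order.
// It tries each key at most once per call.
func (p *KeyPool) Next() (string, error) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	for range p.keys {
		i := p.next.Add(1) - 1
		k := p.keys[i%uint64(len(p.keys))]
		if p.failures[k] < p.threshold {
			return k, nil
		}
	}
	return "", ErrNoKeys
}

// Report records the upstream status code seen for a key.
// Only permanent errors count toward the blacklist threshold.
func (p *KeyPool) Report(key string, status int) {
	if !isPermanent(status) {
		return
	}
	p.mu.Lock()
	p.failures[key]++
	p.mu.Unlock()
}

func main() {
	pool := NewKeyPool([]string{"key-a", "key-b", "key-c"}, 2)
	pool.Report("key-b", 401)
	pool.Report("key-b", 401) // second permanent failure blacklists key-b
	pool.Report("key-c", 429) // temporary error, not counted
	for i := 0; i < 4; i++ {
		k, err := pool.Next()
		fmt.Println(k, err)
	}
}
```

The atomic counter keeps the hot path cheap under concurrent requests, and the read-write mutex is only contended when a failure is recorded.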

Quick Start & Requirements

Docker (Recommended):

docker pull ghcr.io/tbphp/gpt-load:latest
echo "sk-your-api-key" > keys.txt
docker run -d -p 3000:3000 -v "$(pwd)/keys.txt:/app/keys.txt:ro" --name gpt-load ghcr.io/tbphp/gpt-load:latest

Prerequisites: Go 1.21+ (for building from source), Docker.
Setup: The Docker route needs only the three commands above and a keys.txt file.
Docs: GPT-Load Chinese documentation | English

Highlighted Details

Supports OpenAI, Azure OpenAI, Anthropic Claude, and any OpenAI-compatible API.
Intelligent blacklisting distinguishes permanent vs. temporary errors.
Real-time monitoring endpoints for health, stats, and blacklist status.
Zero-copy streaming and atomic operations for high performance.
Optional project-level Bearer token authentication.

Maintenance & Community

The project is actively maintained, with CI/CD pipelines defined in GitHub Actions. Community interaction channels are not explicitly mentioned in the README.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The README does not detail specific performance benchmarks or known limitations. While designed for production, advanced configurations or specific upstream API behaviors might require further testing.

Health Check

Last Commit: 1 month ago
Responsiveness: 1 day

Star History

134 stars in the last 30 days