Proxy server for load balancing and securing Ollama instances
This project provides a lightweight, secure proxy server for managing multiple Ollama instances, aimed at developers who need to scale their LLM deployments with load balancing and added security. It offers improved responsiveness and centralized management for distributed Ollama backends.
How It Works
The proxy server employs a load-balancing strategy, routing incoming requests to the backend Ollama instance with the fewest active connections. It implements bearer token authentication for security and utilizes asynchronous logging to a CSV file without impacting performance. Connection pooling and proper forwarding of streaming responses are key architectural choices for efficient and responsive LLM interactions.
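The sketch below illustrates the routing idea in Python: pick the backend with the fewest active connections and accept only requests carrying a known bearer token. Every name in it (AUTHORIZED_KEYS, pick_backend, the example backend URLs) is hypothetical and is not taken from the project's source.

# Minimal illustration of least-connections routing with a bearer-token check.
# All names and values here are assumptions for illustration only.
import threading

AUTHORIZED_KEYS = {"alice:a1b2c3d4"}  # in practice loaded from authorized_users.txt
BACKENDS = ["http://localhost:11434", "http://192.168.1.15:11434"]

active = {url: 0 for url in BACKENDS}  # active request count per backend
lock = threading.Lock()

def is_authorized(auth_header: str) -> bool:
    # Accept requests whose "Authorization: Bearer <token>" matches a known key.
    if not auth_header.startswith("Bearer "):
        return False
    return auth_header[len("Bearer "):] in AUTHORIZED_KEYS

def pick_backend() -> str:
    # Route to the backend with the fewest active connections.
    with lock:
        url = min(active, key=active.get)
        active[url] += 1
        return url

def release_backend(url: str) -> None:
    with lock:
        active[url] -= 1

# Inside a request handler the flow would be roughly:
#   reject with 403 unless is_authorized(request's Authorization header)
#   backend = pick_backend()
#   try: forward the (possibly streaming) request and relay the response
#   finally: release_backend(backend)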
Quick Start & Requirements
Install from source:

git clone https://github.com/ParisNeo/ollama_proxy_server.git
cd ollama_proxy_server
pip install -r requirements.txt
pip install .

Or build and run with Docker:

docker build -t ollama_proxy_server .
docker run -p 8080:8080 -v $(pwd)/config.ini:/app/config.ini -v $(pwd)/authorized_users.txt:/app/authorized_users.txt ollama_proxy_server

Edit config.ini to list the backend Ollama URLs and authorized_users.txt to list user credentials, then start the proxy:

python main.py --config config.ini --users_list authorized_users.txt

Add a user with:

python add_user.py <username> <key>
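For illustration only, config.ini is typically an INI file with one section per backend and authorized_users.txt a plain list of user:key pairs; the section and field names below are assumptions, so check the sample files in the repository:

[DefaultServer]
url = http://localhost:11434
queue_size = 5

[SecondaryServer]
url = http://192.168.1.15:11434
queue_size = 5

authorized_users.txt (one entry per line):

alice:a1b2c3d4

Clients then point their Ollama API calls at the proxy (port 8080 in the Docker example above) and pass their key as the bearer token in the Authorization header.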
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is not yet published on PyPI, and the CONTRIBUTING.md file is noted as "to be added."