ollama_proxy_server  by ParisNeo

Proxy server for load balancing and securing Ollama instances

created 1 year ago
469 stars

Top 65.7% on sourcepulse

Project Summary

This project provides a lightweight proxy server that sits in front of multiple Ollama instances and adds authentication and load balancing. It is aimed at developers and users who need to scale LLM deployments across several Ollama backends while managing access from a single entry point.

How It Works

The proxy routes each incoming request to the backend Ollama instance with the fewest active connections. It authenticates clients with bearer tokens and logs asynchronously to a CSV file so that logging does not slow request handling. It also pools backend connections and forwards streaming responses to the client, which keeps LLM interactions efficient and responsive.
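The "fewest active connections" rule can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's code: the Backend and LeastConnectionsBalancer names are hypothetical, and the two URLs simply assume Ollama instances on local ports.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One Ollama instance behind the proxy (hypothetical structure)."""
    url: str
    active: int = 0  # requests currently in flight


class LeastConnectionsBalancer:
    def __init__(self, urls):
        self.backends = [Backend(u) for u in urls]

    def acquire(self):
        # Pick the backend with the fewest in-flight requests;
        # ties go to the earliest entry in the list.
        backend = min(self.backends, key=lambda b: b.active)
        backend.active += 1
        return backend

    def release(self, backend):
        # Call once the response (including a streamed one) has finished.
        backend.active -= 1


# Assumed backend URLs, for illustration only.
balancer = LeastConnectionsBalancer(
    ["http://localhost:11434", "http://localhost:11435"]
)
first = balancer.acquire()   # both idle, so the first backend is chosen
second = balancer.acquire()  # the first is busy, so the second is chosen
balancer.release(first)
third = balancer.acquire()   # the first backend is idle again
```

The counters here are plain integers, which is enough inside a single event loop; a multi-threaded server would need a lock around acquire and release.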

Quick Start & Requirements

  • Install from Source:
    git clone https://github.com/ParisNeo/ollama_proxy_server.git
    cd ollama_proxy_server
    pip install -r requirements.txt
    pip install .
    
  • Prerequisites: Python 3.11 or higher.
  • Docker:
    docker build -t ollama_proxy_server .
    docker run -p 8080:8080 -v $(pwd)/config.ini:/app/config.ini -v $(pwd)/authorized_users.txt:/app/authorized_users.txt ollama_proxy_server
  • Configuration: Requires config.ini for backend URLs and authorized_users.txt for credentials.
  • Usage: Run with python main.py --config config.ini --users_list authorized_users.txt.
  • User Management: Use python add_user.py <username> <key>.
  • Docs: See the repository at https://github.com/ParisNeo/ollama_proxy_server.
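To make the credentials file and the bearer token check more concrete, here is a minimal sketch. It assumes that authorized_users.txt holds one user:key pair per line and that clients send "Authorization: Bearer user:key"; the load_users and authenticate helpers and the sample users are hypothetical and are not taken from the project.

```python
import hmac


def load_users(text):
    """Parse 'user:key' lines into a dict, skipping blank or malformed lines."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        user, key = line.split(":", 1)
        users[user] = key
    return users


def authenticate(header, users):
    """Return the user name for a valid 'Bearer user:key' header, else None."""
    scheme, _, token = header.partition(" ")
    if scheme != "Bearer" or ":" not in token:
        return None
    user, key = token.split(":", 1)
    expected = users.get(user)
    if expected is None:
        return None
    # compare_digest avoids leaking key contents through timing differences.
    if hmac.compare_digest(key.encode(), expected.encode()):
        return user
    return None


# Sample file contents, invented for illustration.
users = load_users("alice:s3cret\nbob:hunter2\n")
```

With these sample users, authenticate("Bearer alice:s3cret", users) returns "alice", while a wrong key, an unknown user, or a non-Bearer scheme returns None.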

Highlighted Details

  • Load balancing across multiple Ollama servers.
  • Bearer token authentication with user:key format.
  • Asynchronous logging to CSV.
  • Connection pooling for faster backend communication.
  • Streaming support for Ollama responses.
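One common way to log to CSV without blocking request handling is to hand rows to a background thread through a queue. The sketch below shows that pattern under stated assumptions: the AsyncCsvLogger class and the logged columns are hypothetical, and an in-memory buffer stands in for the log file. The project's actual logging code may differ.

```python
import csv
import io
import queue
import threading


class AsyncCsvLogger:
    def __init__(self, stream):
        self._queue = queue.Queue()
        self._writer = csv.writer(stream)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def log(self, *fields):
        # Returns immediately; the row is written by the background thread.
        self._queue.put(fields)

    def close(self):
        # None is the stop signal; join waits until queued rows are written.
        self._queue.put(None)
        self._thread.join()

    def _drain(self):
        while True:
            row = self._queue.get()
            if row is None:
                break
            self._writer.writerow(row)


buffer = io.StringIO()  # stands in for an open CSV file
logger = AsyncCsvLogger(buffer)
# Hypothetical columns: user, request path, chosen backend.
logger.log("alice", "/api/generate", "http://localhost:11434")
logger.close()
```

The request handler only pays for a queue insertion, while the slower file write happens off the request path.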

Maintenance & Community

  • Developed by ParisNeo.
  • Contributions are welcome via pull requests.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatible with Python 3.11+.

Limitations & Caveats

The project is not yet published on PyPI. The CONTRIBUTING.md file is noted as "to be added."

Health Check

  • Last commit: 3 days ago.
  • Responsiveness: 1 day.
  • Pull requests (30d): 1.
  • Issues (30d): 1.
  • Star history: 50 stars in the last 90 days.
