ollama_proxy_server by ParisNeo

Proxy server for load balancing and securing Ollama instances

Created 1 year ago
489 stars

Top 63.1% on SourcePulse

Project Summary

This project provides a lightweight, secure proxy server that sits in front of multiple Ollama instances. It is aimed at developers and users who need to scale their LLM deployments with authentication and load balancing, and it offers improved responsiveness and a single, centrally managed entry point for distributed Ollama backends.

How It Works

The proxy server routes each incoming request to the backend Ollama instance with the fewest active connections. It authenticates clients with bearer tokens and writes its log to a CSV file asynchronously, so logging does not block request handling. Connection pooling and correct forwarding of streaming responses are the key architectural choices that keep LLM interactions efficient and responsive.
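The least-connections routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `LeastConnectionsBalancer`, its methods, and the two backend URLs are hypothetical, and a lock with a plain counter is assumed as the bookkeeping mechanism.

```python
import threading
from contextlib import contextmanager


class LeastConnectionsBalancer:
    """Pick the backend with the fewest in-flight requests (illustrative only)."""

    def __init__(self, backends):
        # Map each backend URL to its number of active requests.
        self._active = {url: 0 for url in backends}
        self._lock = threading.Lock()

    @contextmanager
    def acquire(self):
        # Choose the least-loaded backend and count this request against it.
        with self._lock:
            url = min(self._active, key=self._active.get)
            self._active[url] += 1
        try:
            yield url
        finally:
            # Always release the slot, even if forwarding the request fails.
            with self._lock:
                self._active[url] -= 1

    def snapshot(self):
        # Return a copy of the current per-backend counts.
        with self._lock:
            return dict(self._active)


# Hypothetical backend URLs, used only for this example.
balancer = LeastConnectionsBalancer(
    ["http://localhost:11434", "http://localhost:11435"]
)

# Two overlapping requests land on different backends.
with balancer.acquire() as first:
    with balancer.acquire() as second:
        print(first, second)
```

Wrapping the counter in a context manager keeps the count accurate when a streamed response ends early or raises, which matters because the routing decision depends entirely on those counts.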

Quick Start & Requirements

  • Install from Source:
    git clone https://github.com/ParisNeo/ollama_proxy_server.git
    cd ollama_proxy_server
    pip install -r requirements.txt
    pip install .
    
  • Prerequisites: Python 3.11 or higher.
  • Docker:
    docker build -t ollama_proxy_server .
    docker run -p 8080:8080 -v $(pwd)/config.ini:/app/config.ini -v $(pwd)/authorized_users.txt:/app/authorized_users.txt ollama_proxy_server
  • Configuration: Requires config.ini for backend URLs and authorized_users.txt for credentials.
  • Usage: Run with python main.py --config config.ini --users_list authorized_users.txt.
  • User Management: Use python add_user.py <username> <key>.
  • Docs: See the repository at https://github.com/ParisNeo/ollama_proxy_server.
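To make the credential flow concrete, here is a hedged sketch of how a `user:key` credentials file could be parsed and checked against an `Authorization: Bearer user:key` header. It assumes that authorized_users.txt holds one `user:key` pair per line; the helper names `load_users` and `authorized_user` and the sample credentials are hypothetical and are not taken from the project.

```python
import hmac


def load_users(text):
    """Parse 'user:key' lines into a dict, skipping blank or malformed lines."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        user, sep, key = line.partition(":")
        if sep and user and key:
            users[user] = key
    return users


def authorized_user(header, users):
    """Return the username for a valid 'Bearer user:key' header, else None."""
    scheme, _, token = (header or "").partition(" ")
    if scheme.lower() != "bearer":
        return None
    user, sep, key = token.strip().partition(":")
    expected = users.get(user)
    if not sep or expected is None:
        return None
    # Constant-time comparison avoids leaking key contents through timing.
    if hmac.compare_digest(key.encode(), expected.encode()):
        return user
    return None


# Sample credentials, invented for this example.
users = load_users("alice:s3cret\nbob:hunter2\n")
print(authorized_user("Bearer alice:s3cret", users))  # prints: alice
print(authorized_user("Bearer alice:wrong", users))   # prints: None
```

A client would then send its requests to the proxy (port 8080 in the Docker example) with the header `Authorization: Bearer <username>:<key>`.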

Highlighted Details

  • Load balancing across multiple Ollama servers.
  • Bearer token authentication with user:key format.
  • Asynchronous logging to CSV.
  • Connection pooling for faster backend communication.
  • Streaming support for Ollama responses.
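As an illustration of the asynchronous CSV logging listed above, the sketch below hands log rows to a background thread through a queue so that request handling never waits on file I/O. This is one possible approach under stated assumptions, not the project's implementation: the `AsyncCsvLogger` class, the column layout, and the sample row are all hypothetical, and an in-memory buffer stands in for the log file.

```python
import csv
import io
import queue
import threading


class AsyncCsvLogger:
    """Write CSV rows from a background thread (illustrative only)."""

    def __init__(self, stream):
        self._queue = queue.Queue()
        self._writer = csv.writer(stream)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def log(self, *fields):
        # Callers only enqueue the row; they never wait on the write itself.
        self._queue.put(fields)

    def _drain(self):
        # The worker is the only code that touches the CSV writer.
        while True:
            row = self._queue.get()
            if row is None:  # sentinel: stop the worker
                break
            self._writer.writerow(row)

    def close(self):
        # Flush everything queued so far, then stop the worker.
        self._queue.put(None)
        self._thread.join()


buffer = io.StringIO()  # stands in for an open CSV log file
logger = AsyncCsvLogger(buffer)
# Hypothetical columns: timestamp, user, backend, path.
logger.log("2024-01-01T00:00:00", "alice", "http://localhost:11434", "/api/generate")
logger.close()
print(buffer.getvalue())
```

Because a single worker owns the writer, rows from concurrent requests cannot interleave within a line.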

Maintenance & Community

  • Developed by ParisNeo.
  • Contributions are welcome via pull requests.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatible with Python 3.11+.

Limitations & Caveats

The project is not yet published on PyPI, so it must be installed from source. The repository lists the CONTRIBUTING.md file as "to be added."

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 3
  • Issues (30d): 5
  • Star History: 12 stars in the last 30 days
