pasteguard  by sgasser

Privacy proxy for LLMs masking PII and secrets

Created 2 weeks ago

New!

462 stars

Top 65.7% on SourcePulse

GitHubView on GitHub
Project Summary

Summary

PasteGuard is an OpenAI-compatible privacy proxy designed to protect sensitive data when interacting with Large Language Models (LLMs). It addresses the critical need for data privacy by masking Personally Identifiable Information (PII) and secrets before they are sent to external LLM providers or by routing requests to local LLMs. This solution is ideal for developers and organizations needing to comply with data protection policies while leveraging powerful AI services.

How It Works

The core of PasteGuard is its role as an intermediary. It intercepts LLM API requests, analyzes them for PII (names, emails, phone numbers, etc.) and secrets (API keys, private keys) using Microsoft Presidio, and then applies one of two protection strategies. In "Mask Mode," detected sensitive data is replaced with placeholders before the request is forwarded to the LLM provider, with restoration occurring upon response. Alternatively, "Route Mode" directs requests containing PII to a locally hosted LLM (e.g., Ollama, vLLM), ensuring sensitive data never leaves the user's network. This proxy architecture allows seamless integration by simply changing the API endpoint URL.

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/sgasser/pasteguard.git), navigate into the directory, copy the example configuration (cp config.example.yaml config.yaml), and run docker compose up -d.
  • Configuration: Modify config.yaml for specific settings.
  • Endpoint: Point your application to http://localhost:3000/openai/v1 instead of the original LLM API endpoint.
  • Dashboard: Access the real-time monitoring dashboard at http://localhost:3000/dashboard.
  • Prerequisites: Docker and Docker Compose are required.
  • Documentation: Official Documentation and Integrations links are available.

Highlighted Details

  • Comprehensive Detection: Identifies a wide range of PII (names, emails, phone numbers, credit cards, IBANs, IP addresses, locations) and secrets (API keys, private keys, tokens) via Microsoft Presidio.
  • Multi-language Support: Operates effectively across 24 languages.
  • Broad Compatibility: Fully OpenAI-compatible, working seamlessly with SDKs (Python, JS), LangChain, LlamaIndex, Cursor, and other OpenAI-compatible tools.
  • Real-time Features: Supports streaming requests and responses, with an integrated dashboard for monitoring protected requests.

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or project roadmap were found in the provided README.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license suitable for commercial use and integration into closed-source applications. Fully compatible with any OpenAI-compatible tool.

Limitations & Caveats

The provided README does not detail specific limitations, known bugs, or alpha/beta status. The project appears to be presented as a stable, production-ready solution.

Health Check
Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)
41
Issues (30d)
18
Star History
467 stars in the last 19 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems") and Ishaan Jaffer Ishaan Jaffer(Cofounder of LiteLLM).

llm-gateway by wealthsimple

0%
251
Secure LLM gateway for multiple providers
Created 2 years ago
Updated 6 months ago
Feedback? Help us improve.