Discover and explore top open-source AI tools and projects—updated daily.
vaultmcpRuntime security proxy for AI agents
New!
Top 53.0% on SourcePulse
Vault is a production-grade prompt-injection firewall designed to protect AI agents by scanning tool responses before they reach the agent's context. It addresses the critical security vulnerability of malicious LLM outputs, safeguarding against data exfiltration, unauthorized actions, and agent compromise. Targeted at engineers and researchers deploying AI agents in production, Vault offers a robust, layered defense mechanism with minimal impact on agent workflows.
How It Works
Vault employs a three-layer detection pipeline. Layer 1 uses fast regex and heuristic pattern matching for common injection techniques. Layer 2 leverages on-device embedding similarity (bge-small) against a curated corpus of known attacks. Borderline cases are escalated to Layer 3, an LLM judge (e.g., Anthropic Haiku 4.5), which resolves ambiguities. This layered approach provides high detection rates (100% TPR on specific benchmarks) while minimizing LLM costs, as Layer 3 is invoked only for uncertain L2 cases (typically <5% of requests).
Quick Start & Requirements
npx @aimcpvault/mcp-proxy (no global install required).claude-haiku-4-5-20251001) via ANTHROPIC_API_KEY.gpt-4o-mini, self-hosted vLLM/llama.cpp) via OPENAI_API_KEY.OLLAMA_HOST=http://localhost:11434.Highlighted Details
Maintenance & Community
The project is actively developed, with plans for v0.3 including a continuous attestation feed, public reputation registry, and signed releases. Community interaction is primarily via X (@vaultmcp).
Licensing & Compatibility
Limitations & Caveats
Vault is text-only and does not scan binary content (images, audio, PDFs) within tool responses. Multi-turn attacks spanning different sessions and user-initiated jailbreaks are outside its scope. In offline mode (L1+L2 only), TPR drops significantly, especially for novel, out-of-distribution attacks. Protocol-encoded data without L3 enabled may result in higher false positive rates.
2 weeks ago
Inactive