LLM security scanner (Python library/REST API) for prompt injection/jailbreak detection
Top 73.4% on sourcepulse
Vigil is a Python library and REST API designed to detect prompt injections, jailbreaks, and other security threats in Large Language Model (LLM) inputs and outputs. It targets developers and researchers seeking to enhance LLM security by providing a modular, extensible framework for analyzing potentially malicious prompts.
How It Works
Vigil employs a multi-layered detection approach, integrating several scanning modules. These include vector database similarity searches against known attack patterns, heuristic analysis using YARA rules, transformer model-based classification, prompt-response similarity checks, and canary tokens for detecting prompt leakage or goal hijacking. This layered strategy aims to provide robust defense against known LLM vulnerabilities, acknowledging that a 100% foolproof solution is not yet achievable.
Quick Start & Requirements
pip install -e .
.conf/server.conf
. Loading datasets for the vector database scanner is optional but recommended for its functionality.Highlighted Details
Maintenance & Community
The project is maintained by deadbits. Further community or roadmap information is not detailed in the README.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
Vigil is currently in an alpha state and considered experimental, intended for research purposes. The README explicitly states that prompt injection attacks are currently unsolvable and Vigil should not be the sole defense mechanism.
1 year ago
Inactive