vigil-llm by deadbits

LLM security scanner (Python library/REST API) for prompt injection/jailbreak detection

Created 2 years ago

435 stars

Top 68.4% on SourcePulse

View on GitHub

1 Expert Loves This Project

Dan Guido

Cofounder of Trail of Bits

Project Summary

Vigil is a Python library and REST API designed to detect prompt injections, jailbreaks, and other security threats in Large Language Model (LLM) inputs and outputs. It targets developers and researchers seeking to enhance LLM security by providing a modular, extensible framework for analyzing potentially malicious prompts.

How It Works

Vigil employs a multi-layered detection approach, integrating several scanning modules. These include vector database similarity searches against known attack patterns, heuristic analysis using YARA rules, transformer model-based classification, prompt-response similarity checks, and canary tokens for detecting prompt leakage or goal hijacking. This layered strategy aims to provide robust defense against known LLM vulnerabilities, acknowledging that a 100% foolproof solution is not yet achievable.

Quick Start & Requirements

Install: Clone the repository, install YARA v4.3.2, set up a Python virtual environment, and install Vigil using pip install -e ..
Prerequisites: Python 3.x, YARA v4.3.2. Configuration requires editing conf/server.conf. Loading datasets for the vector database scanner is optional but recommended for its functionality.
Resources: Official documentation and release blog are linked in the README.

Highlighted Details

Supports local embeddings or OpenAI for vector database similarity.
Offers custom detection capabilities via YARA signatures.
Includes a Streamlit web UI for interactive playground testing.
Provides Canary Tokens for specific detection workflows.

Maintenance & Community

The project is maintained by deadbits. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Vigil is currently in an alpha state and considered experimental, intended for research purposes. The README explicitly states that prompt injection attacks are currently unsolvable and Vigil should not be the sole defense mechanism.

Health Check

Last Commit

1 year ago

Responsiveness

1 week

Pull Requests (30d)

Issues (30d)

Star History

6 stars in the last 30 days