vigil-llm  by deadbits

LLM security scanner (Python library/REST API) for prompt injection/jailbreak detection

created 1 year ago
400 stars

Top 73.4% on sourcepulse

GitHubView on GitHub
1 Expert Loves This Project
Project Summary

Vigil is a Python library and REST API designed to detect prompt injections, jailbreaks, and other security threats in Large Language Model (LLM) inputs and outputs. It targets developers and researchers seeking to enhance LLM security by providing a modular, extensible framework for analyzing potentially malicious prompts.

How It Works

Vigil employs a multi-layered detection approach, integrating several scanning modules. These include vector database similarity searches against known attack patterns, heuristic analysis using YARA rules, transformer model-based classification, prompt-response similarity checks, and canary tokens for detecting prompt leakage or goal hijacking. This layered strategy aims to provide robust defense against known LLM vulnerabilities, acknowledging that a 100% foolproof solution is not yet achievable.

Quick Start & Requirements

  • Install: Clone the repository, install YARA v4.3.2, set up a Python virtual environment, and install Vigil using pip install -e ..
  • Prerequisites: Python 3.x, YARA v4.3.2. Configuration requires editing conf/server.conf. Loading datasets for the vector database scanner is optional but recommended for its functionality.
  • Resources: Official documentation and release blog are linked in the README.

Highlighted Details

  • Supports local embeddings or OpenAI for vector database similarity.
  • Offers custom detection capabilities via YARA signatures.
  • Includes a Streamlit web UI for interactive playground testing.
  • Provides Canary Tokens for specific detection workflows.

Maintenance & Community

The project is maintained by deadbits. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Vigil is currently in an alpha state and considered experimental, intended for research purposes. The README explicitly states that prompt injection attacks are currently unsolvable and Vigil should not be the sole defense mechanism.

Health Check
Last commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
22 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Michele Castata Michele Castata(President of Replit), and
2 more.

rebuff by protectai

0.4%
1k
SDK for LLM prompt injection detection
created 2 years ago
updated 1 year ago
Feedback? Help us improve.