rebuff  by protectai

SDK for LLM prompt injection detection

created 2 years ago
1,319 stars

Top 31.0% on sourcepulse

GitHubView on GitHub
Project Summary

Rebuff is an open-source prompt injection detector designed to protect AI applications from malicious inputs. It targets developers and security professionals seeking to safeguard LLM-based systems. Rebuff offers a multi-layered defense strategy to identify and mitigate prompt injection attacks.

How It Works

Rebuff employs a four-layer defense mechanism: heuristic filtering of suspicious inputs, LLM-based analysis of prompts, a vector database to store and recognize embeddings of known attacks, and canary tokens to detect data leakage. This layered approach aims to provide robust protection by combining signature-based detection with behavioral analysis and proactive monitoring.

Quick Start & Requirements

Highlighted Details

  • Detects prompt injection and canary word leakage.
  • Implements attack signature learning via a vector database.
  • Offers JavaScript/TypeScript SDK; Python SDK is planned.
  • Self-hosting requires Supabase, OpenAI, and a vector database (Pinecone/Chroma).

Maintenance & Community

  • Active development with ongoing roadmap items.
  • Discord community available: https://discord.gg/R3U2XVNKeE
  • GitHub Actions for JavaScript and Python tests indicate CI integration.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Rebuff is a prototype and does not guarantee 100% protection against prompt injection attacks. A Python SDK is still under development, and features like local-only mode and user-defined detection strategies are planned.

Health Check
Last commit

1 year ago

Responsiveness

1 day

Pull Requests (30d)
0
Issues (30d)
0
Star History
59 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), Carol Willing Carol Willing(Core Contributor to CPython, Jupyter), and
2 more.

llm-security by greshake

0.2%
2k
Research paper on indirect prompt injection attacks targeting app-integrated LLMs
created 2 years ago
updated 2 weeks ago
Feedback? Help us improve.