deepteam  by confident-ai

LLM red teaming framework for security testing

Created 11 months ago
1,323 stars

Top 29.9% on SourcePulse

GitHubView on GitHub
Project Summary

DeepTeam is an open-source LLM red teaming framework designed for penetration testing and safeguarding LLM systems. It targets developers and security professionals seeking to identify and mitigate vulnerabilities like bias, PII leakage, and misinformation in chatbots, RAG pipelines, and AI agents. The framework leverages state-of-the-art adversarial attack techniques and provides guardrails for production deployment.

How It Works

DeepTeam simulates adversarial attacks against an LLM system, defined by a model_callback function. It dynamically generates attacks based on a list of specified vulnerabilities, eliminating the need for pre-defined datasets. LLMs are used for both generating these attacks and evaluating the LLM system's responses against defined vulnerability criteria. This approach allows for flexible, on-the-fly testing tailored to specific organizational needs.

Quick Start & Requirements

  • Installation: pip install -U deepteam
  • Prerequisites: Requires an OpenAI API key (or other supported providers like Anthropic, Gemini, Azure, Ollama) set as an environment variable or via deepteam set-api-key. Custom LLM integration is supported.
  • Setup: Minimal setup, primarily involves defining a model_callback function wrapping the target LLM system.
  • Documentation: LLM Red Teaming Framework Documentation

Highlighted Details

  • Supports over 40 built-in vulnerabilities (e.g., Bias, PII Leakage, Misinformation) and 10+ adversarial attack methods (e.g., Prompt Injection, Jailbreaking).
  • Allows customization of vulnerabilities and attacks with minimal code.
  • Integrates with OWASP Top 10 for LLMs and NIST AI RMF guidelines.
  • Offers a CLI for running red teaming with YAML configurations and supports custom vulnerability definitions.

Maintenance & Community

  • Developed by the founders of Confident AI.
  • Discord server available for community interaction.
  • Roadmap includes expanding vulnerability and attack coverage.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permits commercial use and linking with closed-source applications.

Limitations & Caveats

The framework relies on LLMs for attack generation and evaluation, which may introduce its own biases or limitations. Specific LLM provider configurations and API key management are necessary for operation.

Health Check
Last Commit

1 day ago

Responsiveness

1 week

Pull Requests (30d)
6
Issues (30d)
2
Star History
80 stars in the last 30 days

Explore Similar Projects

Feedback? Help us improve.