llm-sp  by chawins

LLM security/privacy resources: papers, tools, datasets, blogs

created 1 year ago
520 stars

Top 61.4% on sourcepulse

Project Summary

This repository compiles papers and resources on the security and privacy of Large Language Models (LLMs). It serves as a curated reference for researchers and practitioners entering or working within this nascent field, offering quick access to key studies on vulnerabilities, attacks, and defenses.

How It Works

The collection is organized by primary contribution, covering areas like prompt injection, jailbreaking, adversarial attacks, privacy concerns (data extraction, membership inference), watermarking, and LLM security in general. Papers are tagged with symbols indicating their focus (e.g., ⭐ for personally recommended, 💽 for datasets, 👤 for PII focus). The curator emphasizes that ⭐ is a subjective indicator of personal understanding and enjoyment, not a measure of paper quality.
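The tagging scheme above lends itself to simple filtering. The following is a minimal sketch, assuming each entry is a single line of text with its tag symbols embedded in it. The entry format, the tag names, and the sample entries are hypothetical illustrations and are not taken from the repository.

```python
# Symbols come from the description above; the tag names are made up for this sketch.
TAGS = {"⭐": "recommended", "💽": "dataset", "👤": "pii"}


def tags_of(entry: str) -> set[str]:
    """Return the tag names whose symbol appears anywhere in the entry text."""
    return {name for symbol, name in TAGS.items() if symbol in entry}


def filter_by_tag(entries: list[str], tag: str) -> list[str]:
    """Keep only the entries that carry the given tag name, preserving order."""
    return [entry for entry in entries if tag in tags_of(entry)]


# Hypothetical entries, not real papers from the list.
sample = [
    "Example paper A ⭐",
    "Example benchmark B 💽",
    "Example study C 👤 ⭐",
    "Example paper D",
]

recommended = filter_by_tag(sample, "recommended")
for entry in recommended:
    print(entry)
```

Because ⭐ marks the curator's personal preference rather than paper quality, a filter like this narrows a reading list but does not rank it.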

Quick Start & Requirements

  • Access: Browse the GitHub repository directly; a Notion page serves as the more frequently updated source.
  • Requirements: None. The curated list can be read in any web browser without installing software.

Highlighted Details

  • Extensive coverage of prompt injection techniques, including indirect methods and RCE vulnerabilities in LLM-integrated applications.
  • Detailed categorization of jailbreak methods, from simple prompt manipulation to complex automated attacks and persona modulation.
  • In-depth exploration of privacy risks, including data extraction, membership inference, and PII leakage, with a focus on empirical studies and attack methodologies.
  • Analysis of adversarial attacks, including white-box, black-box, and transfer attacks, with a growing focus on post-BERT era techniques.
  • Resources on watermarking LLM outputs so they can be detected, and on defenses against adversarial manipulation.

Maintenance & Community

The repository is maintained by the author, who welcomes contributions via GitHub issues or pull requests. Because the Notion page is updated more frequently, the author manually transfers updates between it and the GitHub repository.

Licensing & Compatibility

This summary does not confirm the repository's license. A permissive license (e.g., MIT, Apache 2.0) is plausible for a curated list of research papers, but the repository's license file is the authoritative source. Licensing of the individual papers varies by their original publication.

Limitations & Caveats

The curator notes that the paper selection is biased toward their own research interests, so the overview may be incomplete. The boundaries between prompt injection, jailbreaking, and adversarial attacks are fluid, and some papers fit more than one category.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 24 stars in the last 90 days
