Awesome_papers_on_LLMs_detection by Xianjun-Yang

Curated list of research papers for detecting LLM-generated content

Created 1 year ago
277 stars

Top 93.7% on SourcePulse

Project Summary

This repository serves as a comprehensive, continuously updated collection of academic papers focused on detecting Large Language Model (LLM)-generated text and code. It is a valuable resource for researchers, developers, and practitioners in Natural Language Processing (NLP) and AI security seeking to understand and detect AI-generated content.

How It Works

The repository categorizes papers by detection methodology (e.g., training-based, zero-shot, watermarking, fingerprinting), model access (black-box vs. white-box), and specific applications like code detection or adversarial attacks. This structured approach allows users to quickly find relevant research across various facets of LLM-generated content detection.
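Among these categories, zero-shot detection is the one that requires no trained classifier: a passage is scored directly with an off-the-shelf language model. The snippet below is a minimal, hedged sketch of that idea, not taken from any specific paper in the list; the GPT-2 scoring model and the threshold value are assumptions chosen purely for illustration.

```python
# Minimal zero-shot-style detection sketch (illustrative only).
# Assumptions: GPT-2 as the scoring model and a hand-picked threshold;
# the papers in this list use more refined zero-shot statistics.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def avg_log_likelihood(text: str) -> float:
    """Average per-token log-likelihood of `text` under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels == inputs, the returned loss is the mean
        # negative log-likelihood per token.
        loss = model(ids, labels=ids).loss
    return -loss.item()

def looks_machine_generated(text: str, threshold: float = -3.0) -> bool:
    # Text the scoring model finds unusually predictable (high average
    # log-likelihood) is flagged as likely machine-generated.
    return avg_log_likelihood(text) > threshold

print(looks_machine_generated("The quick brown fox jumps over the lazy dog."))
```

Actual zero-shot detectors surveyed in the repository rely on more sophisticated statistics (for example, how the log-likelihood changes under small perturbations of the text) rather than a single fixed threshold.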

Quick Start & Requirements

This is a curated list of papers; no software installation or execution is required. All listed papers include direct links to their PDFs or official pages.

Highlighted Details

  • Extensive coverage of recent advancements, with papers dated up to late 2024.
  • Categorization includes diverse detection techniques: contrastive learning, token prediction, watermarking, fingerprinting, and adversarial attacks (a simplified watermark-detection sketch follows this list).
  • Dedicated sections for datasets, code detection, and attacks against detection methods.
  • Includes links to tools like OpenAI Text Classifier and GPTZero, and benchmarks like CoAT and RAID.
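
Watermark detection, one of the highlighted categories, usually amounts to a statistical test: generation is softly biased toward a pseudo-randomly chosen "green" subset of the vocabulary, and the detector checks whether a suspect text contains more green tokens than chance would predict. The sketch below is a deliberately simplified, self-contained illustration of that test, using whitespace tokenization and a hash-based green/red split; it is not the implementation of any paper in this list.

```python
# Simplified green-list watermark detection sketch (illustrative only).
# Assumptions: whitespace tokenization, a hash-seeded vocabulary split, and
# GAMMA = 0.5 as the expected green fraction for unwatermarked text. Real
# watermark detectors in the listed papers operate on model tokenizers and logits.
import hashlib
import math

GAMMA = 0.5  # fraction of the vocabulary placed on the green list per step

def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by `prev_token`."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 255.0 < GAMMA

def green_fraction_z_score(text: str) -> float:
    """One-sided z-statistic for 'more green tokens than chance'."""
    tokens = text.split()
    if len(tokens) < 2:
        return 0.0
    n = len(tokens) - 1
    greens = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    # Under the null hypothesis (no watermark), greens ~ Binomial(n, GAMMA).
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

def looks_watermarked(text: str, z_threshold: float = 4.0) -> bool:
    return green_fraction_z_score(text) > z_threshold

print(looks_watermarked("some sample passage to score for a watermark signal"))
```

A z-threshold around 4 corresponds to a very small false-positive rate under the binomial null, which is why watermark detectors typically use conservative cutoffs.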

Maintenance & Community

The repository is actively maintained, with frequent updates to include the latest research. The primary contributor, Xianjun Yang, has multiple cited works in the field.

Licensing & Compatibility

The repository is a bibliography rather than software and does not declare a license. Individual papers are subject to their respective publication licenses.

Limitations & Caveats

This is a bibliography and does not provide any implementation or tools for detection. Users must access and implement the methods described in the papers themselves.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Marc Klingen (cofounder of Langfuse), and 1 more.

langkit by whylabs

  • 947 stars · 0.3%
  • Open-source toolkit for monitoring LLMs
  • Created 2 years ago; updated 10 months ago
  • Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Michele Castata (President of Replit), and 3 more.

rebuff by protectai

  • 1k stars · 0.4%
  • SDK for LLM prompt injection detection
  • Created 2 years ago; updated 1 year ago