fast-detect-gpt  by baoguangsheng

Zero-shot machine-generated text detection via conditional probability curvature

created 2 years ago
324 stars

Top 85.3% on sourcepulse

Project Summary

This repository provides the code for "Fast-DetectGPT," an efficient zero-shot method for detecting machine-generated text. It targets researchers and developers working on AI safety, content authenticity, and natural language processing, offering a significantly faster and more accurate alternative to existing methods like DetectGPT.

How It Works

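Fast-DetectGPT scores a passage by its "conditional probability curvature": it compares the log-probability of the tokens actually observed with the log-probabilities of alternative tokens drawn at each position, conditioned on the same preceding text. Machine-generated text tends to score noticeably higher than human-written text on this measure. Unlike DetectGPT, which has to generate and re-score many perturbed copies of the passage, Fast-DetectGPT takes the alternatives directly from the model's conditional distributions, so no separate perturbation step is needed. The project reports a speedup of up to 340x over DetectGPT together with higher detection accuracy (AUROC up to 0.9887).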
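The scoring idea described above can be sketched in a few lines. This is a minimal illustration, not the repository's code: it assumes the per-position logits of a sampling model and a scoring model are already available as plain lists, and it uses the closed-form expectation and variance over the sampling distribution instead of drawing samples. The function names and the toy numbers are hypothetical.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_norm for v in logits]


def conditional_probability_curvature(
    sampling_logits: Sequence[Sequence[float]],
    scoring_logits: Sequence[Sequence[float]],
    token_ids: Sequence[int],
) -> float:
    """Normalized gap between the observed log-likelihood and its expectation.

    Each position j contributes:
      observed: log p(x_j | x_<j) under the scoring model
      expected: sum over the vocabulary of q(x | x_<j) * log p(x | x_<j)
      variance: variance of log p(x | x_<j) when x is drawn from q
    where q is the sampling model's conditional distribution.
    """
    if not (len(sampling_logits) == len(scoring_logits) == len(token_ids)):
        raise ValueError("inputs must cover the same token positions")

    observed = 0.0
    expected = 0.0
    variance = 0.0
    for q_logits, p_logits, token in zip(sampling_logits, scoring_logits, token_ids):
        log_p = log_softmax(p_logits)
        q = [math.exp(v) for v in log_softmax(q_logits)]
        mean_j = sum(qi * lp for qi, lp in zip(q, log_p))
        second_moment_j = sum(qi * lp * lp for qi, lp in zip(q, log_p))
        observed += log_p[token]
        expected += mean_j
        variance += second_moment_j - mean_j * mean_j

    if variance <= 0.0:
        raise ValueError("degenerate distributions: variance is zero")
    return (observed - expected) / math.sqrt(variance)


# Toy input (hypothetical numbers): vocabulary of 3 tokens, 2 positions,
# with the same model used for sampling and scoring.
toy_logits = [[2.0, 0.0, -1.0], [1.5, 0.5, -0.5]]
score_likely = conditional_probability_curvature(toy_logits, toy_logits, [0, 0])
score_unlikely = conditional_probability_curvature(toy_logits, toy_logits, [2, 2])
```

In the toy input, a passage made of the model's most probable tokens gets a positive score and a passage made of its least probable tokens gets a negative one. In practice the logits would come from a single forward pass of each model over the passage, and a threshold on the score would separate the two classes.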

Quick Start & Requirements

  • Install dependencies with the provided setup.sh script.
  • Requires Python 3.8 and PyTorch 1.10.0.
  • Experiments were run on a Tesla A100 GPU with 80GB memory.
  • Local demo: python scripts/local_infer.py (default models: gpt-neo-2.7B).
  • For more accurate detections, use python scripts/local_infer.py --sampling_model_name gpt-j-6B.

Highlighted Details

  • Reports a 340x speedup over DetectGPT in both evaluation settings below.
  • Relative accuracy improvement over DetectGPT: 74.7% on text generated by five source models and 76.1% on ChatGPT/GPT-4 generations.
  • Supports proprietary models like GPT-3.5 via Glimpse API.
  • The falcon-7b (sampling) and falcon-7b-instruct (scoring) pair is identified as the best-performing model combination.
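This summary does not define "relative accuracy improvement". One common reading, assumed here and not confirmed by the text above, is the share of the remaining gap to a perfect AUROC of 1.0 that the new method closes. The sketch below shows that arithmetic; the function name and the example values are hypothetical and are not results from the project.

```python
def relative_improvement(baseline_auroc: float, improved_auroc: float) -> float:
    """Fraction of the gap between the baseline and a perfect AUROC that is closed.

    Assumed definition: (improved - baseline) / (1 - baseline).
    """
    if not 0.0 <= baseline_auroc < 1.0:
        raise ValueError("baseline AUROC must be in [0, 1)")
    if not 0.0 <= improved_auroc <= 1.0:
        raise ValueError("improved AUROC must be in [0, 1]")
    return (improved_auroc - baseline_auroc) / (1.0 - baseline_auroc)


# Hypothetical values for illustration only: moving from 0.80 to 0.95 AUROC
# closes three quarters of the remaining gap to 1.0.
example = relative_improvement(0.80, 0.95)
```

Under this reading, a large relative figure can correspond to a small absolute AUROC change when the baseline is already close to 1.0, which is worth keeping in mind when comparing the two settings.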

Maintenance & Community

The project accompanies the authors' ICLR 2024 paper. The README lists no community channels and gives no explicit statement about ongoing maintenance.

Licensing & Compatibility

The repository does not explicitly state a license. It mentions borrowing code from DetectGPT, whose license should be considered. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project's experimental setup relies on specific hardware (Tesla A100 GPU with 80GB memory), and performance may vary on different configurations. The licensing status is unclear, potentially impacting commercial adoption.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 38 stars in the last 90 days
