Binoculars by ahans30

Zero-shot tool for detecting LLM-generated text

Created 1 year ago
306 stars

Top 87.5% on SourcePulse

Project Summary

Binoculars is a zero-shot, domain-agnostic method for detecting AI-generated text, aimed at researchers and developers who need to identify machine-written content without task-specific training. It exploits the fact that decoder-only language models share much of their pretraining data.

How It Works

Binoculars rests on the observation that most decoder-only LLMs are pretrained on heavily overlapping corpora, such as Common Crawl and the Pile, so their outputs share statistical regularities. It scores a passage by how far it departs from what such models expect, and flags text whose statistics look typical of an LLM. Because the scoring uses pretrained models as they are, no fine-tuning on task-specific data is needed, which keeps the method broadly applicable.
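
A toy sketch of one way such a score can be formed. The Binoculars paper describes a ratio of a log-perplexity to a cross-perplexity computed from two closely related models; this summary does not spell that out, so treat the code as an illustration rather than the project's implementation. The function names, the three-token vocabulary, and the probability tables are all assumptions, and which model plays which role is chosen arbitrarily here.

```python
import math

def log_perplexity(token_probs):
    """Average negative log-probability assigned to the tokens that actually occurred."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def cross_perplexity(dists_a, dists_b):
    """Average cross-entropy between two models' next-token distributions.

    Weights come from model B, log-probabilities from model A
    (an arbitrary choice for this sketch).
    """
    total = 0.0
    for dist_a, dist_b in zip(dists_a, dists_b):
        total += -sum(q * math.log(p) for p, q in zip(dist_a, dist_b))
    return total / len(dists_a)

def binoculars_style_score(token_ids, dists_a, dists_b):
    """Ratio of log-perplexity to cross-perplexity; lower means less surprising."""
    observed = [dist[t] for dist, t in zip(dists_a, token_ids)]
    return log_perplexity(observed) / cross_perplexity(dists_a, dists_b)

# Hypothetical next-token distributions over a 3-token vocabulary, two positions.
MODEL_A = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
MODEL_B = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]

predictable = binoculars_style_score([0, 1], MODEL_A, MODEL_B)  # likely tokens
surprising = binoculars_style_score([2, 0], MODEL_A, MODEL_B)   # unlikely tokens
print(predictable < surprising)
```

In this toy setup the sequence made of high-probability tokens gets the lower score, which matches the intuition that text a model finds unsurprising is more likely to be machine-written.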

Quick Start & Requirements

  • Clone the repository, then install from its root with pip install -e .
  • Requires Python 3.9.
  • Launch the demo with python app.py.
  • The README links to the official paper and a hosted demo.

Highlighted Details

  • Zero-shot and domain-agnostic detection.
  • Relies on the overlap in pretraining data across LLMs.
  • Provides a classification score and prediction.
  • Can process batches of text.
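
A minimal sketch of how a numeric score might be turned into a prediction for one text or a batch. It assumes that lower scores indicate machine-generated text; the threshold value, the label strings, and the function name label_scores are hypothetical placeholders, not the values or API the project ships.

```python
from typing import List, Sequence, Union

# Placeholder cut-off for illustration only; not the project's tuned threshold.
HYPOTHETICAL_THRESHOLD = 0.9

def label_scores(
    scores: Union[float, Sequence[float]],
    threshold: float = HYPOTHETICAL_THRESHOLD,
) -> Union[str, List[str]]:
    """Map one score, or a batch of scores, to a human-readable prediction."""
    single = isinstance(scores, (int, float))
    batch = [scores] if single else list(scores)
    labels = [
        "Most likely AI-generated" if s < threshold else "Most likely human-generated"
        for s in batch
    ]
    # Mirror the input shape: a single score yields a single label.
    return labels[0] if single else labels

print(label_scores(0.75))
print(label_scores([0.75, 1.05]))
```

Returning a single label for a single score and a list for a batch keeps the calling code the same in both cases.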

Maintenance & Community

The project accompanies a paper at ICML 2024. The README gives no further details on community or maintenance.

Licensing & Compatibility

The README does not explicitly state a license. The project is marked for academic purposes only.

Limitations & Caveats

Binoculars performs better on English text and is intended for academic use, not as a consumer product. Users are cautioned against relying on its output without human supervision.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
