Binoculars by ahans30

Zero-shot tool for detecting LLM-generated text

created 1 year ago
298 stars

Top 90.1% on sourcepulse

Project Summary

Binoculars is a zero-shot, domain-agnostic method for detecting AI-generated text. It is aimed at researchers and developers who need to identify machine-written content without task-specific training data. The method relies on the observation that most decoder-only language models share much of their pretraining data.

How It Works

Binoculars rests on the premise that decoder-only LLMs are pretrained on heavily overlapping corpora, such as Common Crawl and the Pile, so their outputs share a predictable statistical signature. Text that follows this signature closely is flagged as likely LLM-generated, while text that deviates from it is treated as more likely human-written. Because the method needs no fine-tuning on labeled examples, it applies across domains.
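The idea above can be made concrete with a toy sketch. It assumes the formulation described in the accompanying paper, where the score is the ratio of a text's log-perplexity under one model to the cross-perplexity between two closely related models, and where lower scores point toward machine-generated text. The function names, the hand-written next-token distributions, and the choice of which model plays which role are illustrative assumptions, not the repository's code.

```python
import math
from typing import Sequence

Dist = Sequence[float]  # next-token probabilities over a toy vocabulary


def log_perplexity(token_ids: Sequence[int], scorer_probs: Sequence[Dist]) -> float:
    """Mean negative log-probability the scorer model assigns to the observed tokens."""
    nll = [-math.log(dist[tok]) for tok, dist in zip(token_ids, scorer_probs)]
    return sum(nll) / len(nll)


def cross_perplexity(reference_probs: Sequence[Dist], scorer_probs: Sequence[Dist]) -> float:
    """Mean cross-entropy of the reference model's distribution against the scorer's."""
    per_position = [
        -sum(r * math.log(s) for r, s in zip(ref, sc) if r > 0)
        for ref, sc in zip(reference_probs, scorer_probs)
    ]
    return sum(per_position) / len(per_position)


def binoculars_score(
    token_ids: Sequence[int],
    reference_probs: Sequence[Dist],
    scorer_probs: Sequence[Dist],
) -> float:
    """Ratio of log-perplexity to cross-perplexity (assumed formulation)."""
    return log_perplexity(token_ids, scorer_probs) / cross_perplexity(
        reference_probs, scorer_probs
    )


if __name__ == "__main__":
    # Toy two-token vocabulary, one position; both models agree on the distribution.
    reference = [[0.9, 0.1]]
    scorer = [[0.9, 0.1]]
    predictable = binoculars_score([0], reference, scorer)  # text picks the likely token
    surprising = binoculars_score([1], reference, scorer)   # text picks the unlikely token
    print(predictable, surprising)
```

In this toy setting, a text that picks the token both models expect gets a lower score than a text that picks the unexpected one, which is the direction a detector of this kind would exploit. A real implementation would obtain the distributions from two language models rather than hand-written lists.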

Quick Start & Requirements

  • Install from source: clone the repository, then run pip install -e . in its root.
  • Requires Python 3.9.
  • Run the demo with python app.py.
  • The README links to the official paper and demo.

Highlighted Details

  • Zero-shot and domain-agnostic detection.
  • Relies on the overlap in pretraining data across LLMs.
  • Returns a numeric score and a class prediction.
  • Can process batches of text.
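
The last two points (a score, a prediction, and batch input) can be sketched as a thin thresholding layer over any scorer. This is a hedged illustration only: `classify_batch`, the `score_fn` callable, the threshold value, and the placeholder scores are all hypothetical and do not reflect the package's actual API or its calibrated threshold. It assumes that lower scores indicate machine-generated text.

```python
from typing import Callable, List, Sequence, Tuple

ScoreFn = Callable[[List[str]], List[float]]


def classify_batch(
    texts: Sequence[str], score_fn: ScoreFn, threshold: float
) -> List[Tuple[float, str]]:
    """Score a batch of texts and map each score to a label.

    Assumption: scores below the threshold indicate machine-generated text.
    """
    scores = score_fn(list(texts))
    if len(scores) != len(texts):
        raise ValueError("score_fn must return exactly one score per input text")
    return [
        (s, "likely AI-generated" if s < threshold else "likely human-written")
        for s in scores
    ]


if __name__ == "__main__":
    # Placeholder scorer with made-up values, for demonstration only.
    # A real scorer would run the detector's language models on each text.
    def placeholder_score_fn(batch: List[str]) -> List[float]:
        made_up = {"first sample": 0.7, "second sample": 1.1}
        return [made_up[text] for text in batch]

    results = classify_batch(
        ["first sample", "second sample"], placeholder_score_fn, threshold=0.9
    )
    for score, label in results:
        print(score, label)
```

Keeping the scorer as an injected callable separates the expensive model inference from the decision rule, so the threshold can be adjusted without rescoring.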

Maintenance & Community

The accompanying paper was published at ICML 2024. The README gives no further details on community channels or maintenance.

Licensing & Compatibility

The README does not explicitly state a license. The project is marked for academic purposes only.

Limitations & Caveats

Binoculars performs best on English text. It is intended for academic use, not as a consumer product, and users are cautioned against relying on its output without human supervision.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

  • 33 stars in the last 90 days
