Jiaqi-Chen-00
Detecting machine-revised text with style optimization
Detecting machine-revised text is challenging because a revision changes the style of a human-written draft only subtly. The ImBD framework addresses this by aligning its detector with machine stylistic preferences, enabling state-of-the-art detection performance for revisions made by LLMs like GPT-3.5 and GPT-4o. The project targets researchers and practitioners who need to identify AI-generated or AI-modified content efficiently, even with limited training data.
How It Works
The core of ImBD lies in its Style Preference Optimization (SPO) and Style-CPC components, designed to effectively capture machine-style phrasing. This approach excels at identifying subtle stylistic nuances that differentiate human-originated text from machine-revised content, offering an advantage in accuracy and efficiency over traditional methods.
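The summary does not spell out how Style-CPC scores a passage. As a rough illustration only, the sketch below assumes that "CPC" stands for conditional probability curvature, as used in curvature-based zero-shot detectors: the log-likelihood of the observed tokens is compared with the log-likelihood the scoring model would expect from its own predictions, normalized by the standard deviation. The function names and the toy logits are hypothetical, and the SPO tuning step that would produce the scoring model is not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def curvature_score(logits_per_position, token_ids):
    """Normalized conditional probability curvature (assumed form, not ImBD's code).

    logits_per_position[j] holds the scoring model's logits at position j;
    token_ids[j] is the index of the token actually observed there.
    """
    observed = 0.0  # log-likelihood of the observed tokens
    expected = 0.0  # expected log-likelihood under the model's own distribution
    variance = 0.0  # variance of that log-likelihood, summed over positions
    for logits, tok in zip(logits_per_position, token_ids):
        logp = log_softmax(logits)
        probs = [math.exp(lp) for lp in logp]
        mean_j = sum(p * lp for p, lp in zip(probs, logp))
        second_j = sum(p * lp * lp for p, lp in zip(probs, logp))
        observed += logp[tok]
        expected += mean_j
        variance += second_j - mean_j * mean_j
    if variance <= 1e-12:
        # Degenerate case: the model is uniform everywhere, so no signal.
        return 0.0
    return (observed - expected) / math.sqrt(variance)


# Toy logits for a two-token passage over a three-word vocabulary.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
typical = curvature_score(toy_logits, [0, 1])   # tokens the model prefers
atypical = curvature_score(toy_logits, [1, 0])  # tokens the model finds unlikely
```

Under this assumption, a passage whose tokens match the scoring model's preferences gets a higher score than one that does not, and a threshold on the score would separate the two classes. Where that threshold sits, and how the scoring model is aligned to machine style, are details this summary leaves to the project itself.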
Quick Start & Requirements
Create a conda environment (conda create -n ImBD python=3.10), activate it (conda activate ImBD), and install dependencies (pip install -r requirements.txt).
Highlighted Details
Maintenance & Community
The provided README does not contain specific details regarding maintainers, community channels (like Discord/Slack), or a public roadmap.
Licensing & Compatibility
The provided README does not specify the project's license or any compatibility notes for commercial use.
Limitations & Caveats
The project lists several "TODO" items: inference code specifically for detection, better handling of how trained models are saved, and lower GPU memory usage in the evaluation scripts. The current inference checkpoint contains only LoRA weights, so the base model must be downloaded separately.