Chinese-offensive-language-detect by royal12646

Chinese offensive language detection

Created 11 months ago
614 stars

Top 53.4% on SourcePulse

View on GitHub
Project Summary

This project provides a system for detecting six categories of harmful Chinese text: sexually explicit, abusive (including phonetic variations), and discriminatory content based on region, gender, race, or occupation. It targets NLP researchers and developers aiming to build more robust and safer Chinese language processing systems, offering a dataset and fine-tuned models for this purpose.

How It Works

The system employs a multi-stage approach. First, it generates a comprehensive dataset of harmful and safe text pairs using LLMs, keywords, and "model jailbreaking" techniques to prevent keyword overfitting. It then fine-tunes the hfl/chinese-macbert-base model into three specialized detectors: one for sexually explicit content, one for abusive language, and one for bias-related discrimination. Phonetic variations are handled by converting text to pinyin before analysis. Finally, an ensemble method combines the outputs of these specialized detectors using learnable weights to produce a general-purpose offensive language detector.
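The ensemble step described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the detector probabilities and weight values below are made-up stand-ins, and in the actual system the weights are learnable parameters trained jointly rather than fixed constants.

```python
import math

def softmax(xs):
    """Numerically stable softmax so the combination weights sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_score(detector_probs, raw_weights):
    """Combine per-detector P(offensive) scores with softmax weights."""
    weights = softmax(raw_weights)
    return sum(w * p for w, p in zip(weights, detector_probs))

# Illustrative outputs from the sexual-content, abusive-language, and
# discrimination detectors for one input sentence:
probs = [0.10, 0.85, 0.20]
raw_weights = [0.0, 1.0, 0.0]  # stand-in for the learned parameters
score = ensemble_score(probs, raw_weights)  # a value in [0, 1]
```

Because the weights pass through a softmax, the combined score stays a valid probability regardless of the raw weight values, which makes the weights easy to train by gradient descent.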

Quick Start & Requirements

  • Environment Setup: Create a conda environment (conda create -n off-detect python=3.9), activate it (conda activate off-detect), and install dependencies (pip install -r requirements.txt).
  • Training: Run python train.py.
  • Testing: Execute python test.py.
  • Demo: Navigate to ./Chinese-offensive-language-detect/Demo, open TCP port 5000 (sudo ufw allow 5000/tcp), and start the Flask backend (python Flask/app.py). In a second terminal, activate the environment, navigate to ./Chinese-offensive-language-detect/Demo/User, and run node procedure.js. Start the frontend with npm run dev from the ./Chinese-offensive-language-detect/Demo/ directory.
  • Prerequisites: Python 3.9; Node.js and npm for the demo.

Highlighted Details

  • Detects six types of harmful content: sexually explicit, abusive (including phonetic), regional, gender, racial, and occupational discrimination.
  • Utilizes LLM-generated datasets with keyword-based generation and "model jailbreaking" to enhance robustness.
  • Employs an ensemble learning approach with learnable weights for a generalized detection model.
  • Handles phonetic offensive language by converting text to pinyin.
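The pinyin-normalization idea in the last bullet can be sketched as follows. The character-to-pinyin table here is a hypothetical toy for illustration; the project would rely on a full converter (e.g. a library such as pypinyin) rather than a hand-written dictionary.

```python
# Toy character-to-pinyin table, illustration only; a complete converter
# would cover the full character set and handle tones.
PINYIN = {"你": "ni", "妈": "ma", "马": "ma", "好": "hao"}

def to_pinyin(text):
    """Map each character to its pinyin syllable, passing unknowns through."""
    return " ".join(PINYIN.get(ch, ch) for ch in text)

# Homophone substitutions collapse to the same pinyin sequence, so a
# detector operating on pinyin sees the variant spellings as identical.
print(to_pinyin("你妈好"))  # prints "ni ma hao"
print(to_pinyin("你马好"))  # prints "ni ma hao"
```

This is why converting to pinyin before classification defeats the common evasion tactic of swapping an offensive character for a harmless homophone.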

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmaps are provided in the README.

Licensing & Compatibility

The code's license is not explicitly stated. The associated dataset, "ZHateBench: A Comprehensive Chinese Offensive Language Dataset with Harmful–Safe Pairs," is archived on Zenodo (DOI: 10.5281/zenodo.16812052); its specific license terms should be checked on the Zenodo record. Compatibility for commercial use or linking with closed-source projects is not specified.

Limitations & Caveats

The project focuses exclusively on the Chinese language. While the dataset is AI-generated to cover various harms, its inherent biases or potential gaps compared to real-world offensive language are not detailed. The README does not mention specific performance benchmarks or known limitations of the detection models.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 619 stars in the last 30 days
