detoxify by unitaryai

Trained models for toxic comment classification

Created 5 years ago
1,107 stars

Top 34.5% on SourcePulse

View on GitHub
Project Summary

Detoxify provides pre-trained models and code for classifying toxic comments across multiple datasets and languages. It is designed for researchers and developers working on content moderation, bias detection, and natural language understanding, offering a user-friendly interface to identify various forms of toxicity in text.

How It Works

The library leverages state-of-the-art transformer models (BERT, RoBERTa, XLM-RoBERTa) fine-tuned on Jigsaw's toxic comment datasets. It employs PyTorch Lightning for efficient training and Hugging Face Transformers for model architecture and tokenization. This approach allows for high performance and broad language support, with specific models optimized for general toxicity, unintended bias, and multilingual classification.
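
A minimal prediction sketch, assuming the pip package's documented Detoxify class ('original', 'unbiased', and 'multilingual' select the three model variants; exact label names vary by model):

    from detoxify import Detoxify

    # Load the general toxicity model; 'unbiased' and 'multilingual'
    # select the other two variants.
    model = Detoxify('original')

    # predict() accepts a single string or a list of strings and returns
    # a dict mapping each toxicity label to a score in [0, 1].
    results = model.predict('you are a wonderful person')
    print(results)  # e.g. {'toxicity': 0.0007, 'insult': 0.0002, ...}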

Quick Start & Requirements

  • Install via pip: pip install detoxify
  • Inference requires PyTorch and Hugging Face Transformers; training additionally requires the Kaggle API (to download the Jigsaw datasets) and pandas.
  • Models can be loaded directly from PyTorch Hub or from local checkpoints; a loading sketch follows this list.
  • Supports CPU and CUDA devices.
  • Example usage and detailed prediction/training scripts are available.
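
A hedged sketch of device selection and PyTorch Hub loading; the device argument and the 'toxic_bert' hub entry point follow the README, but verify them against the current repository:

    import torch
    from detoxify import Detoxify

    # Run on GPU when available; Detoxify accepts a device argument.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = Detoxify('original', device=device)

    # Alternatively, fetch the underlying model straight from PyTorch Hub.
    hub_model = torch.hub.load('unitaryai/detoxify', 'toxic_bert')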

Highlighted Details

  • Achieved high AUC scores on Jigsaw challenges (e.g., 93.74% for unbiased, 92.11% for multilingual).
  • Offers smaller, lightweight models (e.g., original-small, unbiased-small) for reduced resource usage.
  • Multilingual model supports English, French, Spanish, Italian, Portuguese, Turkish, and Russian (see the sketch after this list).
  • Includes detailed explanations of toxicity labels and ethical considerations regarding potential biases.
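
For example, the multilingual variant scores mixed-language batches in a single call (a sketch; the example sentences are illustrative):

    from detoxify import Detoxify

    # XLM-RoBERTa-based model covering the seven languages listed above.
    model = Detoxify('multilingual')

    # Batched input: one English and one French comment.
    results = model.predict(['you are stupid', 'tu es stupide'])

    # With batched input, each label maps to a list of scores, one per text.
    print(results['toxicity'])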

Maintenance & Community

  • Developed by Laura Hanu at Unitary.
  • Most recent README-documented updates date to October 2021; commit activity continues (see Health Check below).
  • Codebase includes CI/CD pipelines for testing and linting.

Licensing & Compatibility

  • The README does not explicitly state a license; check the repository's LICENSE file before commercial use or closed-source integration.

Limitations & Caveats

  • Models may misclassify humorous or self-deprecating use of profanity as toxic.
  • Potential biases towards vulnerable minority groups exist, as noted by the developers.
  • Intended for research or to assist human content moderators; fine-tuning on a task-specific dataset is recommended before deployment.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 18 stars in the last 30 days
Starred by Aravind Srinivas (Cofounder of Perplexity), François Chollet (Author of Keras; Cofounder of Ndea, ARC Prize), and 42 more.
