PII de-identification SDK for text and images
Presidio is an open-source SDK for detecting, redacting, masking, and anonymizing Personally Identifiable Information (PII) across text, images, and structured data. It targets developers and organizations needing to protect sensitive data, offering a customizable and extensible framework for data privacy compliance.
How It Works
Presidio employs a modular architecture, featuring an Analyzer for PII detection and an Anonymizer for data transformation. The Analyzer supports a hybrid approach, combining Named Entity Recognition (NER) models, regular expressions, rule-based logic, and checksums, with options to integrate external PII detection models. This allows for context-aware identification of sensitive entities across multiple languages.
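To make the hybrid idea concrete, here is a minimal standard-library sketch of how a regular expression, a checksum, and nearby context words can be combined into one detector, followed by a simple redaction step. It is not Presidio's implementation: the names `Finding`, `detect_credit_cards`, `redact`, and `CONTEXT_WORDS`, as well as the score values, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical illustration only; this does not reproduce Presidio's internals.

@dataclass
class Finding:
    entity_type: str
    start: int
    end: int
    score: float

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# Assumed context words that raise confidence when they appear before a match.
CONTEXT_WORDS = ("card", "credit", "visa")

def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def detect_credit_cards(text: str) -> list[Finding]:
    """Regex proposes candidates, the checksum filters them, context boosts the score."""
    findings = []
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if not luhn_valid(digits):
            continue  # pattern matched, but the checksum rejects it
        score = 0.5
        window = text[max(0, m.start() - 30):m.start()].lower()
        if any(word in window for word in CONTEXT_WORDS):
            score += 0.35  # arbitrary boost, for illustration
        findings.append(Finding("CREDIT_CARD", m.start(), m.end(), score))
    return findings

def redact(text: str, findings: list[Finding]) -> str:
    """Replace each finding with a placeholder, right to left so offsets stay valid."""
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"<{f.entity_type}>" + text[f.end:]
    return text

sample = "Card: 4111 1111 1111 1111, order 1234 5678 9012 3456."
found = detect_credit_cards(sample)
print(redact(sample, found))
```

Keeping detection (which yields spans and scores) separate from transformation (which consumes them) mirrors the Analyzer and Anonymizer split described above, so either half can be swapped without touching the other.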
Quick Start & Requirements
pip install presidio-analyzer presidio-anonymizer
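Once both packages are installed, a first run might look like the sketch below, which uses `AnalyzerEngine` and `AnonymizerEngine` to find and replace phone numbers. The wrapper function `redact_phone_numbers` and the sample sentence are hypothetical. The sketch also assumes the default configuration, which typically needs a spaCy English model to be available (for example via `python -m spacy download en_core_web_lg`).

```python
def redact_phone_numbers(text: str) -> str:
    """Detect phone numbers with the Analyzer, then replace them with the Anonymizer."""
    # Imports are kept inside this (hypothetical) helper so the file can be
    # loaded even on a machine where Presidio is not installed yet.
    from presidio_analyzer import AnalyzerEngine
    from presidio_anonymizer import AnonymizerEngine

    analyzer = AnalyzerEngine()  # default setup; assumes a spaCy English model is present
    results = analyzer.analyze(text=text, entities=["PHONE_NUMBER"], language="en")

    anonymizer = AnonymizerEngine()
    anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
    return anonymized.text

if __name__ == "__main__":
    print(redact_phone_numbers("My phone number is 212-555-5555"))
```

Omitting the `entities` argument asks the Analyzer to look for every entity type it supports, which is usually the better starting point when the kinds of PII in the data are not known in advance.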
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Presidio automates PII detection but cannot guarantee that it finds every piece of sensitive information, so it should be paired with additional systems and safeguards for comprehensive data protection.