detextify by iuliaturc

Python library for removing text artifacts from AI-generated images

Created 3 years ago

303 stars

Top 88.4% on SourcePulse

1 Expert Loves This Project

Shawn Wang

Editor of Latent Space

Project Summary

Detextify is a Python library designed to remove unwanted text artifacts from images generated by AI models like Stable Diffusion, Midjourney, and DALL-E. It targets users and developers working with generative AI who need to improve the quality and usability of AI-generated images by cleaning up common text-based distortions.

How It Works

The library works in two stages: text detection and in-painting. It first locates text regions in an image with a chosen text detector (e.g., Tesseract or the Azure Computer Vision API) and masks them. It then passes the masked image to an in-painting backend (e.g., a local Stable Diffusion model, the OpenAI DALL-E 2 API, or the Replicate API), which fills in the masked areas, removing the text and reconstructing the background. Because the two stages are independent, each one can run either locally or through an API-based service.
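The detect, mask, and fill flow can be illustrated with a toy sketch. This is not Detextify's code or API: the `TextBox` type, the function names, and the list-of-lists image are all hypothetical, and the mean-value fill is only a stand-in for a real in-painting model.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # toy grayscale image: rows of pixel intensities


@dataclass(frozen=True)
class TextBox:
    """Axis-aligned region reported by a text detector (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def build_mask(width: int, height: int, boxes: List[TextBox]) -> List[List[bool]]:
    """Stage 1 output: True marks pixels that should be regenerated."""
    mask = [[False] * width for _ in range(height)]
    for box in boxes:
        # Clamp each box to the image bounds before marking it.
        for row in range(max(0, box.y), min(height, box.y + box.h)):
            for col in range(max(0, box.x), min(width, box.x + box.w)):
                mask[row][col] = True
    return mask


def fill_masked(image: Image, mask: List[List[bool]]) -> Image:
    """Stage 2 stand-in: replace masked pixels with the mean of unmasked ones.

    A real in-painter would synthesize plausible background content instead.
    """
    kept = [p for img_row, m_row in zip(image, mask)
            for p, m in zip(img_row, m_row) if not m]
    fill = round(sum(kept) / len(kept)) if kept else 0
    return [[fill if m else p for p, m in zip(img_row, m_row)]
            for img_row, m_row in zip(image, mask)]


def remove_text(image: Image, boxes: List[TextBox]) -> Image:
    """Run both stages: mask the detected boxes, then fill them."""
    height, width = len(image), len(image[0])
    return fill_masked(image, build_mask(width, height, boxes))
```

In this sketch the detector's only job is to produce boxes and the in-painter's only job is to fill a mask, which is what lets either stage be swapped for a local or API-based implementation.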

Quick Start & Requirements

Install via pip: pip install detextify
Local text detection requires Tesseract OCR to be installed.
Local in-painting defaults to Stable Diffusion v2 and requires a GPU with CUDA and cuDNN installed.
API-based options require relevant API keys (Azure, OpenAI, Replicate).
See the Colab notebook for usage examples.
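Before choosing between local and API-based backends, it can help to check which prerequisites are present. The following is a rough sketch using only the Python standard library. The environment variable names are placeholders chosen for illustration, not names that Detextify is known to read, and finding `nvidia-smi` on the PATH is only a heuristic for a usable CUDA setup.

```python
import os
import shutil
from typing import Dict, Mapping, Optional


def check_prerequisites(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report which local tools and API credentials appear to be available."""
    env = os.environ if env is None else env
    return {
        # Local text detection needs the Tesseract binary on the PATH.
        "tesseract_binary": shutil.which("tesseract") is not None,
        # Heuristic only: the NVIDIA driver tool suggests a CUDA-capable GPU.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        # Placeholder variable names (assumptions, adjust to your setup).
        "azure_key": bool(env.get("AZURE_CV_KEY")),
        "openai_key": bool(env.get("OPENAI_API_KEY")),
        "replicate_key": bool(env.get("REPLICATE_API_TOKEN")),
    }


if __name__ == "__main__":
    for name, available in check_prerequisites().items():
        print(f"{name}: {'found' if available else 'missing'}")
```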

Highlighted Details

Supports multiple text detection backends: Tesseract (local) and Azure Computer Vision (API).
Offers various in-painting options: Local Stable Diffusion (GPU required), Replicate API, and OpenAI DALL-E 2 API.
Provides a clear Python API for programmatic use and batch processing of images.
Designed to be extensible with custom text detectors and in-painters.
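One way a pluggable detector and in-painter design with batch processing could be structured is sketched below. The class and method names (`TextDetector.detect`, `Inpainter.inpaint`, `process_batch`) are hypothetical and are not taken from Detextify's source; the concrete classes are trivial placeholders that only show where a custom implementation would plug in.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


class TextDetector(ABC):
    """Hypothetical interface: return boxes around text in an image file."""

    @abstractmethod
    def detect(self, image_path: Path) -> List[Box]: ...


class Inpainter(ABC):
    """Hypothetical interface: write a copy of the image with boxes filled in."""

    @abstractmethod
    def inpaint(self, image_path: Path, boxes: List[Box], out_path: Path) -> None: ...


class FixedBannerDetector(TextDetector):
    """Example custom detector that always reports one fixed region."""

    def __init__(self, box: Box) -> None:
        self.box = box

    def detect(self, image_path: Path) -> List[Box]:
        return [self.box]


class CopyInpainter(Inpainter):
    """Placeholder in-painter: copies the file unchanged (no real in-painting)."""

    def inpaint(self, image_path: Path, boxes: List[Box], out_path: Path) -> None:
        out_path.write_bytes(image_path.read_bytes())


def process_batch(paths: Iterable[Path], out_dir: Path,
                  detector: TextDetector, inpainter: Inpainter) -> List[Path]:
    """Run detection and in-painting over many files; return the output paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for path in paths:
        out_path = out_dir / path.name
        boxes = detector.detect(path)
        if boxes:
            inpainter.inpaint(path, boxes, out_path)
        else:
            # Nothing detected: pass the image through untouched.
            out_path.write_bytes(path.read_bytes())
        outputs.append(out_path)
    return outputs
```

Because `process_batch` depends only on the two abstract interfaces, a new detector or in-painter can be added without changing the batch loop.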

Maintenance & Community

Authored by Mihail Eric and Julia Turc. Contributions are welcome via pull requests; contributors clone the repository and set up the development environment with Poetry.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Local in-painting is computationally intensive and requires a GPU with CUDA and cuDNN set up. The quality of text removal depends on how accurately the detector locates the text and on how well the in-painting model reconstructs the background.

Health Check

Last Commit: 1 year ago

Responsiveness: Inactive

Star History

4 stars in the last 30 days