detextify by iuliaturc

Python library for removing text artifacts from AI-generated images

Created 2 years ago
296 stars

Top 89.5% on SourcePulse

Project Summary

Detextify is a Python library designed to remove unwanted text artifacts from images generated by AI models like Stable Diffusion, Midjourney, and DALL-E. It targets users and developers working with generative AI who need to improve the quality and usability of AI-generated images by cleaning up common text-based distortions.

How It Works

The library works in two stages: text detection and in-painting. It first identifies and masks text regions within an image using a chosen text detector (e.g., Tesseract or the Azure Computer Vision API). It then passes the image and mask to an in-painting backend (e.g., a local Stable Diffusion model, the OpenAI DALL-E 2 API, or the Replicate API), which fills in the masked areas, removing the text and reconstructing the background. Because the two stages are independent, either one can run locally or through an API-based service.
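
The two stages can be illustrated with a toy sketch. This is not the library's implementation: the names Box, build_mask, fill_masked, and remove_text are hypothetical, the image is a plain grid of grayscale values, the text boxes are supplied by hand instead of by a detector, and a mean fill stands in for a real in-painting model.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # grayscale pixel grid, a stand-in for a real image


@dataclass(frozen=True)
class Box:
    """Axis-aligned text region (hypothetical type); right/bottom edges are exclusive."""
    x: int
    y: int
    w: int
    h: int


def build_mask(width: int, height: int, boxes: List[Box]) -> List[List[bool]]:
    """Stage 1 output: True marks pixels covered by any detected text box."""
    mask = [[False] * width for _ in range(height)]
    for b in boxes:
        # Clip each box to the image bounds before marking it.
        for row in range(max(0, b.y), min(height, b.y + b.h)):
            for col in range(max(0, b.x), min(width, b.x + b.w)):
                mask[row][col] = True
    return mask


def fill_masked(image: Image, mask: List[List[bool]]) -> Image:
    """Stage 2 stand-in: overwrite masked pixels with the mean of unmasked ones.

    A real in-painter would synthesize plausible background here; the mean
    fill only keeps this sketch runnable without a model.
    """
    kept = [p for img_row, m_row in zip(image, mask)
            for p, m in zip(img_row, m_row) if not m]
    fill = round(sum(kept) / len(kept)) if kept else 0
    return [[fill if m else p for p, m in zip(img_row, m_row)]
            for img_row, m_row in zip(image, mask)]


def remove_text(image: Image, boxes: List[Box]) -> Image:
    """Run both stages: mask the given boxes, then fill the mask."""
    height, width = len(image), len(image[0])
    return fill_masked(image, build_mask(width, height, boxes))
```

Keeping the mask as the only hand-off between the stages is what makes them swappable: any detector that yields boxes and any in-painter that accepts a mask can be combined.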

Quick Start & Requirements

  • Install via pip: pip install detextify
  • Local execution requires Tesseract OCR for text detection.
  • Local in-painting requires a GPU with CUDA and cuDNN installed; it defaults to Stable Diffusion v2.
  • API-based options require relevant API keys (Azure, OpenAI, Replicate).
  • See the Colab notebook for usage examples.
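
Before choosing between the local and API-based options above, it can help to check which prerequisites are present. The sketch below uses only the Python standard library and does not call detextify. The function name check_prerequisites and the environment variable names are assumptions for illustration (AZURE_CV_KEY in particular is made up), so check each provider's documentation for the names it actually expects. Finding nvidia-smi on the PATH is only a rough hint that an NVIDIA driver is installed; it does not confirm a working CUDA or cuDNN setup.

```python
import os
import shutil
from typing import Callable, Dict, Mapping, Optional

# Assumed environment variable names, for illustration only.
API_KEY_VARS = {
    "azure": "AZURE_CV_KEY",
    "openai": "OPENAI_API_KEY",
    "replicate": "REPLICATE_API_TOKEN",
}


def check_prerequisites(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which local tools and API credentials appear to be available."""
    env = os.environ if env is None else env
    report = {
        # Tesseract binary on PATH, needed for local text detection.
        "tesseract": which("tesseract") is not None,
        # Rough hint only: an NVIDIA driver utility is on PATH.
        "nvidia_smi": which("nvidia-smi") is not None,
    }
    for provider, var in API_KEY_VARS.items():
        # A non-empty value counts as "configured"; validity is not checked.
        report[provider] = bool(env.get(var))
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```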

Highlighted Details

  • Supports multiple text detection backends: Tesseract (local) and Azure Computer Vision (API).
  • Offers various in-painting options: Local Stable Diffusion (GPU required), Replicate API, and OpenAI DALL-E 2 API.
  • Provides a clear Python API for programmatic use and batch processing of images.
  • Designed to be extensible, so custom text detectors and in-painters can be added.
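
The pluggable design and batch use described above might look like the following sketch. It is an assumption about shape, not detextify's actual API: the TextDetector and Inpainter protocols, the process_batch helper, and the two toy backends are all hypothetical names, and no image files are read or written.

```python
from typing import Dict, List, Protocol, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), a hypothetical layout


class TextDetector(Protocol):
    def detect(self, image_path: str) -> List[Box]:
        """Return bounding boxes of text found in the image."""
        ...


class Inpainter(Protocol):
    def inpaint(self, image_path: str, boxes: List[Box], out_path: str) -> None:
        """Write a copy of the image with the boxed regions filled in."""
        ...


def process_batch(detector: TextDetector, inpainter: Inpainter,
                  jobs: Sequence[Tuple[str, str]]) -> List[int]:
    """Run detect-then-inpaint over (input_path, output_path) pairs.

    Returns the number of boxes found per image. Images with no detected
    text are skipped, so the in-painter is never called for them.
    """
    counts = []
    for in_path, out_path in jobs:
        boxes = detector.detect(in_path)
        if boxes:
            inpainter.inpaint(in_path, boxes, out_path)
        counts.append(len(boxes))
    return counts


class FixedDetector:
    """Toy detector: looks up boxes by path; unknown paths have no text."""

    def __init__(self, table: Dict[str, List[Box]]):
        self.table = table

    def detect(self, image_path: str) -> List[Box]:
        return list(self.table.get(image_path, []))


class RecordingInpainter:
    """Toy in-painter: records requests instead of touching any files."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int, str]] = []

    def inpaint(self, image_path: str, boxes: List[Box], out_path: str) -> None:
        self.calls.append((image_path, len(boxes), out_path))
```

Any class with matching detect or inpaint methods satisfies the protocols without inheriting from them, which is one way a custom backend could be dropped in. Note that skipping clean images means no output file is produced for them in this sketch.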

Maintenance & Community

Authored by Mihail Eric and Julia Turc. Contributions are welcome via pull request; contributors clone the repository and set up the development environment with Poetry.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Local in-painting is computationally intensive and requires specific GPU hardware and CUDA setup. The effectiveness of text removal depends on the quality of the text detection and the in-painting model's ability to reconstruct the background.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
