manga-image-translator  by zyddnys

Image translator for manga/images, supporting multiple languages

created 4 years ago
8,217 stars

Top 6.4% on sourcepulse

Project Summary

This project provides an automated solution for translating text within images, primarily targeting manga. It supports Japanese, Chinese, English, and Korean, offering features like inpainting, text rendering, and colorization, making it valuable for manga scanlators and enthusiasts.

How It Works

The system runs as a pipeline. A text detector (e.g., ctd, craft) first locates text regions, and an OCR model then extracts the text from them. The extracted text is translated by one of several online or offline translators (including DeepL, Google, Sugoi, and M2M100), and the translation is rendered back onto the image, often after inpainting has covered the original text. Advanced options cover upscaling, font customization, and glossary integration for improved translation accuracy.
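The stage order described above can be sketched as a small function that wires five pluggable steps together. This is only an illustration of the data flow, not the project's actual code: `translate_page`, `TextRegion`, `PageResult`, and the toy stand-in stages in `demo` are all hypothetical names, and a dictionary stands in for real image data.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass
class TextRegion:
    box: Box
    source_text: str = ""
    translated_text: str = ""


@dataclass
class PageResult:
    regions: List[TextRegion] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)  # order in which stages ran


def translate_page(
    image: object,
    detect: Callable[[object], List[Box]],
    recognize: Callable[[object, Box], str],
    translate: Callable[[str], str],
    inpaint: Callable[[object, List[Box]], object],
    render: Callable[[object, List[TextRegion]], object],
) -> Tuple[object, PageResult]:
    """Run the five stages in order (hypothetical interface, for illustration)."""
    result = PageResult()

    # 1. Detection: find where the text is.
    boxes = detect(image)
    result.steps.append("detect")

    # 2. OCR: read the text inside each detected box.
    result.regions = [TextRegion(box=b, source_text=recognize(image, b)) for b in boxes]
    result.steps.append("ocr")

    # 3. Translation: translate each recognized string.
    for region in result.regions:
        region.translated_text = translate(region.source_text)
    result.steps.append("translate")

    # 4. Inpainting: erase the original text from the image.
    cleaned = inpaint(image, boxes)
    result.steps.append("inpaint")

    # 5. Rendering: draw the translated text onto the cleaned image.
    output = render(cleaned, result.regions)
    result.steps.append("render")
    return output, result


def demo():
    """Toy stand-ins for every stage; a dict plays the role of the image."""
    glossary = {"こんにちは": "Hello"}
    image = {"text_at": {(10, 20, 80, 30): "こんにちは"}}
    return translate_page(
        image,
        detect=lambda img: list(img["text_at"]),
        recognize=lambda img, box: img["text_at"][box],
        translate=lambda text: glossary.get(text, text),
        inpaint=lambda img, boxes: {"text_at": {}},
        render=lambda img, regions: {"text_at": {r.box: r.translated_text for r in regions}},
    )
```

Passing each stage as a callable mirrors the idea that detectors, OCR models, translators, and inpainters are interchangeable. In the real tool those choices are made through its own configuration, which this sketch does not attempt to reproduce.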

Quick Start & Requirements

  • Installation: pip install -r requirements.txt (Python >= 3.10 recommended).
  • GPU Support: Requires PyTorch with CUDA support for GPU acceleration. Instructions available at pytorch.org.
  • Docker: A large (~15GB) Docker image is available for easier setup and GPU acceleration (--gpus=all).
  • Dependencies: Microsoft C++ Build Tools may be needed on Windows.
  • Docs: Official Demo, Userscript, API Docs.
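
Before installing, it can help to confirm that the interpreter and PyTorch setup match the requirements listed above. The following is a minimal sketch of such a check; `check_environment` is a hypothetical helper written for this summary, not something shipped with the project. It relies only on the standard library plus `torch.cuda.is_available()` when PyTorch is present.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report whether Python is new enough and whether PyTorch can see a CUDA GPU."""
    report = {
        "python_version": "%d.%d" % sys.version_info[:2],
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch

            # True only if this PyTorch build has CUDA support and a GPU is visible.
            report["cuda_available"] = bool(torch.cuda.is_available())
        except ImportError:
            # A broken or partial install is treated the same as "not installed".
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as false, the tool would be expected to fall back to CPU, which is typically much slower for detection and inpainting models.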

Highlighted Details

  • Supports multiple text detectors, OCR models, and translation engines, including offline options like Sugoi and M2M100.
  • Features inpainting (e.g., lama_large) to seamlessly cover original text.
  • Offers advanced rendering options, including font path specification and upscaling for small text.
  • Provides CLI, Web UI, and API modes for flexible usage.

Maintenance & Community

  • Active development with a Discord community available at discord.gg/Ak8APNy4vb.
  • Project is a successor to MMDOCR-HighPerformance.

Licensing & Compatibility

  • The specific license is not explicitly stated in the README, but the project is described as a "hobby project" seeking contributions. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

  • The text rendering engine is described as "barely usable". Text detection finds text lines rather than speech bubbles, which makes placing the translated text harder.
  • The online demo may experience downtime due to instance restarts.
  • GIMP integration for certain output formats has limitations with rotated text.
Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 12
  • Issues (30d): 25

Star History

  • 828 stars in the last 90 days
