Saber-Translator  by MashiroSaber03

AI tool for translating and editing manga, supporting multiple models

Created 7 months ago
1,797 stars

Top 23.9% on SourcePulse

Project Summary

Saber-Translator is an AI-powered tool designed for manga enthusiasts to overcome language barriers, offering automated translation and editing for manga images and PDFs. It targets users who want to read Japanese manga in Chinese, providing a comprehensive solution from text detection to background repair and customizable rendering.

How It Works

The tool runs a multi-stage AI pipeline. YOLOv5 first detects text regions within manga panels, and an OCR engine then extracts the text (Manga OCR for Japanese, PaddleOCR for other languages). Translation is handled by cloud services (SiliconFlow, DeepSeek, Volcano Engine, Caiyun) or by local LLMs (Ollama, Sakura), with API key management for the cloud options. To hide the original lettering, the tool inpaints the background under each text region using LAMA or MI-GAN models, then renders the translated text with customizable styling.
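The stage order described above can be sketched as a chain of pluggable functions. This is a minimal illustration, not the project's actual code: `Bubble`, `pick_ocr_engine`, `run_pipeline`, and every `fake_*` stage are hypothetical names, and the stub stages stand in for the real detection, OCR, translation, inpainting, and rendering models.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class Bubble:
    """One detected text region and the text attached to it (hypothetical type)."""
    box: Box
    source_text: str = ""
    translated_text: str = ""


def pick_ocr_engine(source_lang: str) -> str:
    """Mirror the engine choice described above: Manga OCR for Japanese, PaddleOCR otherwise."""
    return "manga_ocr" if source_lang == "ja" else "paddle_ocr"


def run_pipeline(
    image: Any,
    detect: Callable[[Any], List[Box]],
    ocr: Callable[[Any, Box], str],
    translate: Callable[[str], str],
    inpaint: Callable[[Any, List[Box]], Any],
    render: Callable[[Any, List[Bubble]], Any],
) -> Tuple[Any, List[Bubble]]:
    """Run detection -> OCR -> translation -> inpainting -> rendering, in that order."""
    bubbles = [Bubble(box=b) for b in detect(image)]
    for bubble in bubbles:
        bubble.source_text = ocr(image, bubble.box)
        bubble.translated_text = translate(bubble.source_text)
    cleaned = inpaint(image, [b.box for b in bubbles])
    return render(cleaned, bubbles), bubbles


# Stub stages (assumptions for illustration only; real stages would call the models).
def fake_detect(image: dict) -> List[Box]:
    return [(10, 10, 50, 20)]


def fake_ocr(image: dict, box: Box) -> str:
    return "こんにちは"


def fake_translate(text: str) -> str:
    return {"こんにちは": "你好"}.get(text, text)


def fake_inpaint(image: dict, boxes: List[Box]) -> dict:
    return {**image, "erased": boxes}


def fake_render(image: dict, bubbles: List[Bubble]) -> dict:
    return {**image, "texts": [b.translated_text for b in bubbles]}


if __name__ == "__main__":
    page, found = run_pipeline(
        {"name": "page1.png"},
        fake_detect, fake_ocr, fake_translate, fake_inpaint, fake_render,
    )
    print(pick_ocr_engine("ja"), page, found)
```

Passing each stage in as a function keeps the ordering fixed while letting any single stage be swapped, for example a cloud translator for a local LLM, without touching the rest.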

Quick Start & Requirements

  • Download the latest release executable for your OS from the Releases page.
  • Run the executable; it typically opens a web interface at http://127.0.0.1:5000/.
  • Supports image (JPG, PNG) and PDF formats.
  • Backend: Python 3.10+, PyTorch, ONNX Runtime, PaddleOCR. Frontend: HTML5, CSS3, JavaScript.
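
After launching the executable, a quick way to confirm that the local web interface is reachable is to request the address given above. The snippet below is a small sketch using only the Python standard library; the function name `web_ui_is_up` is hypothetical, and the default URL assumes the tool is listening on its usual address, which may differ on your machine.

```python
import urllib.error
import urllib.request


def web_ui_is_up(url: str = "http://127.0.0.1:5000/", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if web_ui_is_up():
        print("Web interface is reachable.")
    else:
        print("Nothing answered at the default address; check that the executable is running.")
```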

Highlighted Details

  • Supports manual annotation for text box correction and manual editing of translated text and styles.
  • Features session management to save and load work progress.
  • Includes a plugin system for extending functionality.
  • Offers AI-powered background repair using LAMA for a clean reading experience.

Maintenance & Community

The project is actively maintained, with a roadmap indicating future enhancements like more AI service integrations and improved layout logic. Sponsorships are accepted via WeChat and Alipay. Contributions are welcomed via Pull Requests and Issues.

Licensing & Compatibility

The project is described as available under a permissive license, suitable for commercial use and integration with closed-source projects. However, the README includes a disclaimer stating that the tool is primarily for learning and technical exchange and prohibiting illegal or commercial use. The two statements conflict, so the repository's license file is the authoritative reference for permitted use.

Limitations & Caveats

The project relies on third-party AI services, whose quality, availability, and cost are determined by the providers. AI-generated translations may contain errors or inaccuracies, and the project disclaims responsibility for translation quality. The MI-GAN background repair is noted as having generally average results compared to LAMA.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 8
  • Star history: 194 stars in the last 30 days
