PDF tool for layout-preserving translation
PolyglotPDF is a multilingual eBook processing tool designed for efficient, layout-preserving translation of PDF documents. It targets users needing to translate scanned or digital PDFs, offering both online and offline translation capabilities with a focus on maintaining original formatting and achieving high performance.
How It Works
The tool uses PyMuPDF to recognize and manipulate text blocks directly, which allows very fast processing of text, tables, and formulas within PDFs, often under 1 second per page. This approach prioritizes performance and cost-effectiveness: it avoids computationally intensive AI-based formula recognition and page restructuring, and focuses instead on accurate text extraction and translation while preserving layout. LLMs are now the primary translation backend, with recommended models including Doubao, Qwen, DeepSeek V3, and GPT-4o-mini.
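The block-level approach described above can be illustrated with a minimal sketch. It is not PolyglotPDF's actual code: the function names plan_replacements and translate_pdf, the translate callback, the white fill, and the fixed font size are all assumptions made for illustration. Only the PyMuPDF calls themselves (get_text("blocks"), add_redact_annot, apply_redactions, insert_textbox) are real library APIs.

```python
from typing import Callable, Iterable, List, Tuple

# Shape of one entry returned by PyMuPDF's page.get_text("blocks"):
# (x0, y0, x1, y1, text, block_no, block_type); block_type 0 = text, 1 = image.
Block = Tuple[float, float, float, float, str, int, int]
Rect = Tuple[float, float, float, float]


def plan_replacements(
    blocks: Iterable[Block], translate: Callable[[str], str]
) -> List[Tuple[Rect, str]]:
    """Pair each text block's bounding box with its translated text."""
    plan = []
    for x0, y0, x1, y1, text, _block_no, block_type in blocks:
        text = text.strip()
        if block_type != 0 or not text:
            continue  # skip image blocks and empty text blocks
        plan.append(((x0, y0, x1, y1), translate(text)))
    return plan


def translate_pdf(src: str, dst: str, translate: Callable[[str], str]) -> None:
    """Hypothetical driver: erase each text block, then write its translation
    into the same rectangle so the page layout stays in place."""
    import fitz  # PyMuPDF; imported lazily so plan_replacements runs without it

    doc = fitz.open(src)
    for page in doc:
        plan = plan_replacements(page.get_text("blocks"), translate)
        for rect, _ in plan:
            # Mark the original text for removal, filling the area with white.
            page.add_redact_annot(fitz.Rect(rect), fill=(1, 1, 1))
        page.apply_redactions()
        for rect, new_text in plan:
            # Assumption: fixed 9 pt default font. Non-Latin targets need a
            # suitable fontname/fontfile, and text that does not fit the
            # rectangle is not inserted (insert_textbox returns a negative value).
            page.insert_textbox(fitz.Rect(rect), new_text, fontsize=9)
    doc.save(dst)
    doc.close()
```

In this sketch translate would wrap whichever translation backend is configured. Keeping the planning step separate from the PyMuPDF calls means the block filtering can be exercised with plain tuples, without opening a PDF.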
Quick Start & Requirements
Install: run pip install -r requirements.txt, configure API keys in config.json, and start the app with python app.py.
Docker: docker pull 2207397265/polyglotpdf:latest. Quick start without persistence or with persistent storage using volume mounts is detailed.
Access: http://127.0.0.1:8000 (or http://localhost:12226 for Docker).
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The tool may encounter issues with unsupported color spaces during text re-editing, with a proposed workaround involving OCR for affected pages. Complex vector mathematical formulas in tables are not correctly handled. The project is described as a demonstration of layout-preserved translation and AI-assisted reading, not a comprehensive PDF editor.