Mega-Gorilla
Local PDF translator preserving document layout
Top 89.1% on SourcePulse
Summary
This project is a local command-line tool (formerly a web service) that translates academic PDFs while preserving their original layout. It identifies and classifies text blocks before translation so that complex documents are translated accurately, which is useful for researchers and academics reading foreign-language papers.
How It Works
The tool uses PyMuPDF to extract text and its coordinates from PDFs. It uses spaCy to classify each text block as main body text, a figure/table caption, or an element to ignore. This classification feeds a cross-block translation approach that merges sentences fragmented across block and page boundaries, so each sentence is translated with its full context. The merged text is then translated by a pluggable backend (Google, DeepL, or OpenAI) and re-inserted into a new PDF; a side-by-side comparison PDF can optionally be generated.
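The cross-block merging step can be illustrated with a minimal sketch. This is not the project's implementation: the `Block` type, the `"body"`/`"caption"`/`"ignore"` labels, and the `merge_fragments` function are hypothetical names, and the sentence-end test is a simple punctuation check that assumes a space-separated language such as English.

```python
import re
from dataclasses import dataclass

# A body fragment counts as "complete" when it ends with sentence punctuation,
# optionally followed by a closing bracket or quote. (Assumed heuristic.)
_SENTENCE_END = re.compile(r"[.!?][)\]']*$")


@dataclass
class Block:
    page: int
    kind: str  # hypothetical labels: "body", "caption", "ignore"
    text: str


def merge_fragments(blocks):
    """Join consecutive body blocks until the running text ends a sentence.

    Returns (text, members) pairs; members are the source blocks, kept so
    translated text could later be distributed back to their positions.
    """
    merged = []
    pending_text, pending_members = "", []
    for block in blocks:
        if block.kind == "ignore":
            continue  # page numbers, headers, and similar are dropped
        if block.kind != "body":
            # Captions are emitted on their own and do not interrupt
            # a body sentence that is still being assembled.
            merged.append((block.text.strip(), [block]))
            continue
        pending_text = f"{pending_text} {block.text.strip()}".strip()
        pending_members.append(block)
        if _SENTENCE_END.search(pending_text):
            merged.append((pending_text, pending_members))
            pending_text, pending_members = "", []
    if pending_members:  # trailing fragment with no sentence end
        merged.append((pending_text, pending_members))
    return merged


blocks = [
    Block(1, "body", "The model is trained on"),
    Block(1, "caption", "Figure 1: Overview."),
    Block(2, "ignore", "12"),
    Block(2, "body", "a large corpus. Results follow."),
]
for text, members in merge_fragments(blocks):
    print([m.page for m in members], text)
```

In this sketch the sentence split across pages 1 and 2 is translated as one unit, while the caption between the two fragments is handled separately.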
Quick Start & Requirements
Requires Python 3.11+. Install with uv sync or pip install -r requirements.txt, then download the spaCy language models (en_core_web_sm, ja_core_news_sm). Google Translate is the default backend and requires no API key. The DeepL and OpenAI backends require API keys and may require additional package installs (index-pdf-translation[deepl], index-pdf-translation[openai]).
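The backend requirements above can be expressed as a small pre-flight check. This is a sketch under assumptions: the `resolve_backend` function and the environment variable names are hypothetical and are not taken from the project, which may read its keys differently.

```python
import os

# Backend name -> environment variable expected to hold its API key.
# The variable names are assumptions for illustration only.
REQUIRED_KEY = {
    "google": None,  # default backend, no key needed
    "deepl": "DEEPL_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def resolve_backend(name="google", env=None):
    """Validate a backend choice and confirm that its API key is present."""
    env = os.environ if env is None else env
    if name not in REQUIRED_KEY:
        raise ValueError(f"unknown backend: {name!r}")
    var = REQUIRED_KEY[name]
    if var is not None and not env.get(var):
        raise RuntimeError(f"backend {name!r} requires {var} to be set")
    return name


print(resolve_backend())  # google
```

Failing early on a missing key avoids extracting and classifying a whole PDF before the translation step rejects the request.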
Highlighted Details
Maintenance & Community
The provided README does not contain specific details regarding maintainers, community channels (like Discord/Slack), or project roadmaps.
Licensing & Compatibility
Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This strong copyleft license requires that distributed modifications and derivative works also be released under the AGPL-3.0, and that the source of a modified version be offered to users who interact with it over a network. These terms may restrict use in closed-source projects.
Limitations & Caveats
The tool cannot process scanned PDFs that lack an OCR layer, as text extraction will fail. Complex PDF layouts may lead to inaccurate block classification or text insertion issues. While the debug mode aids analysis, resolving intricate layout problems might require manual intervention.
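A scanned PDF without a text layer can be detected before translation by checking whether extraction returned any text. The sketch below is an illustration, not part of the tool: `pages_without_text` is a hypothetical helper that takes the already-extracted text of each page (for example, the strings returned by PyMuPDF's `page.get_text()`), and `min_chars` is an assumed threshold.

```python
def pages_without_text(page_texts, min_chars=1):
    """Return 1-based numbers of pages whose extracted text is effectively empty."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


# Page 1 has a text layer; pages 2 and 3 yield only whitespace or nothing.
print(pages_without_text(["1 Introduction ...", "  \n", ""]))  # [2, 3]
```

If every page is reported, the document most likely needs OCR before it can be translated.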