PDFMathTranslate by Byaidu

CLI tool for PDF scientific paper translation, preserving format

created 11 months ago
26,046 stars

Top 1.6% on sourcepulse

Project Summary

This project translates scientific PDFs while preserving complex formatting, including formulas, charts, and tables of contents. It targets researchers, academics, and anyone else who needs to read scientific literature written in another language, making foreign-language papers accessible without losing their structure.

How It Works

The system uses AI-driven layout analysis, specifically the DocLayout-YOLO model, to parse the structure of scientific documents. It extracts text, identifies elements such as formulas and tables, and then reassembles the page with translated content in the original layout. It supports multiple translation services (Google, DeepL, Ollama, OpenAI) and can be used through a CLI, a GUI, or Docker.
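The extract, translate, and reassemble flow described above can be illustrated with a minimal sketch. This is not the project's code: the Region type, the region kinds, and translate_regions are hypothetical names, the translator is any caller-supplied function, and the choice to translate only regions labelled "text" is an assumption made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Region:
    """One block found by layout analysis (hypothetical structure)."""
    kind: str                                  # e.g. "text", "formula", "table"
    bbox: Tuple[float, float, float, float]    # position on the page
    content: str


def translate_regions(
    regions: List[Region],
    translate: Callable[[str], str],
) -> List[Region]:
    """Translate text regions; pass every other region through unchanged.

    Each region keeps its bounding box, so the output can be placed back
    at the same position as the original.
    """
    result = []
    for region in regions:
        if region.kind == "text":
            # Only the content changes; kind and bbox are preserved.
            result.append(replace(region, content=translate(region.content)))
        else:
            result.append(region)
    return result
```

Keeping the bounding box attached to each region is what lets the translated text be written back where the original text was, while formulas and other non-text regions stay untouched.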

Quick Start & Requirements

  • CLI/GUI/Docker:
    • CLI/GUI: Requires Python 3.10-3.12. Install via pip install pdf2zh or uv tool install --python 3.12 pdf2zh. Run with pdf2zh document.pdf or pdf2zh -i for GUI.
    • Docker: docker pull byaidu/pdf2zh and run docker run -d -p 7860:7860 byaidu/pdf2zh.
  • Dependencies: The layout model wybxc/DocLayout-YOLO-DocStructBench-onnx is required. If network issues affect access to the model, set HF_ENDPOINT=https://hf-mirror.com. Windows users may need vc_redist.x64.exe.
  • Online Demos: Available via Immersive Translate - BabelDOC (1000 free pages/month), HuggingFace, and ModelScope.
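For scripted use, the commands above can be wrapped in a small helper. The sketch below uses only the invocations listed in this section (pdf2zh document.pdf, pdf2zh -i, and the HF_ENDPOINT mirror setting); the function names build_command, build_env, and run_pdf2zh are hypothetical, and the helper assumes pdf2zh is already installed and on PATH.

```python
import os
import shutil
import subprocess
from typing import Dict, List, Optional

# Mirror endpoint taken from the Dependencies note above.
HF_MIRROR = "https://hf-mirror.com"


def build_command(pdf_path: Optional[str] = None, gui: bool = False) -> List[str]:
    """Return the argv for translating one PDF or for launching the GUI."""
    if gui:
        return ["pdf2zh", "-i"]
    if not pdf_path:
        raise ValueError("pdf_path is required unless gui=True")
    return ["pdf2zh", pdf_path]


def build_env(use_mirror: bool = False,
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment, optionally pointing HF_ENDPOINT at the mirror."""
    env = dict(os.environ if base is None else base)
    if use_mirror:
        env["HF_ENDPOINT"] = HF_MIRROR
    return env


def run_pdf2zh(pdf_path: Optional[str] = None, gui: bool = False,
               use_mirror: bool = False) -> int:
    """Run pdf2zh and return its exit code (hypothetical wrapper)."""
    if shutil.which("pdf2zh") is None:
        raise FileNotFoundError("pdf2zh not found on PATH; install it first")
    completed = subprocess.run(build_command(pdf_path, gui),
                               env=build_env(use_mirror))
    return completed.returncode
```

Calling run_pdf2zh("document.pdf", use_mirror=True) would run the same command as the CLI example with the mirror variable set for that process only. Building the argument list separately from running it makes the command easy to inspect or log first.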

Highlighted Details

  • Preserves formulas, charts, table of contents, and annotations.
  • Supports multiple translation services, including local Ollama and Azure OpenAI.
  • Offers CLI, GUI (web-based via Gradio), Docker, and Zotero plugin integrations.
  • Experimental support for BabelDOC backend and non-PDF/A documents.

Maintenance & Community

Recent updates include experimental BabelDOC support, improved Windows executables, and local model integration via Xinference. Community interaction is encouraged via GitHub Issues and a Telegram Group.

Licensing & Compatibility

The README does not state the license terms explicitly, so the license should not be assumed to be permissive. Check the repository's license file before commercial use or closed-source linking.

Limitations & Caveats

The project is under active development with several items listed in the TODOs, including improving layout parsing with other models, fixing specific formatting issues (page rotation, lists), and supporting non-PDF/A files. Some features like annotation translation are marked as "preview."

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 4
  • Issues (30d): 11

Star History

3,879 stars in the last 90 days
