arxiv-translator  by Leey21

Automated arXiv paper translation to Chinese PDF

Created 2 months ago
284 stars

Top 92.0% on SourcePulse

Project Summary

This project helps readers who are more comfortable in Chinese than in English work with arXiv research papers. It automates the translation of a paper's LaTeX source into a well-formatted Chinese PDF. It targets researchers and technical users who want to digest academic content quickly in Chinese without translating each paper by hand.

How It Works

The tool runs as a skill inside an Agent Skills environment. Given an arXiv identifier, it fetches the paper's LaTeX source, translates the main body text while leaving mathematical formulas, citations, and technical terms intact, and submits the modified .tex file to an online LuaLaTeX compilation service. This removes the need for a local LaTeX installation, and working from the source is intended to give better formatting and content fidelity than translating the PDF directly.
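The "translate the prose, keep the formulas and citations" step can be sketched as a mask-and-restore pass. This is a minimal illustration, not the project's actual code: the placeholder format, the regular expression, and the names mask and unmask are assumptions made for this example, and the pattern covers only a few common LaTeX constructs (it ignores escaped \$ and math environments, for instance).

```python
import re

# Spans that should pass through translation untouched.
# Illustrative and incomplete: a real LaTeX document needs more cases.
PROTECTED = re.compile(
    r"\$\$.*?\$\$"                                   # display math $$...$$
    r"|\$[^$]*\$"                                    # inline math $...$
    r"|\\(?:cite[pt]?|ref|eqref|label)\{[^}]*\}",    # citations and cross-references
    re.DOTALL,
)


def mask(text):
    """Replace protected spans with numbered placeholders (hypothetical format)."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return f"@@{len(spans) - 1}@@"

    return PROTECTED.sub(stash, text), spans


def unmask(text, spans):
    """Put the original spans back in place of their placeholders."""
    return re.sub(r"@@(\d+)@@", lambda m: spans[int(m.group(1))], text)
```

In use, only the masked string would be sent for translation. Because the placeholders carry an index, the translated sentence can reorder them freely and unmask still restores each span to the right place.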

Quick Start & Requirements

  • Install / run:
    • Clone the repository (git clone https://github.com/Leey21/arxiv-translator) or download it as a zip.
    • Install it as an Agent Skill by giving your agent (e.g., Codex, Claude Code, Cursor) the local path to the cloned repository.
  • Prerequisites and dependencies:
    • Python 3
    • requests library (pip install requests)
    • An Agent Skills compatible environment.
    • No local LaTeX distribution is required.
  • Setup time and resource footprint:
    • Setup is minimal: clone the repository and point your agent at it. Compilation is handled remotely.
  • Links:
    • GitHub Repository: https://github.com/Leey21/arxiv-translator
    • Online Compilation Service: https://latex.ytotech.com/builds/sync
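
A request to the compilation endpoint listed above might look like the following sketch. The JSON shape (a compiler name plus a resources list with one main document) reflects the LaTeX-On-HTTP API as best understood here and should be checked against that service's documentation. The function names build_payload and compile_pdf are hypothetical. The sketch uses the standard library's urllib rather than requests only to stay dependency-free.

```python
import json
import urllib.request

# Endpoint taken from the links above.
COMPILE_URL = "https://latex.ytotech.com/builds/sync"


def build_payload(tex_source, compiler="lualatex"):
    """Build the JSON body for a single-file compile (assumed request shape)."""
    return {
        "compiler": compiler,
        "resources": [{"main": True, "content": tex_source}],
    }


def compile_pdf(tex_source, timeout=120):
    """POST the document and return the raw response body (expected to be a PDF)."""
    body = json.dumps(build_payload(tex_source)).encode("utf-8")
    request = urllib.request.Request(
        COMPILE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx, e.g. a failed compile.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

A caller would write the returned bytes to a .pdf file. Papers with figures or multiple .tex files would need additional entries in resources, which this sketch does not cover.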

Highlighted Details

  • Preserves document structure, mathematical formulas, citations, and academic terminology during translation.
  • Leverages an external HTTP API for LaTeX compilation, eliminating local environment setup.
  • Translates based on semantic document structure (sections, abstract) rather than arbitrary page splits for better context.
  • Generates .tex source files, facilitating review, diffing, and localized re-translation.
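
Splitting on document structure rather than page boundaries can be illustrated with a small helper that cuts a LaTeX body at each \section command. This is a sketch under simplifying assumptions: the name split_by_section is hypothetical, only top-level \section and \section* commands at the start of a line are recognized, and the project itself may chunk differently.

```python
import re

# Matches \section{ or \section*{ at the start of a line.
SECTION = re.compile(r"^\\section\*?\{", re.MULTILINE)


def split_by_section(body):
    """Split a LaTeX body into chunks, each starting at a section command."""
    starts = [m.start() for m in SECTION.finditer(body)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first section
    bounds = starts + [len(body)]
    # Drop whitespace-only chunks; every other character is preserved.
    return [body[a:b] for a, b in zip(bounds, bounds[1:]) if body[a:b].strip()]
```

Each chunk is a contiguous slice of the input, so a single section can be translated again later and swapped back into the .tex file without touching the rest.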

Maintenance & Community

  • Relies on the external LaTeX-On-HTTP service for compilation.
  • Community interaction is encouraged via GitHub Issues for feedback and suggestions.

Licensing & Compatibility

  • License: The repository's license is not specified in the README.
  • Compatibility: Designed for Agent Skills platforms. Requires arXiv papers to have publicly available LaTeX source code.

Limitations & Caveats

The tool works only for arXiv papers that provide LaTeX source code; it cannot process PDF-only submissions. By default, translation covers the main body; translating appendices requires explicit user instruction. The lack of a stated license may block adoption wherever clear licensing terms are required.
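
One way to catch the PDF-only case early is to look at the first bytes of the downloaded file before attempting to unpack it. The sketch below assumes, without confirmation from the project, that source downloads arrive gzip-compressed and that PDF-only submissions arrive as a plain PDF. The function name classify_download is hypothetical.

```python
def classify_download(data: bytes) -> str:
    """Classify a downloaded file by its leading magic bytes."""
    if data.startswith(b"%PDF"):
        return "pdf"      # no LaTeX source available
    if data.startswith(b"\x1f\x8b"):
        return "gzip"     # likely a compressed source bundle
    return "unknown"
```

A caller could stop with a clear message on "pdf" rather than failing later during extraction or translation.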

Health Check

  • Last commit: 14 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 120 stars in the last 30 days
