xunbu
Translate documents locally with AI across multiple formats
Top 61.2% on SourcePulse
DocuTranslate is a lightweight, local file translation tool that uses Large Language Models to translate diverse document types such as novels, theses, and subtitles. It supports numerous formats, including PDF, DOCX, XLSX, SRT, and EPUB. It preserves formatting for Word and Excel files, works with multiple AI platforms, and includes an integrated web interface.
How It Works
The project utilizes a workflow-based architecture, where distinct Workflow classes are designed for specific file types. Users select an appropriate workflow, configure its components (converter, translator, exporter), and then execute the translation process. PDF translation involves an initial conversion to Markdown, leveraging either the online mineru engine (which includes OCR capabilities and requires a token) or the local docling engine (requiring model downloads). Translations are powered by various LLM APIs, with support for custom prompts, automatic glossary generation, and asynchronous operations for enhanced performance.
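The converter, translator, exporter flow described above can be sketched in Python as follows. This is only an illustration of the pattern: every name here (`Workflow`, `stub_translator`, `txt_workflow`) is hypothetical and is not taken from DocuTranslate's actual API, and the translator is a stub standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these names come from DocuTranslate.
Converter = Callable[[bytes], str]   # raw file bytes -> intermediate text (e.g. Markdown)
Translator = Callable[[str], str]    # source text -> translated text
Exporter = Callable[[str], bytes]    # translated text -> output file bytes


@dataclass
class Workflow:
    """Chains the three configurable stages for one file type."""
    converter: Converter
    translator: Translator
    exporter: Exporter

    def run(self, data: bytes) -> bytes:
        text = self.converter(data)          # step 1: parse the input file
        translated = self.translator(text)   # step 2: translate the text
        return self.exporter(translated)     # step 3: write the output format


def stub_translator(text: str) -> str:
    # Stand-in for an LLM API call; it only tags the text so the flow is visible.
    return f"[translated] {text}"


# A plain-text workflow: decode, translate, re-encode.
txt_workflow = Workflow(
    converter=lambda data: data.decode("utf-8"),
    translator=stub_translator,
    exporter=lambda text: text.encode("utf-8"),
)

result = txt_workflow.run(b"hello")
print(result.decode("utf-8"))
```

Keeping each stage behind a plain callable means a PDF workflow could swap in a different converter without touching the translator or exporter, which is the benefit the per-file-type design suggests.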
Quick Start & Requirements
Install via pip (pip install docutranslate), or pip install docutranslate[docling] for local PDF parsing. Standalone packages are also available on GitHub Releases.
A mineru token is necessary for online PDF parsing and is valid for 14 days.
The docling engine for local PDF parsing requires model downloads, which can be slow on the first run; network mirrors and offline packages are supported workarounds.
Run docutranslate -i; the web interface is then accessible at http://127.0.0.1:8010 by default, and API documentation is available at /docs.
QQ Discussion Group: 1047781902.
Highlighted Details
Maintenance & Community
A QQ Discussion Group (1047781902) is provided for community interaction and support. Sponsorship is welcomed by the project maintainers.
Licensing & Compatibility
The project's license is not explicitly stated in the README. This omission makes it difficult to assess compatibility for commercial use or integration within closed-source projects.
Limitations & Caveats
Translation of PDF documents to Markdown may result in the loss of original layout. The docling PDF parsing engine can experience slow initial setup due to model downloads. mineru API tokens have a 14-day expiration, requiring periodic renewal. The absence of a clear license is a significant factor for potential adopters to consider.
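Because mineru tokens expire after 14 days, it can help to track when a token is due for renewal. The sketch below is a minimal illustration, assuming you record the token's issue date yourself. It does not call any mineru or DocuTranslate API, and the function names and the two-day warning margin are arbitrary choices made for this example.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=14)  # validity period stated in the caveats above


def token_expiry(issued_at: datetime) -> datetime:
    """Return the moment a token issued at `issued_at` stops being valid."""
    return issued_at + TOKEN_LIFETIME


def needs_renewal(issued_at: datetime, now: datetime,
                  margin: timedelta = timedelta(days=2)) -> bool:
    """True once `now` is within `margin` of the expiry (margin is an assumption)."""
    return now >= token_expiry(issued_at) - margin


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(token_expiry(issued).date())  # 2025-01-15
print(needs_renewal(issued, datetime(2025, 1, 10, tzinfo=timezone.utc)))  # False
print(needs_renewal(issued, datetime(2025, 1, 13, tzinfo=timezone.utc)))  # True
```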