FreePDF by zstar1003

AI PDF reader for translation and Q&A

Created 1 year ago

388 stars

Top 73.4% on SourcePulse

Project Summary

A free, open-source PDF reader designed to translate documents and enable large language model (LLM)-based Q&A on their content. It targets researchers, students, and professionals who frequently encounter documents in foreign languages, offering a streamlined workflow for comprehension and analysis without requiring complex setups for basic translation.

How It Works

The project leverages a document layout analysis model (DocLayout-YOLO) to detect text blocks within PDFs. It then employs configurable translation services (Bing, Google, Silicon, Ollama, custom) to convert specified languages (English, Chinese, Japanese, Korean, Traditional Chinese) into a target language, typically Chinese. For advanced interaction, an integrated QA engine, also configurable with services like Silicon or Ollama, allows users to ask questions directly about the PDF content, with responses generated based on the document's text.

Quick Start & Requirements

Installation:
- Windows: Download FreePDF_v5.1.0_Setup.exe from GitHub releases or Baidu Netdisk (pwd: 8888).
- macOS (arm64): Download FreePDF_v5.1.0_macOS.dmg from GitHub releases or run brew install freepdf.
- Source: Clone the repository, run uv sync, then python main.py.
Prerequisites: Requires specific ONNX models for document layout and font files for rendering different languages. Translation/QA services may require API keys or local LLM setup (e.g., Ollama).
Links:
- Releases: https://github.com/zstar1003/FreePDF/releases
- Baidu Netdisk: https://pan.baidu.com/s/1Q4wyrLXQDovLmeBP4aP4Zw (pwd: 8888)

Highlighted Details

Supports translation between Chinese, English, Japanese, Korean, and Traditional Chinese.
Configurable translation and QA backends, including local Ollama models and cloud services like Silicon.
QA engine allows specifying page ranges and custom system prompts for tailored analysis.

Maintenance & Community

The project accepts contributions via standard GitHub pull requests. Users can report issues or provide feedback directly to the maintainer via WeChat (ID: zstar1003). An experimental dev branch exists for features like table translation.

Licensing & Compatibility

The provided README does not specify a software license. Compatibility for commercial use or linking within closed-source projects is undetermined without a license.

Limitations & Caveats

The tool does not support image-based PDFs (scanned documents) as it relies on text block replacement. Table translation is not supported in the main branch, though an experimental implementation exists in the dev branch. Translation quality with low-parameter LLMs may be inconsistent. Configuration UI issues may occur on certain display settings, requiring direct editing of the pdf2zh_config.json file.

Health Check

Last Commit

6 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

4 stars in the last 30 days