FreePDF  by zstar1003

AI PDF reader for translation and Q&A

Created 6 months ago
342 stars

Top 80.8% on SourcePulse

GitHubView on GitHub
Project Summary

A free, open-source PDF reader designed to translate documents and enable large language model (LLM)-based Q&A on their content. It targets researchers, students, and professionals who frequently encounter documents in foreign languages, offering a streamlined workflow for comprehension and analysis without requiring complex setups for basic translation.

How It Works

The project leverages a document layout analysis model (DocLayout-YOLO) to detect text blocks within PDFs. It then employs configurable translation services (Bing, Google, Silicon, Ollama, custom) to convert specified languages (English, Chinese, Japanese, Korean, Traditional Chinese) into a target language, typically Chinese. For advanced interaction, an integrated QA engine, also configurable with services like Silicon or Ollama, allows users to ask questions directly about the PDF content, with responses generated based on the document's text.

Quick Start & Requirements

  • Installation:
    • Windows: Download FreePDF_v5.1.0_Setup.exe from GitHub releases or Baidu Netdisk (pwd: 8888).
    • macOS (arm64): Download FreePDF_v5.1.0_macOS.dmg from GitHub releases or run brew install freepdf.
    • Source: Clone the repository, run uv sync, then python main.py.
  • Prerequisites: Requires specific ONNX models for document layout and font files for rendering different languages. Translation/QA services may require API keys or local LLM setup (e.g., Ollama).
  • Links:

Highlighted Details

  • Supports translation between Chinese, English, Japanese, Korean, and Traditional Chinese.
  • Configurable translation and QA backends, including local Ollama models and cloud services like Silicon.
  • QA engine allows specifying page ranges and custom system prompts for tailored analysis.

Maintenance & Community

The project accepts contributions via standard GitHub pull requests. Users can report issues or provide feedback directly to the maintainer via WeChat (ID: zstar1003). An experimental dev branch exists for features like table translation.

Licensing & Compatibility

The provided README does not specify a software license. Compatibility for commercial use or linking within closed-source projects is undetermined without a license.

Limitations & Caveats

The tool does not support image-based PDFs (scanned documents) as it relies on text block replacement. Table translation is not supported in the main branch, though an experimental implementation exists in the dev branch. Translation quality with low-parameter LLMs may be inconsistent. Configuration UI issues may occur on certain display settings, requiring direct editing of the pdf2zh_config.json file.

Health Check
Last Commit

2 weeks ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
6
Star History
54 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
4 more.

olmocr by allenai

0.8%
17k
Toolkit for linearizing PDFs for LLM datasets/training
Created 1 year ago
Updated 1 day ago
Feedback? Help us improve.