DN_SuperBook_PDF_Converter  by dnobori

AI tool for crystal-clear scanned PDF enhancement

Created 1 month ago
816 stars

Top 43.3% on SourcePulse

Project Summary

This project addresses the poor readability of scanned PDFs from physical books on digital devices. It offers an AI-powered tool to enhance scanned documents, removing artifacts like stains and ghosting, correcting alignment, and standardizing margins. Designed for individuals digitizing personal libraries, it aims to make scanned books as comfortable to read as professional e-books, enabling full-text search and improved study efficiency.

How It Works

The tool uses Real-ESRGAN for AI image upscaling and artifact removal. It employs OCR for page number recognition, coupled with heuristic algorithms to correct page offsets, uniformly trim margins, and align PDF viewer page numbers with book page numbers. Additional features include automatic detection and metadata embedding for double-page spreads and vertical text orientation.
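The page-offset correction described above can be illustrated with a small sketch. It is not taken from the project's source: it assumes OCR yields one candidate page number per PDF page (or nothing), and it picks the most common difference between the printed number and the PDF index, so that isolated misreads are outvoted. The function name `estimate_page_offset` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Optional


def estimate_page_offset(ocr_numbers: dict[int, Optional[int]]) -> Optional[int]:
    """Return the most common (printed number - PDF index) difference.

    ocr_numbers maps a 1-based PDF page index to the page number that OCR
    read from that page, or None when no number was recognized.
    Returns None when no page produced a usable number.
    """
    diffs = Counter(
        printed - index
        for index, printed in ocr_numbers.items()
        if printed is not None
    )
    if not diffs:
        return None
    offset, _votes = diffs.most_common(1)[0]
    return offset


# Hypothetical OCR results: PDF page 7 had no readable number and
# PDF page 9 was misread as 50. The other pages agree on an offset of -4.
sample = {5: 1, 6: 2, 7: None, 8: 4, 9: 50, 10: 6}
offset = estimate_page_offset(sample)
```

With an offset in hand, a viewer's page labels could be shifted so that PDF page 5 is shown as book page 1. A majority vote is only one possible heuristic; the project may use something more elaborate.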

Quick Start & Requirements

  • Installation: Clone the repo and build the C# solution in Visual Studio 2022/2026. Setup is complex: it requires manually installing multiple external tools (ExifTool, ImageMagick, Ghostscript, pdfcpu, qpdf, Tesseract OCR) as well as the Python dependencies for Real-ESRGAN, including a specific CUDA build (cu128 recommended).
  • System: Windows 10/11 x64.
  • Hardware: Significant RAM (GBs to tens of GBs) and an NVIDIA GPU (8GB+ VRAM) are strongly recommended for practical processing; CPU-only operation is extremely slow.
  • Usage: Run the executable and use the ConvertPdf command with source/destination directories.
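Because setup depends on several separately installed tools, a pre-flight check can save a failed run. The sketch below is an illustration, not part of the project: the executable names in `REQUIRED_TOOLS` are assumptions (they vary by platform and installer; Ghostscript on Windows is often `gswin64c`), and `find_missing_tools` is a hypothetical helper built on the standard library's `shutil.which`.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names for the external tools listed above.
# Adjust these to match the actual installation.
REQUIRED_TOOLS = ("exiftool", "magick", "gswin64c", "pdfcpu", "qpdf", "tesseract")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All assumed tools were found on PATH.")
```

This only confirms that an executable is discoverable; it does not check versions or the Python/CUDA environment needed for Real-ESRGAN.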

Highlighted Details

  • AI-powered enhancement via Real-ESRGAN for scanned book pages.
  • Automated correction of page alignment, margin cropping, and page number synchronization.
  • Intelligent detection and metadata tagging for book layout features (double-page spreads, vertical text).
  • Output PDFs are optimized for high-accuracy OCR, enabling full-text search.
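To make the margin-cropping idea concrete, here is a minimal sketch of one way to derive a single crop box for a whole book. It is an assumption for illustration, not the project's algorithm: it presumes each page already has a detected content bounding box in pixels, and it takes the per-edge median so that a few outlier pages (for example a full-bleed illustration) do not distort the result. `Box` and `uniform_crop_box` are hypothetical names.

```python
from statistics import median
from typing import NamedTuple


class Box(NamedTuple):
    """Content bounding box in pixels: left, top, right, bottom."""
    left: int
    top: int
    right: int
    bottom: int


def uniform_crop_box(content_boxes: list[Box]) -> Box:
    """Return one crop box built from the median of each edge across pages."""
    if not content_boxes:
        raise ValueError("at least one content box is required")
    # zip(*boxes) groups all left edges, all top edges, and so on.
    return Box(*(int(median(edge)) for edge in zip(*content_boxes)))


# Hypothetical per-page boxes; the third page is an outlier.
pages = [
    Box(100, 120, 900, 1300),
    Box(104, 118, 905, 1295),
    Box(10, 10, 990, 1400),
]
crop = uniform_crop_box(pages)
```

Applying the same box to every page keeps margins consistent from page to page, which is the effect the feature list describes.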

Maintenance & Community

Developed for personal use and released due to demand. The author encourages forking for extensions rather than pull requests due to limited review time. No specific community channels or contributor details are provided.

Licensing & Compatibility

The core C# code is licensed under AGPLv3. External dependencies carry their own licenses (GPL, Apache, ImageMagick). AGPLv3 is a strong copyleft license, which may affect commercial use or integration with closed-source projects. The tool is explicitly intended for personal use, and the author warns against redistributing converted PDFs because of copyright.

Limitations & Caveats

Primarily Windows-focused; Linux/macOS support is not guaranteed. Setup is highly complex, requiring manual installation of numerous external tools. Processing is resource-intensive (RAM, GPU). Lacks built-in Japanese OCR for book content. Page number detection can fail for certain layouts. AGPLv3 and personal-use focus may restrict broader adoption.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 9
  • Star history: 90 stars in the last 30 days
