DN_SuperBook_PDF_Converter by dnobori

AI tool for crystal-clear scanned PDF enhancement

Created 5 months ago

874 stars

Top 40.4% on SourcePulse

Project Summary

This project addresses the poor readability of scanned PDFs of physical books when they are read on digital devices. It offers an AI-powered tool that enhances scanned documents by removing artifacts such as stains and ghosting, correcting alignment, and standardizing margins. Designed for individuals digitizing personal libraries, it aims to make scanned books as comfortable to read as professional e-books, with full-text search and improved study efficiency.

How It Works

The tool uses Real-ESRGAN for AI image upscaling and artifact removal. It employs OCR for page number recognition, coupled with heuristic algorithms to correct page offsets, uniformly trim margins, and align PDF viewer page numbers with book page numbers. Additional features include automatic detection and metadata embedding for double-page spreads and vertical text orientation.
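The page-number alignment described above can be illustrated with a small sketch. It is not the project's actual heuristic, which the summary does not spell out; it assumes OCR has already produced one optional printed page number per PDF page, and the function name `estimate_page_offset` is hypothetical. The idea is a majority vote: the most common difference between printed number and PDF index wins, so occasional OCR misreads are outvoted.

```python
from collections import Counter
from typing import Optional, Sequence


def estimate_page_offset(ocr_numbers: Sequence[Optional[int]]) -> Optional[int]:
    """Estimate the offset between printed page numbers and PDF page indices.

    ocr_numbers[i] is the printed number OCR read on PDF page i + 1,
    or None when no number was recognized (hypothetical input format).
    Returns the most common (printed - pdf_index) difference, or None
    if no page had a recognized number.
    """
    diffs = Counter(
        printed - index
        for index, printed in enumerate(ocr_numbers, start=1)
        if printed is not None
    )
    if not diffs:
        return None
    offset, _votes = diffs.most_common(1)[0]
    return offset


# Made-up OCR results for eight scanned pages: two unnumbered front-matter
# pages, one misread (8 instead of 3), and one page with no detected number.
ocr = [None, None, 1, 2, 8, 4, None, 6]
offset = estimate_page_offset(ocr)
print(offset)  # -2, i.e. printed page N sits at PDF page N + 2
```

With the offset known, a tool could assign page labels so that the viewer shows the book's own numbering. A real implementation would also need to handle front matter with Roman numerals and offsets that change partway through a book, which this sketch ignores.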

Quick Start & Requirements

Installation: Clone the repo and build the C# solution in Visual Studio 2022/2026. Setup is complex: several external tools (ExifTool, ImageMagick, Ghostscript, pdfcpu, qpdf, Tesseract OCR) must be installed manually, along with the Python dependencies for Real-ESRGAN, which need to match a specific CUDA version (cu128 recommended).
System: Windows 10/11 x64.
Hardware: Substantial RAM (several GB to tens of GB) and an NVIDIA GPU with at least 8 GB of VRAM are strongly recommended for practical processing; CPU-only operation is extremely slow.
Usage: Run the executable and use the ConvertPdf command with source/destination directories.

Highlighted Details

AI-powered enhancement via Real-ESRGAN for scanned book pages.
Automated correction of page alignment, margin cropping, and page number synchronization.
Intelligent detection and metadata tagging for book layout features (double-page spreads, vertical text).
Output PDFs are optimized for high-accuracy OCR, enabling full-text search.
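As an illustration of the uniform margin cropping listed above, the following sketch picks one shared crop box for a whole book from per-page content bounding boxes. It is an assumption about how such a step could work, not the project's algorithm: the `Box` layout and the `common_crop_box` helper are hypothetical, and the median is used only because it tolerates a few outlier pages (for example, a page with a stray mark near the edge).

```python
from statistics import median
from typing import List, Tuple

# (left, top, right, bottom) in pixels; a hypothetical representation of the
# detected content area on each scanned page.
Box = Tuple[int, int, int, int]


def common_crop_box(boxes: List[Box]) -> Box:
    """Return one crop box for all pages, using the per-edge median."""
    if not boxes:
        raise ValueError("at least one page box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (
        int(median(lefts)),
        int(median(tops)),
        int(median(rights)),
        int(median(bottoms)),
    )


# Made-up content boxes for three pages; the third page has a stain near the
# border that inflates its detected content area.
pages = [
    (100, 120, 900, 1400),
    (104, 118, 905, 1396),
    (20, 30, 990, 1500),
]
crop = common_crop_box(pages)
print(crop)  # (100, 118, 905, 1400)
```

Applying the same box to every page keeps the text block in a fixed position while paging through the book, which is the visible effect of uniformly trimmed margins.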

Maintenance & Community

Developed for personal use and released due to demand. The author encourages forking for extensions rather than pull requests due to limited review time. No specific community channels or contributor details are provided.

Licensing & Compatibility

The core C# code is licensed under AGPLv3. External dependencies carry their own licenses (GPL, Apache, ImageMagick). AGPLv3 is a strong copyleft license, which may affect commercial use or integration with closed-source projects. The tool is explicitly intended for personal use, and the author warns against redistributing converted PDFs because of copyright.

Limitations & Caveats

Primarily Windows-focused; Linux/macOS support is not guaranteed. Setup is highly complex, requiring manual installation of numerous external tools. Processing is resource-intensive (RAM, GPU). Lacks built-in Japanese OCR for book content. Page number detection can fail for certain layouts. AGPLv3 and personal-use focus may restrict broader adoption.
