huridocs: Intelligent PDF document analysis and content extraction service
Top 47.4% on SourcePulse
Summary
This Docker-powered microservice offers advanced PDF document layout analysis, OCR, and content extraction. It segments and classifies PDF elements such as text, titles, images, and tables, determines reading order, and converts documents to formats such as Markdown and HTML, with integrated translation capabilities. It is aimed at users who need a flexible, automated way to handle complex PDF processing tasks.
How It Works
The service employs a Clean Architecture design for maintainability and testability. It offers two primary analysis models: the Vision Grid Transformer (VGT) for high-accuracy visual layout understanding, and LightGBM models for faster processing using XML-based features from Poppler. Integrated Tesseract OCR supports over 150 languages. A comprehensive RESTful API exposes functionalities for analysis, extraction, format conversion, and OCR.
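As a rough illustration of how a client might call the analysis API, the sketch below builds a multipart upload request using only the Python standard library. Several details are assumptions not confirmed by the text above: that analysis is a POST to the service root, that the PDF is sent in a form field named "file", and that a form field "fast" set to "true" selects the faster LightGBM models. The helper name build_analysis_request is hypothetical. Check the project's README for the actual endpoints and parameters.

```python
import uuid
import urllib.request

# Address taken from the Quick Start section; the root path as the
# analysis endpoint is an assumption.
BASE_URL = "http://localhost:5060"


def build_analysis_request(pdf_bytes, filename, fast=False, base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST carrying a PDF.

    The form field names "file" and "fast" are assumptions for illustration.
    """
    boundary = uuid.uuid4().hex
    parts = []

    # Part 1: the PDF itself.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + pdf_bytes + b"\r\n")

    # Part 2 (optional): ask for the faster, lighter models.
    if fast:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="fast"\r\n\r\n'
                "true\r\n"
            ).encode("utf-8")
        )

    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        base_url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the service running, the request could be sent with urllib.request.urlopen(req) and the body parsed as JSON, assuming the service responds with JSON. Keeping request construction separate from sending makes the upload logic easy to inspect without a live container.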
Quick Start & Requirements
To start the service, run make start (or make start_translation to enable translation features). The service is then accessible at http://localhost:5060. Prerequisites are Docker Desktop 4.25.0+ and, for development, Python 3.10+. The NVIDIA Container Toolkit is optional but recommended for GPU acceleration. System requirements are a minimum of 2 GB RAM, 5 GB of GPU memory (optional), and 10 GB of disk space. Project links to GitHub, HuggingFace, and Docker Hub are provided.
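After make start, the container may take a moment before it accepts requests. A minimal sketch of a readiness check is shown below. It polls the address given above until the server answers any HTTP request. The function names, timeout values, and the choice to treat any HTTP status as "up" are assumptions for illustration and are not part of the project.

```python
import time
import urllib.error
import urllib.request

SERVICE_URL = "http://localhost:5060"  # address stated in the Quick Start text


def _http_probe(url):
    """Return True if something answers HTTP at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=5):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the server is listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def wait_until_ready(url=SERVICE_URL, timeout_s=60.0, interval_s=2.0, probe=None):
    """Poll `url` until the probe succeeds or `timeout_s` elapses.

    `probe` is injectable so the loop can be exercised without a live service.
    """
    probe = probe or _http_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A script could call wait_until_ready() right after starting the container and only begin uploading PDFs once it returns True.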
Maintenance & Community
The project is developed by HURIDOCS. The provided README does not detail community channels (e.g., Discord, Slack), active contributors, or sponsorships.
Licensing & Compatibility
The provided README does not explicitly state the open-source license. Docker-based deployment makes the service easier to integrate into a variety of environments.
Limitations & Caveats
The quality of automatic translations depends on the chosen Ollama model; smaller models may yield suboptimal results. GPU support is optional, but the VGT model performs significantly better with it. The license terms for commercial use or redistribution are not detailed.