vorojar/Folio-OCR
Local batch OCR workbench for document digitization
Top 82.8% on SourcePulse
Summary
Folio-OCR is an open-source, local batch OCR workbench designed as a free alternative to commercial solutions like ABBYY FineReader. It targets users digitizing books and documents, offering efficient, layout-aware processing and multiple export formats directly from a user-friendly interface.
How It Works
This workbench leverages GLM-OCR and Ollama, featuring a distinctive three-panel editor for intuitive document processing. Its core innovation lies in layout detection, which automatically partitions documents and intelligently merges text regions, accelerating OCR by reducing redundant calls. The system also handles LaTeX special characters, converting them to Unicode, and performs automatic output cleanup.
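To make the region-merging idea concrete, here is a minimal sketch of how adjacent text boxes might be combined so that fewer OCR calls are needed. It is an illustration only, not Folio-OCR's actual code: the Region class, the should_merge and merge_regions helpers, and the max_gap threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Axis-aligned text box in page pixels (hypothetical structure)."""
    x0: int
    y0: int
    x1: int
    y1: int


def should_merge(a: Region, b: Region, max_gap: int) -> bool:
    """True if the boxes overlap horizontally and are vertically close."""
    horizontal_overlap = min(a.x1, b.x1) - max(a.x0, b.x0)
    vertical_gap = max(a.y0, b.y0) - min(a.y1, b.y1)
    return horizontal_overlap > 0 and vertical_gap <= max_gap


def merge_regions(regions: list[Region], max_gap: int = 12) -> list[Region]:
    """Greedy single pass: fold each box into the first compatible merged box.

    max_gap is an assumed tuning value, not a documented setting.
    """
    merged: list[Region] = []
    for region in sorted(regions, key=lambda r: (r.y0, r.x0)):
        for i, current in enumerate(merged):
            if should_merge(current, region, max_gap):
                # Replace the existing box with the union of both boxes.
                merged[i] = Region(
                    min(current.x0, region.x0),
                    min(current.y0, region.y0),
                    max(current.x1, region.x1),
                    max(current.y1, region.y1),
                )
                break
        else:
            merged.append(region)
    return merged


# Two stacked lines in one column plus a separate box in another column.
detected = [
    Region(10, 10, 200, 30),
    Region(12, 34, 198, 54),
    Region(300, 10, 500, 30),
]
merged = merge_regions(detected)
```

In this example the two stacked lines collapse into one box while the box in the other column stays separate, so three detected regions would need two OCR calls instead of three. A single greedy pass can miss merges that only become possible after earlier unions, so a real implementation would likely iterate or use a more careful grouping.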
Quick Start & Requirements
Installation is streamlined via Docker: clone the repository, run docker compose up -d, and pull the glm-ocr model with docker compose exec ollama ollama pull glm-ocr. Local installation requires Python 3.10+ and Ollama, followed by pip install -r requirements.txt and python server.py. NVIDIA GPU acceleration is available by uncommenting a section in docker-compose.yml.
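After the pull step, it can be useful to confirm that the model is actually registered with Ollama. The sketch below queries Ollama's /api/tags endpoint, which lists locally available models. It assumes Ollama is reachable from the host on its default port 11434, which may not hold for the project's Docker setup; the helper names model_is_available and fetch_tags are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama's default port is exposed to the host.
OLLAMA_URL = "http://localhost:11434"


def model_is_available(tags_payload: dict, model: str) -> bool:
    """Check an /api/tags response for a model, ignoring any ':tag' suffix."""
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    return any(name == model or name.split(":", 1)[0] == model for name in names)


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)
```

With the server running, model_is_available(fetch_tags(), "glm-ocr") should return True once the pull has finished. Keeping the check separate from the network call makes it easy to try against a saved response.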
Highlighted Details
folio_ocr.db), with auto-save and recovery of unsaved edits.
Maintenance & Community
The project is hosted on GitHub at vorojar/Folio-OCR. No specific details regarding maintainers, community channels (e.g., Discord, Slack), or sponsorships were found in the provided README.
Licensing & Compatibility
Folio-OCR is released under the MIT License, which is highly permissive and generally compatible with commercial use and closed-source projects.
Limitations & Caveats
The initial model cold start time can be significant, around 50 seconds. Optimal performance, particularly the advertised ~0.5s/page, is contingent on having an NVIDIA GPU.