BetterOCR by junhoyeo

OCR tool combining multiple engines with LLM for improved text detection

Created 2 years ago

602 stars

Top 54.3% on SourcePulse

Project Summary

This project provides a robust solution for improving Optical Character Recognition (OCR) accuracy, particularly for non-English languages or noisy inputs. It targets developers and researchers needing enhanced text extraction by intelligently combining multiple OCR engines with Large Language Models (LLMs) for correction and reconstruction.

How It Works

BetterOCR takes a multi-engine approach: it runs EasyOCR, Tesseract, and Pororo (for Korean/English) on the same input, then passes the combined outputs to an OpenAI chat model, which corrects and reconciles them. An optional custom context lets users supply specific keywords or product names, which helps the model get specialized terminology right.
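The flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BetterOCR's actual implementation: `build_prompt`, `detect_text_sketch`, the prompt wording, and the callable-based engine and LLM interfaces are all hypothetical names introduced here. Real engines and a real chat-model client would be plugged in where the stand-ins are.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins (assumptions, not the library's API):
# an engine maps an image path to raw text; an LLM maps a prompt to text.
OcrEngine = Callable[[str], str]
LlmFn = Callable[[str], str]


def build_prompt(results: Dict[str, str], langs: List[str], context: str = "") -> str:
    """Combine per-engine OCR outputs into a single correction prompt."""
    lines = [
        "Merge and correct the OCR results below into one clean text.",
        f"Languages: {', '.join(langs)}",
    ]
    if context:
        # Optional custom context, e.g. product names or domain keywords.
        lines.append(f"Context: {context}")
    for name, text in results.items():
        lines.append(f"[{name}] {text}")
    return "\n".join(lines)


def detect_text_sketch(
    image_path: str,
    engines: Dict[str, OcrEngine],
    llm: LlmFn,
    langs: List[str],
    context: str = "",
) -> str:
    """Run every engine on the image, then ask the LLM to reconcile the outputs."""
    results = {name: engine(image_path) for name, engine in engines.items()}
    return llm(build_prompt(results, langs, context))
```

Keeping the engines and the model behind plain callables makes the combination step easy to test with stubs before wiring in real OCR backends or an API client.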

Quick Start & Requirements

Install via pip: pip install betterocr
Requires Python. Pororo integration has additional dependencies. An OpenAI API key is recommended for LLM functionality.
See Examples for usage and performance.

Highlighted Details

Combines EasyOCR, Tesseract, and Pororo for text detection and recognition.
Employs OpenAI LLMs (GPT-3.5/GPT-4) for error correction and text reconstruction.
Supports custom context for improved accuracy with specific terminology.
Offers box detection functionality for locating text regions.
Demonstrates improved results across English, Korean, and Hindi examples.

Maintenance & Community

Project is under rapid development. Contributions are welcomed.
Supported by (주)한국모바일상품권 (Korea Mobile Gift Card, Inc.).

Licensing & Compatibility

Licensed under the MIT License.
Permissive license suitable for commercial and closed-source use.

Limitations & Caveats

The package is under rapid development, with features like async support and an improved interface noted as "coming soon." Performance may vary based on OCR engine updates and OpenAI API availability.
