OnnxOCR by jingsongliujing

Fast, framework-free OCR powered by ONNXRuntime

Created 3 years ago

1,836 stars

Top 22.7% on SourcePulse

Project Summary

OnnxOCR is a high-performance, multilingual OCR system that speeds up inference by dropping the dependency on deep learning training frameworks and running on ONNXRuntime instead. It targets engineers and researchers who need a deployable OCR solution, and it offers cross-architecture compatibility along with integration points for layout analysis and LLM-based extraction.

How It Works

The project converts PaddleOCR models to ONNX format, so the same inference path runs on both x86 and ARM architectures via ONNXRuntime. A core module, inference_engine.py, manages the ONNX sessions. This keeps the system deployment-ready and lets it integrate features such as multilingual recognition, layout analysis, and information extraction with local LLMs.
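The session management described above can be illustrated with a minimal sketch. This is not the project's inference_engine.py: the SessionCache class, its method names, and the model path in the usage note are hypothetical. The only real library calls are onnxruntime.InferenceSession, get_inputs, and run.

```python
from typing import Any, Callable, Dict, List, Optional


def _default_factory(model_path: str, providers: List[str]) -> Any:
    # Imported lazily so the cache can be exercised without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=providers)


class SessionCache:
    """Hypothetical helper: builds one ONNX session per model path and reuses it."""

    def __init__(
        self,
        providers: Optional[List[str]] = None,
        factory: Callable[[str, List[str]], Any] = _default_factory,
    ) -> None:
        # The CPU provider is available on both x86 and ARM builds of ONNXRuntime.
        self.providers = providers or ["CPUExecutionProvider"]
        self._factory = factory
        self._sessions: Dict[str, Any] = {}

    def get(self, model_path: str) -> Any:
        # Create the session on first use, then return the cached instance.
        if model_path not in self._sessions:
            self._sessions[model_path] = self._factory(model_path, self.providers)
        return self._sessions[model_path]

    def run(self, model_path: str, array: Any) -> List[Any]:
        # Feed one input tensor and return all model outputs.
        session = self.get(model_path)
        input_name = session.get_inputs()[0].name
        return session.run(None, {input_name: array})
```

With onnxruntime installed, SessionCache().run("det.onnx", batch) would load the model once and reuse the session on later calls; "det.onnx" and batch are placeholders for a real model file and a preprocessed NumPy array.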

Quick Start & Requirements

Primary install: pip install -r requirements.txt (users in China can use the index URL specified by the project).
Prerequisites: Python >= 3.8.
Models: Core PP-OCRv5 models are included. Additional models for specialized tasks (license plate, table, layout, etc.) must be downloaded on demand with python scripts/download_models.py, which supports ModelScope (default) and HuggingFace as sources.
Links: Example scripts (test_ocr.py, examples/) serve as functional demonstrations.

Highlighted Details

Supports multilingual OCR (Simplified Chinese, Traditional Chinese, Pinyin, English, Japanese) with a single PP-OCRv5 model.
Integrates advanced modules for license plate recognition, table structure restoration (RapidTable), document layout analysis (RapidLayout), and PDF/image-to-Markdown conversion (RapidDoc).
Enables local information extraction by combining OCR results with a Qwen3.5-2B ONNX model, facilitating structured data output without external services.
Provides deployable JSON API (app-service.py) and web UI (webui.py) services, with Docker support for streamlined deployment.
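As an illustration of how a client might talk to a JSON OCR service like the one above, here is a minimal sketch using only the Python standard library. The URL, the port, the "/ocr" route, and the "image" field name are all assumptions rather than details confirmed by this summary; check app-service.py for the actual request format.

```python
import base64
import json
import urllib.request
from typing import Any, Dict

# Assumption: host, port, and route are placeholders, not confirmed values.
OCR_URL = "http://localhost:5005/ocr"


def build_payload(image_bytes: bytes) -> Dict[str, str]:
    # JSON cannot carry raw bytes, so the image is base64-encoded into a string.
    # Assumption: the field name "image" is hypothetical.
    return {"image": base64.b64encode(image_bytes).decode("ascii")}


def post_image(image_bytes: bytes, url: str = OCR_URL, timeout: float = 30.0) -> Any:
    # Send the payload as a JSON POST request and decode the JSON response.
    body = json.dumps(build_payload(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would read an image file in binary mode and pass its bytes to post_image; the shape of the returned JSON depends on the service and is not assumed here.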

Maintenance & Community

The project acknowledges contributions from the PaddleOCR and RapidAI communities. Issues and pull requests are welcome, but no specific community channels or roadmap are detailed.

Licensing & Compatibility

The repository's license is not explicitly stated in the README, which is a critical omission for assessing commercial use or derivative works.

Limitations & Caveats

Specialized OCR features require downloading large, optional model files post-installation. The absence of a clear license is a significant adoption blocker.
