OnnxOCR by jingsongliujing

Fast, framework-free OCR powered by ONNXRuntime

Created 2 years ago
1,811 stars

Top 23.4% on SourcePulse

Project Summary

OnnxOCR is a high-performance, multilingual OCR system that speeds up inference by decoupling from deep learning training frameworks and running on ONNXRuntime. It targets engineers and researchers who need deployable OCR, and it offers cross-architecture compatibility plus integration with modules such as layout analysis and information extraction.

How It Works

The project converts PaddleOCR models to ONNX format, so the same models run on both x86 and ARM architectures through ONNXRuntime. A core inference_engine.py manages the ONNX sessions. Because inference no longer depends on the training framework, the system is easier to deploy and can be combined with features such as multilingual recognition, layout analysis, and information extraction with local LLMs.
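The session-management idea can be sketched as follows. This is a minimal illustration, not the contents of the project's inference_engine.py: the SessionCache class, its method names, and the choice of CPUExecutionProvider are all assumptions made for the example. Only onnxruntime.InferenceSession is a real ONNXRuntime call, and it is imported lazily so the sketch loads even where ONNXRuntime is not installed.

```python
from typing import Any, Callable, Dict, List, Optional


def _default_factory(model_path: str) -> Any:
    # Imported lazily so this module loads even where onnxruntime is absent.
    import onnxruntime as ort

    # CPUExecutionProvider is an assumption; it is available on x86 and ARM builds.
    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


class SessionCache:
    """Hypothetical helper: keeps one inference session per model path."""

    def __init__(self, factory: Optional[Callable[[str], Any]] = None) -> None:
        # The factory is injectable so the cache can be exercised without real models.
        self._factory = factory or _default_factory
        self._sessions: Dict[str, Any] = {}

    def get(self, model_path: str) -> Any:
        # Create the session on first use, then reuse it for later calls.
        if model_path not in self._sessions:
            self._sessions[model_path] = self._factory(model_path)
        return self._sessions[model_path]

    def loaded(self) -> List[str]:
        # Paths of all models that currently have a live session.
        return sorted(self._sessions)
```

With ONNXRuntime installed, SessionCache().get("path/to/model.onnx") would return a session whose run method takes the output names and an input dictionary. Loading each model once and reusing the session avoids repeating the comparatively slow model load on every request.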

Quick Start & Requirements

  • Primary install: pip install -r requirements.txt (China-based users can use the index URL the project specifies).
  • Prerequisites: Python >= 3.8.
  • Models: Core PP-OCRv5 models are included. Additional models for specialized tasks (license plate, table, layout, etc.) must be downloaded on demand with python scripts/download_models.py, which supports ModelScope (default) and HuggingFace as sources.
  • Examples: The example scripts (test_ocr.py, examples/) serve as functional demonstrations.
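The prerequisites above can be checked before running the examples. The sketch below is a hypothetical pre-flight helper, not something shipped with the project: the function names are invented for illustration, and the module names passed to missing_modules are the caller's choice (for instance "onnxruntime"), since the contents of requirements.txt are not listed here.

```python
import importlib.util
import sys
from typing import Iterable, List, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 8)  # taken from the stated prerequisite


def python_ok(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    # Compare only (major, minor) against the minimum supported version.
    return tuple(version[:2]) >= MIN_PYTHON


def missing_modules(names: Iterable[str]) -> List[str]:
    # find_spec returns None for top-level modules that cannot be imported.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

A caller might run python_ok() and missing_modules(["onnxruntime"]) and, if anything is reported missing, rerun the pip install step before trying test_ocr.py.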

Highlighted Details

  • Supports multilingual OCR (Simplified Chinese, Traditional Chinese, Pinyin, English, Japanese) with a single PP-OCRv5 model.
  • Integrates advanced modules for license plate recognition, table structure restoration (RapidTable), document layout analysis (RapidLayout), and PDF/image-to-Markdown conversion (RapidDoc).
  • Enables local information extraction by passing OCR results to a Qwen3.5-2B ONNX model, which produces structured output without external services.
  • Provides deployable JSON API (app-service.py) and web UI (webui.py) services, with Docker support for streamlined deployment.
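The information-extraction flow (OCR text in, structured fields out) can be sketched roughly as below. Everything here is an assumption made for illustration: the (box, text, score) result shape, the prompt wording, the function names, and the generate callable that stands in for the local model. The project's actual interfaces may differ.

```python
import json
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumed result shape: (box, text, score), where box is four (x, y) corner points.
OcrLine = Tuple[Sequence[Tuple[float, float]], str, float]


def to_reading_order(lines: List[OcrLine], min_score: float = 0.5) -> List[str]:
    # Drop low-confidence lines, then sort by top edge and left edge.
    # This is only a rough approximation of reading order.
    kept = [line for line in lines if line[2] >= min_score]
    kept.sort(key=lambda l: (min(p[1] for p in l[0]), min(p[0] for p in l[0])))
    return [line[1] for line in kept]


def build_prompt(texts: List[str], fields: List[str]) -> str:
    # Hypothetical prompt format asking the model for a JSON object only.
    return (
        "Extract the following fields from the OCR text "
        "and answer with a JSON object only.\n"
        f"Fields: {', '.join(fields)}\n"
        "OCR text:\n" + "\n".join(texts)
    )


def extract(
    lines: List[OcrLine],
    fields: List[str],
    generate: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    # `generate` stands in for the local LLM: it takes a prompt, returns text.
    reply = generate(build_prompt(to_reading_order(lines), fields))
    data = json.loads(reply)
    # Fields the model did not return are reported as None.
    return {field: data.get(field) for field in fields}
```

Passing the model as a plain callable keeps the sketch independent of any particular runtime. A real integration would also need to handle replies that are not valid JSON, which this example does not attempt.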

Maintenance & Community

The project acknowledges contributions from PaddleOCR and RapidAI communities. Issues and Pull Requests are welcomed, but specific community channels or roadmaps are not detailed.

Licensing & Compatibility

The repository's license is not explicitly stated in the README, which is a critical omission for assessing commercial use or derivative works.

Limitations & Caveats

Specialized OCR features require downloading large, optional model files post-installation. The absence of a clear license is a significant adoption blocker.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 5
  • Star history: 58 stars in the last 30 days
