myocr by robbyzhaox

OCR framework for building custom pipelines

created 4 months ago
283 stars

Top 93.3% on sourcepulse

Project Summary

MyOCR is an OCR pipeline builder for engineers and researchers who need to build and integrate custom OCR systems. Its modular, extensible framework covers end-to-end OCR development: flexible training, integration of deep learning models, and production-ready deployment.

How It Works

MyOCR provides a unified pipeline for text detection and recognition, and lets users mix and match components such as models and processors. It uses ONNX Runtime for efficient CPU/GPU inference and produces structured OCR output by integrating with large language models such as Qwen for data extraction.
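
The mix-and-match idea can be illustrated with a minimal sketch. This is not MyOCR's actual API: `Pipeline`, `detect_rows`, `read_row`, and `read_row_upper` are hypothetical names, and the "image" is a toy list of text rows so the example runs without any model. It only shows how a detection step and a recognition step might be composed and swapped independently.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types for illustration only; not taken from MyOCR.
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1)
Image = List[str]                 # toy "image": one string per text row

Detector = Callable[[Image], List[Box]]
Recognizer = Callable[[Image, Box], str]


@dataclass
class TextRegion:
    box: Box
    text: str


@dataclass
class Pipeline:
    """Chains a detector and a recognizer; either part can be replaced."""
    detector: Detector
    recognizer: Recognizer

    def run(self, image: Image) -> List[TextRegion]:
        return [TextRegion(box, self.recognizer(image, box))
                for box in self.detector(image)]


def detect_rows(image: Image) -> List[Box]:
    # Toy detector: every non-blank row counts as one text region.
    return [(0, y, len(row), y + 1) for y, row in enumerate(image) if row.strip()]


def read_row(image: Image, box: Box) -> str:
    # Toy recognizer: reads the characters inside the box.
    x0, y0, x1, _ = box
    return image[y0][x0:x1].strip()


def read_row_upper(image: Image, box: Box) -> str:
    # Alternative recognizer, to show swapping a single component.
    return read_row(image, box).upper()


if __name__ == "__main__":
    page = ["Invoice 42", "", "Total: 10"]
    for region in Pipeline(detect_rows, read_row).run(page):
        print(region.box, region.text)
```

Replacing `read_row` with `read_row_upper` changes the recognition step without touching detection, which is the kind of component swap the description refers to.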

Quick Start & Requirements

Highlighted Details

  • End-to-end OCR development framework with modular components.
  • Developer-friendly Python APIs and prebuilt pipelines.
  • ONNX Runtime support for fast CPU/GPU inference.
  • Structured OCR output via LLM integration (Ollama, OpenAI).
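
As a rough illustration of structured output via an LLM, the sketch below turns raw OCR text into typed fields. Everything here is an assumption made for the example: `InvoiceFields`, `build_prompt`, `extract_fields`, and the stub `fake_llm` are hypothetical and do not come from MyOCR. A real setup would replace `fake_llm` with a call to an Ollama or OpenAI client configured as the project requires.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class InvoiceFields:
    # Hypothetical target schema for the extracted data.
    invoice_number: str
    total: float


def build_prompt(ocr_text: str) -> str:
    # Ask the model for JSON only, so the reply can be parsed directly.
    return (
        "Extract the fields invoice_number (string) and total (number) "
        "from the OCR text below. Reply with JSON only.\n\n" + ocr_text
    )


def extract_fields(ocr_text: str, llm: Callable[[str], str]) -> InvoiceFields:
    # `llm` is any function mapping a prompt to the model's text reply.
    raw_reply = llm(build_prompt(ocr_text))
    data = json.loads(raw_reply)  # raises ValueError on a non-JSON reply
    return InvoiceFields(
        invoice_number=str(data["invoice_number"]),
        total=float(data["total"]),
    )


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a fixed reply for the demo.
    return '{"invoice_number": "42", "total": 10.0}'


if __name__ == "__main__":
    print(extract_fields("Invoice 42\nTotal: 10", fake_llm))
```

Passing the model call in as a plain function keeps the parsing and validation logic independent of which LLM backend is used.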

Maintenance & Community

  • Active development with recent releases (v0.1.1 on May 17, 2025).
  • Contribution guidelines provided.

Licensing & Compatibility

  • Licensed under Apache 2.0.
  • Permissive license suitable for commercial use and closed-source integration.

Limitations & Caveats

The structured output pipeline requires configuration for LLM APIs (Ollama, OpenAI) and specific model setups. The README mentions a UI (doc-insight-ui) but does not provide a direct link.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 4
  • Issues (30d): 0

Star History

  • 271 stars in the last 90 days
