mindocr by mindspore-lab

OCR toolbox for text detection and recognition, based on MindSpore

Created 3 years ago

297 stars

Top 89.4% on SourcePulse

Project Summary

MindOCR is an open-source toolbox for Optical Character Recognition (OCR) development and deployment, built on the MindSpore framework. It offers a comprehensive suite of mainstream text detection and recognition models, along with user-friendly training and inference tools, targeting researchers and developers looking to accelerate OCR application development.

How It Works

MindOCR employs a modular design, decoupling OCR tasks into configurable components. This allows users to easily customize data processing pipelines, model architectures, and training/evaluation workflows by modifying configuration files. The toolbox integrates high-performance, pre-trained models that achieve competitive results on various OCR benchmarks.
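The config-driven idea described above can be illustrated with a small, generic sketch. This is not MindOCR's actual code or API: the registry, the `register` decorator, the `build` helpers, and the component names (`resize`, `toy_detector`) are all hypothetical, chosen only to show how a configuration dictionary (as might be loaded from a config file) can select and parameterize decoupled components.

```python
from typing import Any, Callable, Dict

# Hypothetical registry (not MindOCR's API): component kind -> name -> factory.
REGISTRY: Dict[str, Dict[str, Callable[..., Any]]] = {"transform": {}, "model": {}}


def register(kind: str, name: str):
    """Decorator that records a factory function under a kind and a name."""
    def decorator(factory: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[kind][name] = factory
        return factory
    return decorator


@register("transform", "resize")
def make_resize(height: int, width: int) -> Dict[str, Any]:
    # Stand-in for a real preprocessing op; returns a plain description.
    return {"op": "resize", "size": (height, width)}


@register("model", "toy_detector")
def make_toy_detector(backbone: str) -> Dict[str, Any]:
    # Stand-in for a real detection network.
    return {"type": "toy_detector", "backbone": backbone}


def build(kind: str, cfg: Dict[str, Any]) -> Any:
    """Look up the factory named by cfg['name'] and call it with the other keys."""
    params = dict(cfg)          # copy so the caller's config is not mutated
    name = params.pop("name")
    return REGISTRY[kind][name](**params)


def build_pipeline(config: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble transforms and a model purely from configuration."""
    return {
        "transforms": [build("transform", c) for c in config["transforms"]],
        "model": build("model", config["model"]),
    }


# Illustrative configuration; the values are arbitrary.
config = {
    "transforms": [{"name": "resize", "height": 736, "width": 1280}],
    "model": {"name": "toy_detector", "backbone": "resnet50"},
}
pipeline = build_pipeline(config)
```

Swapping a component then means editing the configuration (for example, changing `name` or `backbone`) rather than the code that assembles the pipeline.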

Quick Start & Requirements

Installation:
- From source (recommended): git clone https://github.com/mindspore-lab/mindocr.git && cd mindocr && pip install -e .
- From PyPI: pip install mindocr (Note: PyPI version may be outdated).
- Docker images are available for specific Ascend hardware configurations.
Prerequisites: MindSpore (a version compatibility matrix is provided), Python >= 3.7, and Open MPI 4.0.3 (for distributed training).
Resources: Detailed installation guides and tutorials are available in the documentation.
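Before installing, the listed prerequisites can be checked with a short script. The following is a minimal sketch, not part of MindOCR: the function name `check_environment` is made up, and it assumes that MindSpore is importable as `mindspore` and that an Open MPI installation exposes `mpirun` on the PATH. It does not verify the MindSpore or Open MPI versions.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 7)):
    """Return a small report on the prerequisites listed above.

    Hypothetical helper: it only detects presence, not version compatibility.
    """
    return {
        # Python >= 3.7 is the stated minimum.
        "python_ok": sys.version_info >= min_python,
        # Assumes the MindSpore package is importable as `mindspore`.
        "mindspore_installed": importlib.util.find_spec("mindspore") is not None,
        # Only relevant for distributed training; assumes `mpirun` is the launcher.
        "mpirun_on_path": shutil.which("mpirun") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

A missing `mpirun` only matters when distributed training is planned; single-device use should not need it.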

Highlighted Details

- Supports a wide range of text detection models (DBNet, DBNet++, PSENet, EAST) and recognition models (CRNN, SVTR, MASTER, ABINet, etc.).
- Includes capabilities for layout analysis (YOLOv8), key information extraction (LayoutXLM, LayoutLMv3), and table recognition (TableMaster).
- Offers tools for both online and offline inference, including support for MindSpore Lite.
- Provides a dataset conversion tool and supports numerous public OCR datasets.

Maintenance & Community

The project is actively maintained by the MindSpore team, with frequent updates adding new models, datasets, and features. Contribution guidelines are available, but the README does not explicitly list community support channels.

Licensing & Compatibility

This project is licensed under the Apache License 2.0, a permissive license that is generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The PyPI package is noted as potentially outdated, so installing from source is recommended. Docker images are specific to certain Ascend hardware and CANN versions; users without that hardware will need to set up their environment carefully by other means.

Health Check

Last Commit: 5 months ago

Responsiveness: 1 week

Star History: 5 stars in the last 30 days