awesome-ocr-resources by ZumingHuang

OCR resource collection (papers, datasets, APIs)

created 7 years ago
423 stars

Top 70.7% on sourcepulse

Project Summary

This repository is a curated collection of resources for Optical Character Recognition (OCR) and Document AI, aimed at researchers and practitioners. It consolidates papers, datasets, and APIs in one place so that readers can keep up with advances and tools in OCR.

How It Works

The project is a living bibliography organized by publication year and topic. It aggregates links to academic papers, open-source projects, and relevant APIs, so users can trace both historical and current trends in OCR.

Quick Start & Requirements

This repository is a curated list and does not require installation or execution. It serves as a reference guide.

Highlighted Details

  • Extensive categorization of papers from 2023-present, 2019-2022, 2015-2018, 2011-2014, and before 2010.
  • Includes links to multiple HCIILAB projects focused on scene text detection, recognition, and end-to-end solutions.
  • Curated lists specifically for scene text localization and recognition, deep learning methods, and general OCR resources.
  • Features a link to a 2024 arXiv paper on scaling multimodal models for OCR performance.

Maintenance & Community

Recent updates added papers published in 2023 and 2024. No community channels or contributors beyond the primary author are listed.

Licensing & Compatibility

The repository, a collection of links and curated information, does not appear to declare a license. Each linked resource carries its own license.

Limitations & Caveats

This resource links to papers and datasets but does not host them. The repository's "TODO" section indicates that further additions or refinements may follow.

Health Check

Last commit: 6 months ago
Responsiveness: 1 week
Pull Requests (30d): 0
Issues (30d): 0

Star History

5 stars in the last 90 days
