Image2Katex by xiaofengShi

OCR for converting formula images to LaTeX expressions

created 6 years ago
292 stars

Top 91.4% on sourcepulse

Project Summary

This repository provides an upgraded OCR pipeline that converts images of mathematical formulas into LaTeX expressions. It targets researchers and developers who digitize mathematical content and need LaTeX output generated directly from formula images.

How It Works

The core approach utilizes a CNN for feature extraction, followed by a bidirectional RNN on the height dimension to capture contextual dependencies in the image. A GRU-based decoder with an attention mechanism then generates the LaTeX output. Positional encoding, inspired by Transformers, is incorporated into the CNN's final layer. The system handles data preprocessing, including tokenization, padding, and bucketing, to optimize training.
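The preprocessing steps named above (tokenization, bucketing, padding) can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names, the `<pad>`/`<unk>` tokens, the bucket boundaries, and the assumption that formulas arrive as space-separated tokens are all hypothetical.

```python
from collections import defaultdict

# Hypothetical special tokens; the repository may use different ones.
PAD, UNK = "<pad>", "<unk>"


def tokenize(latex):
    """Split a formula into tokens, assuming tokens are already space-separated."""
    return latex.split()


def build_vocab(token_lists):
    """Map each token to an integer id, reserving 0 for padding and 1 for unknowns."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def bucket_by_length(token_lists, boundaries):
    """Group sequences under the smallest boundary that fits them.

    Sequences longer than the largest boundary are dropped.
    """
    buckets = defaultdict(list)
    for tokens in token_lists:
        for bound in sorted(boundaries):
            if len(tokens) <= bound:
                buckets[bound].append(tokens)
                break
    return dict(buckets)


def pad_batch(token_lists, vocab, length):
    """Convert tokens to ids and right-pad every sequence to the same length."""
    pad_id, unk_id = vocab[PAD], vocab[UNK]
    return [
        [vocab.get(tok, unk_id) for tok in tokens] + [pad_id] * (length - len(tokens))
        for tokens in token_lists
    ]


formulas = [r"x ^ { 2 }", r"\frac { a } { b }", r"a + b"]
token_lists = [tokenize(f) for f in formulas]
vocab = build_vocab(token_lists)
buckets = bucket_by_length(token_lists, boundaries=[4, 8])
batches = {bound: pad_batch(seqs, vocab, bound) for bound, seqs in buckets.items()}
print(batches)
```

Bucketing keeps sequences of similar length together, so each batch needs little padding and the decoder spends fewer steps on pad tokens.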

Quick Start & Requirements

Highlighted Details

  • Implements three models: im2katex (image to LaTeX), errorchecker (syntax correction, currently deprecated), and dismodel (discriminator for improving LaTeX renderability).
  • Supports training on handwritten, printed, or merged datasets.
  • Offers both greedy and beam search decoding for LaTeX generation.
  • Pre-trained weights are available via BaiduDisk.
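The difference between the two decoding strategies listed above can be shown with a toy example. This is a generic sketch of greedy and beam search decoding, not the repository's implementation: `step_fn`, `toy_step`, and the token names are hypothetical, and the beam search here applies no length normalization.

```python
import math


def greedy_decode(step_fn, start, end, max_len):
    """Pick the single most probable next token at every step."""
    seq = [start]
    for _ in range(max_len):
        probs = step_fn(seq)
        token = max(probs, key=probs.get)
        seq.append(token)
        if token == end:
            break
    return seq


def beam_search(step_fn, start, end, max_len, beam_size):
    """Keep the beam_size best partial sequences, scored by summed log-probability."""
    beams = [(0.0, [start])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, p in step_fn(seq).items():
                if p > 0:
                    candidates.append((score + math.log(p), seq + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_size]:
            # Hypotheses that emitted the end token leave the beam.
            (finished if seq[-1] == end else beams).append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was reached
    return max(finished, key=lambda c: c[0])[1]


def toy_step(seq):
    """Hypothetical next-token distribution standing in for a trained decoder."""
    table = {
        ("<s>",): {"a": 0.6, "b": 0.4},
        ("<s>", "a"): {"x": 0.55, "y": 0.45},
        ("<s>", "b"): {"z": 1.0},
    }
    return table.get(tuple(seq), {"</s>": 1.0})


print(greedy_decode(toy_step, "<s>", "</s>", max_len=5))
print(beam_search(toy_step, "<s>", "</s>", max_len=5, beam_size=2))
```

In this toy distribution, greedy decoding commits to `a` because it is the best first token, while a beam of width 2 also keeps `b` alive and ends up with a sequence of higher total probability. Beam search costs roughly `beam_size` times more decoder calls per step.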

Maintenance & Community

  • The project references OpenAI's research and provides links to related GitHub repositories and explanations.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not state a license. It links to OpenAI's "Requests For Research" and to related repositories, which suggests a research focus, but those links do not establish any usage terms.

Limitations & Caveats

The errorchecker model, intended to correct LaTeX syntax errors, is deprecated because of poor performance. The dismodel is described as an ongoing optimization effort rather than a finished component. The absence of a stated license may be a barrier to commercial adoption.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 1 star in the last 90 days
