OCR for converting formula images to LaTeX expressions
This repository provides an upgraded OCR solution for converting mathematical formula images into LaTeX expressions. It's designed for researchers and developers working on mathematical content digitization, offering a robust pipeline for accurate LaTeX generation from visual inputs.
How It Works
The core approach uses a CNN for feature extraction, followed by a bidirectional RNN along the height dimension to capture contextual dependencies in the image. A GRU-based decoder with an attention mechanism then generates the LaTeX output. Positional encoding, inspired by Transformers, is incorporated into the CNN's final layer. Data preprocessing covers tokenization, padding, and bucketing (grouping samples of similar size so that less padding is needed), which keeps training efficient.
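The preprocessing steps named above (tokenization, padding, bucketing) can be sketched in plain Python. This is a minimal illustration, not the repository's code: the special tokens, the whitespace tokenizer, the function names, and the bucket boundaries are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical special tokens; the repository's actual vocabulary may differ.
PAD, START, END, UNK = "<pad>", "<s>", "</s>", "<unk>"


def tokenize(formula):
    """Split a LaTeX string on whitespace (assumes tokens are already space-separated)."""
    return formula.split()


def build_vocab(token_lists):
    """Map every token to an integer id; special tokens get the lowest ids."""
    vocab = {tok: i for i, tok in enumerate([PAD, START, END, UNK])}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Wrap a token list in start/end markers and convert it to ids."""
    unk = vocab[UNK]
    return [vocab[START]] + [vocab.get(t, unk) for t in tokens] + [vocab[END]]


def bucket_and_pad(sequences, boundaries, pad_id):
    """Put each sequence in the smallest bucket that fits and pad it to the bucket size."""
    buckets = defaultdict(list)
    for seq in sequences:
        for limit in sorted(boundaries):
            if len(seq) <= limit:
                buckets[limit].append(seq + [pad_id] * (limit - len(seq)))
                break
        # Sequences longer than the largest boundary are silently skipped.
    return dict(buckets)


formulas = [r"x ^ 2", r"\frac { a } { b }", r"a + b", r"\sqrt { x ^ 2 + y ^ 2 }"]
tokens = [tokenize(f) for f in formulas]
vocab = build_vocab(tokens)
encoded = [encode(t, vocab) for t in tokens]
batches = bucket_and_pad(encoded, boundaries=[6, 12], pad_id=vocab[PAD])

if __name__ == "__main__":
    for limit, seqs in sorted(batches.items()):
        print(limit, seqs)
```

Here the two short formulas land in the bucket of size 6 and the two longer ones in the bucket of size 12, so every batch has a uniform length and the short formulas are not padded out to the length of the longest one.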
Quick Start & Requirements
Testing: make im2katex-inference
Prerequisites: ImageMagick (Linux: sudo apt install imagemagick, Mac: brew install imagemagick), pdflatex
Server: make server
Highlighted Details
Models: im2katex (image to LaTeX), errorchecker (syntax correction, currently deprecated), and dismodel (discriminator for improving LaTeX renderability).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The errorchecker model, intended for correcting LaTeX syntax errors, is noted as deprecated due to poor performance. The dismodel is an ongoing optimization effort. The lack of explicit licensing information may pose a barrier to commercial adoption.
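To make "renderability" concrete, the sketch below checks whether a generated formula compiles with pdflatex, which is listed among the prerequisites. It is not how dismodel works (that is a learned discriminator); it only illustrates one way a renderable/non-renderable label could be obtained. The template, the function names, and the timeout are assumptions made for the example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical minimal document used to compile a single formula.
TEMPLATE = (
    "\\documentclass{article}\n"
    "\\usepackage{amsmath}\n"
    "\\pagestyle{empty}\n"
    "\\begin{document}\n"
    "\\[ %s \\]\n"
    "\\end{document}\n"
)


def wrap_formula(formula):
    """Embed a LaTeX formula in a standalone document."""
    return TEMPLATE % formula


def is_renderable(formula, timeout=10):
    """Return True/False depending on whether pdflatex compiles the formula.

    Returns None when pdflatex is not installed, so callers can tell
    "could not check" apart from "does not render".
    """
    if shutil.which("pdflatex") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        tex = Path(tmp) / "formula.tex"
        tex.write_text(wrap_formula(formula))
        try:
            result = subprocess.run(
                ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", tex.name],
                cwd=tmp,
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0


if __name__ == "__main__":
    for candidate in [r"\frac{a}{b}", r"\frac{a}{b"]:
        print(candidate, is_renderable(candidate))
```

A compile check like this is slow compared with a forward pass through a model, which is one plausible reason to train a discriminator instead of calling pdflatex on every prediction.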