TensorFlow implementation of an im2latex system
This repository provides a TensorFlow implementation of a deep learning model that decompiles images of rendered LaTeX formulas into their LaTeX source code. It targets researchers and developers interested in image-to-markup conversion and addresses the im2latex problem: recovering the markup of a mathematical expression from its rendered image.
How It Works
The system employs an encoder-decoder architecture with an attention mechanism, mirroring the approach in the HarvardNLP paper "What You Get Is What You See: A Visual Markup Decompiler." The encoder processes the input image to extract visual features, while the decoder generates the LaTeX markup token by token. The attention mechanism allows the decoder to focus on relevant image regions during generation, improving accuracy for complex formulas.
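To make the attention step concrete, here is a minimal sketch in plain Python of how a decoder state can be turned into weights over image regions and a context vector. It is illustrative only: the dot-product scoring, the function names `softmax` and `attend`, and the toy feature values are assumptions, not code taken from the repository or the paper.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """One attention step over encoder features.

    query:    decoder state vector (hypothetical, length d)
    features: one length-d feature vector per image region
    Returns the attention weights and the weighted-sum context vector.
    """
    # Assumed scoring function: a plain dot product between the decoder
    # state and each region's features. Real models often learn this.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three image regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 5.0]  # this decoder state matches regions 1 and 2

weights, context = attend(query, features)
print(weights)
print(context)
```

In a full model the decoder would combine the context vector with its own state to predict the next LaTeX token, then repeat the step with the updated state.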
Quick Start & Requirements
Preprocessing scripts: preprocess_images.py, preprocess_formulas.py, preprocess_filter.py, generate_latex_vocab.py.
Model script: attention.py.
Prediction: the predict() function in attention.py, or the Predict.ipynb notebook.
Highlighted Details
Maintenance & Community
The project is a personal implementation by ritheshkumar95, based on the HarvardNLP work. No specific community channels or active maintenance signals are evident in the README.
Licensing & Compatibility
The README does not explicitly state a license. The original HarvardNLP implementation is available under a permissive license. Compatibility with commercial or closed-source projects is not specified.
Limitations & Caveats
The preprocessing steps are extensive and require careful execution. The project relies on several external tools (Node.js, KaTeX, pdflatex, ImageMagick, Webkit2png), which add complexity to the setup. The README reports performance only on an Nvidia M40 and does not cover other hardware.
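Because the setup depends on several external programs, it can help to check which ones are on the PATH before starting the preprocessing. The sketch below is a hypothetical helper, not part of the repository. The executable names (`node`, `pdflatex`, `convert`, `webkit2png`) are assumptions that may differ by platform or version, and KaTeX is left out because it is a JavaScript package rather than a standalone executable.

```python
import shutil

# Assumed executable names for each tool; adjust them for your system.
REQUIRED_TOOLS = {
    "Node.js": "node",
    "pdflatex": "pdflatex",
    "ImageMagick": "convert",
    "Webkit2png": "webkit2png",
}


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the labels of tools whose executable is not found on PATH."""
    return [label for label, exe in tools.items() if shutil.which(exe) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All assumed executables were found.")
```

A tool reported as missing may simply be installed under a different name, so treat the result as a hint rather than a definitive check.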