Line-of-Sight with Graph Attention Parser (LGAP) for Math Formulas

Published in International Conference on Document Analysis and Recognition (ICDAR), 2023

Recommended citation: A. K. Shah and R. Zanibbi, “Line-of-Sight with Graph Attention Parser (LGAP) for Math Formulas,” in Document Analysis and Recognition - ICDAR 2023, G. A. Fink, R. Jain, K. Kise, and R. Zanibbi, Eds., in Lecture Notes in Computer Science. Cham: Springer Nature Switzerland, 2023, pp. 401–419. doi: 10.1007/978-3-031-41734-4_25.

[url] [pdf] [poster] [video] [code]


Recently, there have been notable advancements in encoder-decoder models for parsing the visual appearance of mathematical formulas. These approaches transform input formula images or handwritten stroke sequences into output strings (e.g., LaTeX) representing recognized symbols and their spatial arrangement on writing lines (i.e., a Symbol Layout Tree (SLT)). These sequential encoder-decoder models produce state-of-the-art results but suffer from a lack of interpretability: there is no direct mapping between image regions or handwritten strokes and detected symbols and relationships. In this paper, we present the Line-of-sight with Graph Attention Parser (LGAP), a visual parsing model that treats recognizing formula appearance as a graph search problem. LGAP produces an output SLT from a Maximum Spanning Tree (MST) over input primitives (e.g., connected components in images, or handwritten strokes). LGAP improves the earlier QD-GGA MST-based parser by representing punctuation relationships more consistently in ground truth, using additional context from line-of-sight graph neighbors in visual features, and pooling convolutional features using spatial pyramidal pooling rather than single-region average pooling. These changes improve accuracy while preserving the interpretability of MST-based visual parsing.
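To make the MST step concrete, the following is a minimal sketch of selecting a maximum spanning tree over scored edges between primitives. It is not the LGAP implementation: the function name `maximum_spanning_tree`, the primitives, and the edge scores are hypothetical, and the sketch uses an undirected Kruskal-style selection as a simplification, since the abstract does not say which MST algorithm the parser uses.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def maximum_spanning_tree(nodes: List[str], scores: Dict[Edge, float]) -> List[Edge]:
    """Kruskal's algorithm over edges sorted by descending score.

    Undirected simplification for illustration only.
    """
    parent = {n: n for n in nodes}

    def find(n: str) -> str:
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree: List[Edge] = []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), _score in ranked:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # keep the edge only if it joins two components
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Hypothetical primitives for the formula "x^2 + 1"; scores are made up and
# stand in for classifier confidences on candidate line-of-sight edges.
primitives = ["x", "2", "+", "1"]
edge_scores = {
    ("x", "2"): 0.90,
    ("x", "+"): 0.80,
    ("2", "+"): 0.30,
    ("+", "1"): 0.95,
    ("x", "1"): 0.10,
}

tree = maximum_spanning_tree(primitives, edge_scores)
print(tree)
```

Every selected edge connects two input primitives directly, which is the property the abstract points to when it describes MST-based parsing as interpretable: each relationship in the output can be traced back to specific strokes or connected components.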


author="Shah, Ayush Kumar and Zanibbi, Richard",                      
editor="Fink, Gernot A. and Jain, Rajiv and Kise, Koichi and Zanibbi, Richard",                      
title="Line-of-Sight with Graph Attention Parser (LGAP) for Math Formulas",
booktitle="Document Analysis and Recognition - ICDAR 2023",
publisher="Springer Nature Switzerland",    
