About me

🔍 I’m a fifth-year Ph.D. candidate at Rochester Institute of Technology (RIT), conducting research at the Document and Pattern Recognition Lab (DPRL), under the mentorship of Dr. Richard Zanibbi.

💡 My work centers around designing fast, efficient, and interpretable parsers for recognizing mathematical formulas and chemical diagrams from documents across multiple formats, including PDFs, typeset images, and handwritten strokes. Through graph attention-based techniques in a multi-task learning framework, I aim to enhance contextual information, while preserving a natural and interpretable graph representation.

đŸ’» Recent Projects:

  ‱ Multimodal Chemical Search: A multimodal search tool for retrieving chemical reactions, molecular structures, and associated text from scientific literature, linking visual and textual representations of chemical information.
  ‱ ChemScraper: A molecule diagram parser for extracting molecular graphics from PDFs.
  ‱ MathDeck: A math-aware search system supporting formulas & text.

🌐 Research interests: Pattern recognition, recognition of graphical structures, computer vision, speaker understanding, large language models, multi-modal deep learning, natural language processing.

📰 News

2025


Feb 18, 2025

Submitted our paper on “Multimodal Search in Chemical Documents and Reactions” for publication in the Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’25), Padua, Italy. The system is available online at ReactionMiner.

2024


Dec 19, 2024

Successfully defended and passed my Ph.D. dissertation proposal on “Parsing of Math Formulas and Chemical Diagrams using Graph-Based Representation and Attention Models”.

Sept 3, 2024

Gave an oral presentation on “ChemScraper: Leveraging PDF Graphics Instructions for Molecular Diagram Parsing” at the 18th International Conference on Document Analysis and Recognition, ICDAR 2024, Athens, Greece.

May 2024

A revised paper on ChemScraper has been published at the 18th International Conference on Document Analysis and Recognition, ICDAR 2024, Athens, Greece – Journal Track. The paper describes (1) a fast and accurate technique for parsing born-digital (vector) PDF images, and (2) its use to create training data for a new approach to visual parsing of molecule diagrams in raster images (i.e., pixel-based, such as from PNGs). Code is available and the system is online at ChemScraper.

2023


Nov 14, 2023

A paper describing the ChemScraper parser for molecular diagrams in PDF drawing instructions (‘born-digital’) is available on arXiv here. The system can also generate annotated training data for visual parsers that recognize raster images (i.e., pixel-based, such as PNG). A link to the associated code is provided in a footnote in the paper.

Sept 12, 2023

Co-presented a poster on “ChemScraper: Extracting Molecule Diagrams from PDF Vector and Raster Images with CDXML and SMILES Output” at the Molecule Maker Lab Institute (MMLI) All-Institute Retreat at University of Illinois Urbana-Champaign (UIUC).

Aug 22-23, 2023

Gave a poster presentation talk at Poster Session 1 and the Doctoral Consortium at the 17th International Conference on Document Analysis and Recognition, ICDAR 2023, San José, California.

June 28, 2023

Co-presented a poster on “ChemScraper: Extracting Molecule Diagrams from PDF Vector Images with Page-Level CDXML (ChemDraw) and SMILES Output” at the NSF Annual Review Meeting at University of Illinois Urbana-Champaign (UIUC).

Apr 17, 2023

Gave a Research Idea Ring (RIR) talk on “Line-of-sight with Graph Attention Parser (LGAP) for Math Formulas” at RIT.

Apr 2023

Our paper on the “Line-of-sight with Graph Attention Parser (LGAP) for Math Formulas” was accepted for publication at the 17th International Conference on Document Analysis and Recognition, ICDAR 2023, San José, California.

2022


Sept 27-28, 2022

Co-presented a poster on “Reconstructing the Structure of Molecular Diagrams in PDF Documents using a CNN-Attention-Based Parsing Model” at the Molecule Maker Lab Institute (MMLI) All-Institute Retreat at University of Illinois Urbana-Champaign (UIUC).

Sep 5, 2022

Gave a guest lecture on “Bayesian Decision Theory” for RIT’s undergraduate course, Intro to Machine Learning (40 students).

Aug 28, 2022

Successfully completed my Applied Scientist internship at Amazon (Alexa AI). Started as Graduate Teaching Assistant (GTA) for the undergraduate course CSCI-335 Machine Learning.

May 23, 2022

Started as Applied Scientist Intern at Amazon (Alexa AI). Worked on the Alexa Perceptual Technologies - Speaker Understanding team to improve speaker identification in Alexa devices.

Apr 7, 2022

Gave a Research Idea Ring (RIR) talk on “A Fast and Interpretable Context-aware Parser for Isolated Formulas and Chemical Diagrams” at RIT.

2021


Sept 9, 2021

Gave a virtual poster presentation talk on the MathSeer extraction pipeline at the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, Lausanne, Switzerland.

Sept 2021

MathSeer extraction pipeline released. This tool extracts formula locations and content in PDF documents. The pipeline is available from GitLab, and includes improved versions of SymbolScraper, ScanSSD (now, ScanSSD-XYc), and QD-GGA. The pipeline was prepared by Ayush K. Shah, Abhisek Dey, Matt Langsenkamp, and Prof. Zanibbi.

May 2021

Successfully defended and passed my Ph.D. Research Potential Assessment (RPA) on “Recognition of Mathematical Formulas”.

Apr 2021

Our paper on the “MathSeer formula extraction and evaluation pipeline” was accepted for publication at the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, Lausanne, Switzerland.

2020


Aug 2020

Joined Rochester Institute of Technology (RIT) for Ph.D. in Computing and Information Sciences. Started as Graduate Research Assistant (GRA) at the Document and Pattern Recognition Lab (DPRL) under Prof. Richard Zanibbi.

Jan 2020

Promoted to Machine Learning Engineer Level 1 at Fusemachines Nepal.

2019


Aug 2019

Promoted to Machine Learning Engineer Associate at Fusemachines Nepal.

Aug 2019

Graduated from Kathmandu University as a Computer Engineer.

Jun 2019

Started working as a Machine Learning Engineer Trainee at Fusemachines Nepal.