Encode Geometric Diagram as Geo-Graph in Geometry Problem Solving
Authors
- Wenjun Wu, Xi'an Jiaotong University
- Lingling Zhang, Xi'an Jiaotong University
- Bo Zhao, Xi'an Jiaotong University
- Bo Li, Xi'an Jiaotong University
- Xinyu Zhang, Xi'an Jiaotong University
- Yaqiang Wu, Lenovo Research
DOI:
https://doi.org/10.1609/aaai.v40i23.39020
Abstract
Geometry Problem Solving has attracted growing attention in recent years because it requires a machine to combine geometric abstraction, multi-modal reasoning, and mathematical capabilities. Most existing work focuses on fusing multi-modal data or on combining neural and symbolic systems to improve performance. However, such work neglects the characteristics that distinguish geometric diagrams from natural images, which hinders further exploration of the critical information these diagrams contain. In this work, we introduce the novel concept of the geo-graph and propose the Geo-Graph Geometry Problem Solving model, which encodes the geometric diagram from a new perspective. The geo-graph is designed to capture the semantic, structural, and spatial information in the diagram, all of which is crucial to the subsequent problem reasoning stage. To help the model comprehend the actual layout of the geometric diagram, spatial and connecting attentions are devised to serve as intrinsic knowledge guidance for feature propagation. An additional cross-modal attention serves as external guidance, steering the encoding of the geo-graph toward the specific problem target. The fused multi-modal features are then fed into a commonly used encoder-decoder framework for final solution generation. The model is first trained on three carefully designed pre-training tasks to establish its fundamental knowledge of the geo-graph, leveraging numerous varied samples generated through a geo-graph-based augmentation method. Experiments on popular geometry problem solving datasets demonstrate the effectiveness and superiority of our model for geometric diagram encoding.
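The abstract does not specify how the geo-graph or its attentions are defined, so the following is only a minimal illustrative sketch of one possible reading, not the authors' implementation. It assumes nodes carry a semantic kind and 2D coordinates, edges record structural connections, a "connecting" guidance masks attention to connected nodes, and a "spatial" guidance biases scores by Euclidean distance. The names `GeoGraph`, `attention_weights`, and `tau`, as well as the node kinds and coordinates, are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class GeoGraph:
    """Toy geo-graph: semantic kind + position per node, plus connections."""
    nodes: dict = field(default_factory=dict)   # name -> (kind, (x, y))
    edges: set = field(default_factory=set)     # set of frozenset({a, b})

    def add_node(self, name, kind, xy):
        self.nodes[name] = (kind, xy)

    def add_edge(self, a, b):
        self.edges.add(frozenset((a, b)))

    def connected(self, a, b):
        # A node is treated as connected to itself (assumption).
        return a == b or frozenset((a, b)) in self.edges


def attention_weights(graph, query, tau=1.0):
    """Softmax over nodes connected to `query` (connecting mask),
    scored by negative distance / tau (spatial bias). Hypothetical."""
    qx, qy = graph.nodes[query][1]
    scores = {}
    for name, (_, (x, y)) in graph.nodes.items():
        if graph.connected(query, name):
            scores[name] = -math.hypot(x - qx, y - qy) / tau
    top = max(scores.values())                  # subtract max for stability
    exps = {n: math.exp(s - top) for n, s in scores.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}


# Right triangle ABC plus an isolated point D (illustrative data only).
g = GeoGraph()
g.add_node("A", "point", (0.0, 0.0))
g.add_node("B", "point", (4.0, 0.0))
g.add_node("C", "point", (0.0, 3.0))
g.add_node("D", "point", (10.0, 10.0))
for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    g.add_edge(a, b)

w = attention_weights(g, "A")
```

In this toy setup, `D` receives no weight from `A` because it is not connected, and among the connected nodes the nearer `C` outweighs the farther `B`. A learned model would combine such guidance with feature-based scores rather than use geometry alone.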
How to Cite
Wu, W., Zhang, L., Zhao, B., Li, B., Zhang, X., & Wu, Y. (2026). Encode Geometric Diagram as Geo-Graph in Geometry Problem Solving. Proceedings of the AAAI Conference on Artificial Intelligence, 40(23), 19424-19432. https://doi.org/10.1609/aaai.v40i23.39020
Issue
Section
AAAI Technical Track on Knowledge Representation and Reasoning