DAGOBAH: Enhanced Scoring Algorithms for Scalable Annotations of Tabular Data
Related papers
JenTab: A Toolkit for Semantic Table Annotations
2021
Tables are a ubiquitous source of structured information. However, their use in automated pipelines is severely affected by conflicts in naming and issues like missing entries or spelling mistakes. The Semantic Web has proven itself a valuable tool in dealing with such issues, allowing the fusion of data from heterogeneous sources. Its usage requires the annotation of table elements like cells and columns with entities from existing knowledge graphs. Automating this semantic annotation, especially for noisy tabular data, remains a challenge, though. JenTab is a modular system that maps table contents onto large knowledge graphs like Wikidata. It starts by creating an initial pool of candidates for possible annotations. Over multiple iterations, context information is then used to eliminate candidates until, eventually, a single annotation is identified as the best match. Based on the SemTab2020 dataset, this paper presents various experiments to evaluate the performance of JenTab. This ...
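The iterative candidate-elimination loop the abstract describes can be sketched roughly as follows. This is a minimal illustration under our own assumptions (the context signal here is simply the column's most common candidate type), not JenTab's actual implementation:

```python
from collections import Counter

# Sketch: each cell starts with a pool of (entity, type) candidates.
# Per iteration, candidates that disagree with the column's dominant
# candidate type are eliminated until one annotation per cell remains.

def prune_candidates(columns):
    """columns: list of columns; each column is a list of cells;
    each cell is a list of (entity, entity_type) candidate pairs."""
    for column in columns:
        while any(len(cell) > 1 for cell in column):
            # Context signal: most frequent candidate type in the column.
            type_counts = Counter(t for cell in column for _, t in cell)
            best_type, _ = type_counts.most_common(1)[0]
            progressed = False
            for i, cell in enumerate(column):
                kept = [c for c in cell if c[1] == best_type]
                if 0 < len(kept) < len(cell):
                    column[i] = kept
                    progressed = True
            if not progressed:
                # Context can no longer discriminate: keep the first
                # candidate of each still-ambiguous cell as a fallback.
                for i, cell in enumerate(column):
                    column[i] = cell[:1]
    return [[cell[0][0] for cell in column] for column in columns]
```

For example, a column containing "Paris" (city or person candidates) next to unambiguous city cells would converge on the city reading after one iteration.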
Semantic Annotation for Tabular Data
ArXiv, 2020
Detecting semantic concept of columns in tabular data is of particular interest to many applications ranging from data integration, cleaning, search to feature engineering and model building in machine learning. Recently, several works have proposed supervised learning-based or heuristic pattern-based approaches to semantic type annotation. Both have shortcomings that prevent them from generalizing over a large number of concepts or examples. Many neural network based methods also present scalability issues. Additionally, none of the known methods works well for numerical data. We propose C², a column to concept mapper that is based on a maximum likelihood estimation approach through ensembles. It is able to effectively utilize vast amounts of, albeit somewhat noisy, openly available table corpora in addition to two popular knowledge graphs to perform effective and efficient concept prediction for structured data. We demonstrate the effectiveness of C² over available technique...
Semantic Concept Annotation for Tabular Data
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021
Determining the semantic concepts of columns in tabular data is of use for many applications ranging from data integration, cleaning, search to feature engineering and model building in machine learning. Several prior works have proposed supervised learning-based or heuristic-based approaches to semantic type annotation. These techniques suffer from poor generalizability over a large number of concepts or examples. Recent neural network based supervised learning methods generalize to different datasets but require large amounts of curated training data and also present scalability issues. Furthermore, none of the known methods works well for numerical data. We present C², a system that maps each column to a concept based on a maximum likelihood estimation approach through ensembles. It is able to effectively utilize vast amounts of, albeit somewhat noisy, openly available table corpora in addition to two popular knowledge graphs (Wikidata and DBpedia), to perform effective and efficient concept annotation for tabular data. Specifically, we utilize a collection of 32 million openly available webtables from several sources. We also present efficient indexing techniques for categorical string, numeric and mixed-type data, and novel techniques for table context utilization. We demonstrate the effectiveness and efficiency of C² over available techniques on 9 real-world datasets containing a wide variety of concepts.
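A maximum-likelihood concept choice over an ensemble of noisy evidence sources, as the abstract outlines, might look roughly like this. The sources, the averaging rule, and the smoothing constant are all illustrative assumptions, not the paper's actual estimator:

```python
import math

# Toy sketch: choose the concept c maximizing the log-likelihood of the
# observed column values under per-source estimates P(value | concept),
# combined across an ensemble of sources (e.g. a web-table corpus and a
# knowledge graph). All probabilities are hypothetical numbers.

def best_concept(values, sources, concepts, smoothing=1e-6):
    """sources: list of dicts mapping (concept, value) -> P(value | concept)."""
    scores = {}
    for concept in concepts:
        log_lik = 0.0
        for value in values:
            # Ensemble: average the per-source probability estimates.
            p = sum(src.get((concept, value), 0.0) for src in sources) / len(sources)
            log_lik += math.log(p + smoothing)  # smooth to avoid log(0)
        scores[concept] = log_lik
    return max(scores, key=scores.get)
```

In this setup a column whose values are frequent under "city" in both the corpus and the knowledge graph outscores a concept supported by only one noisy source.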
Semantically Conceptualizing and Annotating Tables
Lecture Notes in Computer Science, 2008
Enabling a system to automatically conceptualize and annotate a human-readable table is one way to create interesting semantic-web content. But exactly "how?" is not clear. With conceptualization and annotation in mind, we investigate a semantic-enrichment procedure as a way to turn syntactically observed table layout into semantically coherent ontological concepts, relationships, and constraints. Our semantic-enrichment procedure shows how to make use of auxiliary world knowledge to construct rich ontological structures and to populate these ontological structures with instance data. The system uses auxiliary knowledge (1) to recognize concepts and which data values belong to which concepts, (2) to discover relationships among concepts and which data-value combinations represent relationship instances, and (3) to discover constraints over the concepts and relationships that the data values and data-value combinations should satisfy. Experimental evaluations indicate that the automatic conceptualization and annotation processes perform well, yielding F-measures of 90% for concept recognition, 77% for relationship discovery, and 90% for constraint discovery in web tables selected from the geopolitical domain.
Exploiting a Web of Semantic Data for Interpreting Tables
2010
Much of the world's knowledge is contained in structured documents like spreadsheets, database relations and tables in documents found on the Web and in print. The information in these tables might be much more valuable if it could be appropriately exported or encoded in RDF, making it easier to share, understand and integrate with other information. This is especially true if it could be linked into the growing linked data cloud. We describe techniques to automatically infer a (partial) semantic model for information in tables using both table headings, if available, and the values stored in table cells and to export the data the table represents as linked data. The techniques have been prototyped for a subset of linked data that covers the core of Wikipedia.
Entity Linking to Knowledge Graphs to Infer Column Types and Properties
2019
This paper describes our broad goal of linking tabular data to semantic knowledge graphs, as well as our specific attempts at solving the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching. Our efforts were split into a Candidate Generation and a Candidate Selection phase. The former involves searching for relevant entities in knowledge bases, while the latter involves picking the top candidate using various techniques such as heuristics (the ‘TF-IDF’ approach) and machine learning (the Neural Network Ranking model). We achieve an F1 score of 0.826 without any training data on the 400000+ cells to be annotated in Round 2 CEA challenge. On CTA and CPA variants, we score 1.099 and 0.790 respectively.
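The 'TF-IDF' candidate-selection heuristic mentioned above can be sketched as scoring each candidate entity by how strongly its descriptive tokens, weighted by TF-IDF over the candidate pool, overlap the cell's row context. The tokenization and candidate representation here are our own illustrative assumptions:

```python
import math
from collections import Counter

# Hypothetical sketch of TF-IDF candidate selection: each candidate entity
# is a bag of descriptive tokens (e.g. from its label and properties); we
# pick the candidate whose TF-IDF-weighted tokens best match the context.

def pick_candidate(context_tokens, candidates):
    """candidates: dict mapping candidate id -> list of descriptive tokens."""
    docs = list(candidates.values())
    n = len(docs)
    # Document frequency of each token across the candidate pool.
    df = Counter(tok for doc in docs for tok in set(doc))
    ctx = set(context_tokens)

    def score(cand_id):
        tf = Counter(candidates[cand_id])
        # Sum TF-IDF weights of tokens shared with the row context.
        return sum(tf[t] * math.log((1 + n) / (1 + df[t]))
                   for t in tf if t in ctx)

    return max(candidates, key=score)
```

Tokens shared by every candidate (here, an ambiguous surface form like "paris") get near-zero IDF, so discriminative context such as "france" or "capital" decides the ranking.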
Towards Capturing Contextual Semantic Information About Statements in Web Tables
2018
Data published on the Web is growing every year. However, most of this data does not have a semantic representation. Web tables are an example of structured data on the Web that has no clear semantics. While there is an emerging research effort in lifting tabular data into semantic web formats, most of the work is focused around entity recognition in tables with simple structure. In this work we explore how to capture the semantics of complex tables and transform them into a knowledge graph. These complex tables include contextual information about statements, such as time or provenance. Hence, we need to use contextualized knowledge graphs to represent the information of the tables. We explore how this contextual information is represented in tables, relate it to previous classifications of web tables, and show how to encode it in RDF using different approaches. Finally, we present a prototype tool that converts web tables from Wikipedia into RDF, trying to cover all existing approaches.
Using linked data to interpret tables
… Linked Data, held in …, 2010
Vast amounts of information are available in structured forms like spreadsheets, database relations, and tables found in documents and on the Web. We describe an approach that uses linked data to interpret such tables and associate their components with nodes in a reference linked data collection. Our proposed framework assigns a class (i.e. type) to table columns, links table cells to entities, and maps inferred relations between columns to properties. The resulting interpretation can be used to annotate tables, confirm existing facts in the linked data collection, and propose new facts to be added. Our implemented prototype uses DBpedia as the linked data collection and Wikitology for background knowledge. We evaluated its performance using a collection of tables from Google Squared, Wikipedia and the Web.
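The column-typing step in such frameworks is often approximated by a majority vote over the types of the entities the cells were linked to. A minimal sketch of that idea, under our own assumptions (a flat type label per cell and a simple support threshold), not the paper's actual algorithm:

```python
from collections import Counter

# Sketch: assign a class to a table column by majority vote over the
# types of the entities its cells were linked to.

def column_class(linked_types, min_support=0.5):
    """linked_types: list of type labels, one per successfully linked cell.
    Returns the majority type, or None if no type reaches min_support."""
    if not linked_types:
        return None
    typ, count = Counter(linked_types).most_common(1)[0]
    return typ if count / len(linked_types) >= min_support else None
```

The threshold keeps a column with no dominant type unannotated rather than mislabeled, which matters when cell linking is noisy.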
T2LD: Interpreting and Representing Tables as Linked Data
Proc. Poster and …, 2010
We describe a framework and prototype system for interpreting tables and extracting entities and relations from them, and producing a linked data representation of the table's contents. This can be used to annotate the table or to add new facts to the linked data collection.
Relation Extraction from Tables using Artificially Generated Metadata
2021
Relation Extraction (RE) from tables is the task of identifying relations between pairs of columns of a table. Generally, RE models for this task require labelled tables for training. These labelled tables can also be generated artificially from a Knowledge Graph (KG), which makes the cost to acquire them much lower in comparison to manual annotations. However, unlike real tables, these synthetic tables lack associated metadata, such as column headers and captions; this is because synthetic tables are created out of KGs that do not store such metadata. Meanwhile, previous works have shown that metadata is important for accurate RE from tables. To address this issue, we propose methods to artificially create some of this metadata for synthetic tables. Afterward, we experiment with a BERT-based model, in line with recently published works, that takes as input a combination of the proposed artificial metadata and table content. Our empirical results show that this leads to an improvemen...
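One simple way to create such artificial metadata is to reuse the labels of the KG entities and properties a synthetic table was sampled from as its column headers. The following is a hypothetical sketch of that idea; the label map and identifiers are toy stand-ins for a real KG, not the method proposed in the paper:

```python
# Sketch: generate artificial column headers for a synthetic table built
# from KG triples, by reusing the labels of the subject type and of the
# properties each column was sampled from.

def synthesize_headers(subject_type, property_ids, label_of):
    """Return a header row: the subject type's label first, then one
    header per property column, falling back to the raw id when the KG
    has no label for it."""
    return [label_of.get(subject_type, subject_type)] + [
        label_of.get(pid, pid) for pid in property_ids
    ]
```

For instance, a synthetic table of people with birth-place and death-date columns would receive the headers "human", "place of birth", and so on, giving the RE model header text comparable to what real web tables carry.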