Cross-Lingual Machine Reading Comprehension


Abstract

Though the community has made great progress on the Machine Reading Comprehension (MRC) task, most previous work addresses English-based MRC problems, and there have been few efforts on other languages, mainly due to the lack of large-scale training data. In this paper, we propose the Cross-Lingual Machine Reading Comprehension (CLMRC) task for languages other than English. First, we present several back-translation approaches for the CLMRC task, which are straightforward to adopt. However, exactly aligning the answer in the source language is difficult and could introduce additional noise. In this context, we propose a novel model called Dual BERT, which takes advantage of the large-scale training data provided by a rich-resource language (such as English), learns the semantic relations between the passage and question in a bilingual context, and then utilizes the learned knowledge to improve reading comprehension performance in the low-resource language. We conduct experiments on two Chinese machine reading comprehension datasets, CMRC 2018 and DRCD. The results show consistent and significant improvements over various state-of-the-art systems by a large margin, which demonstrates the potential of the CLMRC task. Resources available: https://github.com/ymcui/Cross-Lingual-MRC
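
To illustrate why answer alignment after back-translation can be noisy, the following is a minimal sketch and not necessarily the procedure used in the paper. It assumes the translated answer is mapped back onto the source passage by picking the passage window with the highest character-level F1 overlap. The function names `char_f1` and `align_answer` and the `slack` parameter are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Tuple


def char_f1(candidate: str, reference: str) -> float:
    """Character-level F1 between two strings, using multiset overlap."""
    if not candidate or not reference:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def align_answer(passage: str, translated_answer: str, slack: int = 2) -> Tuple[int, int, str]:
    """Return (start, end, span) for the passage window that best matches
    the translated answer. Window lengths range over len(answer) +/- slack
    (an assumed tolerance for translation drift)."""
    target_len = len(translated_answer)
    best = (0, 0, "")
    best_score = 0.0
    for length in range(max(1, target_len - slack), target_len + slack + 1):
        for start in range(0, len(passage) - length + 1):
            span = passage[start:start + length]
            score = char_f1(span, translated_answer)
            # Strict comparison keeps the earliest window among ties.
            if score > best_score:
                best_score = score
                best = (start, start + length, span)
    return best


if __name__ == "__main__":
    passage = "The capital of France is Paris."
    print(align_answer(passage, "Paris"))
```

When the translated answer appears verbatim in the passage, the exact span is recovered. When the translation drifts from the passage wording, the best-scoring window may include extra characters or miss part of the answer, which is the kind of noise the abstract refers to.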

Anthology ID:

D19-1169

Volume:

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Month:

November

Year:

2019

Address:

Hong Kong, China

Editors:

Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan

Venues:

EMNLP|IJCNLP

SIG:

SIGDAT

Publisher:

Association for Computational Linguistics

Pages:

1586–1595

URL:

https://aclanthology.org/D19-1169/

DOI:

10.18653/v1/D19-1169

Cite (ACL):

Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, and Guoping Hu. 2019. Cross-Lingual Machine Reading Comprehension. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1586–1595, Hong Kong, China. Association for Computational Linguistics.

Cite (Informal):

Cross-Lingual Machine Reading Comprehension (Cui et al., EMNLP-IJCNLP 2019)

PDF:

https://aclanthology.org/D19-1169.pdf