Coreference Resolution without Span Representations
Related papers
Incorporating contextual cues in trainable models for coreference resolution
We propose a method that incorporates various novel contextual cues into a machine learning approach for resolving coreference. The distinct characteristics of our model are (i) the incorporation of more linguistic features, capturing contextual information more sophisticated than what is offered in Centering Theory, and (ii) a tournament model for selecting a referent. Our experiments show that this model significantly outperforms earlier machine learning approaches, such as Soon et al. (2001).
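The tournament model mentioned in (ii) can be pictured as a sequence of pairwise matches between candidate antecedents, where a learned classifier decides each match and the surviving candidate is taken as the referent. The sketch below only illustrates that selection scheme, not the authors' implementation; prefers_right is a hypothetical stand-in for the trained pairwise model.

    # Minimal sketch of tournament-style antecedent selection.
    # prefers_right(anaphor, left, right) stands in for a learned pairwise
    # classifier (hypothetical): it returns True when `right` is judged a
    # better antecedent for `anaphor` than `left`.
    def tournament_select(anaphor, candidates, prefers_right):
        if not candidates:
            return None
        winner = candidates[0]
        for challenger in candidates[1:]:
            # the winner of each match advances to the next match
            if prefers_right(anaphor, winner, challenger):
                winner = challenger
        return winner

    # Toy usage: a preference function that always favors the more recent candidate.
    mentions = ["the company", "its CEO", "the new product"]
    closer_wins = lambda anaphor, left, right: True
    print(tournament_select("it", mentions, closer_wins))  # -> "the new product"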
Coreference Resolution: Toward End-to-End and Cross-Lingual Systems
Information, 2020
The task of coreference resolution has attracted considerable attention in the literature due to its importance in deep language understanding and its potential as a subtask in a variety of complex natural language processing problems. In this study, we outline the field's terminology and describe existing metrics, their differences and shortcomings, as well as the available corpora and external resources. We analyze existing state-of-the-art models and approaches, and review recent advances and trends in the field, namely end-to-end systems that jointly model different subtasks of coreference resolution, and cross-lingual systems that aim to overcome the challenges of less-resourced languages. Finally, we discuss the main challenges and open issues faced by coreference resolution systems.
ArXiv, 2020
In the last decade, the field of Neural Language Modelling has witnessed enormous changes, with the development of novel models through the use of Transformer architectures. However, even these models struggle to model long sequences due to memory constraints and increasing computational complexity. Coreference annotations over the training data can provide context far beyond the modelling limitations of such language models. In this paper we present an extension of the Transformer-block architecture used in neural language models, specifically in GPT2, in order to incorporate entity annotations during training. Our model, GPT2E, extends the Transformer layers of GPT2 to Entity-Transformers, an architecture designed to handle coreference information when present. In doing so, we obtain richer representations for entity mentions at negligible training cost. We show the comparative model performance between GPT2 and GPT2E in terms of Perplexity on the CoNLL 2012 ...
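As a rough, hedged sketch of how entity annotations might be folded into a Transformer block (the names EntityAwareBlock and entity_ids, and the specific wiring, are assumptions, not the GPT2E design), one option is to add a learned entity embedding to the token hidden states before self-attention, reserving id 0 for tokens outside any coreference chain:

    # Hedged sketch: inject entity/coreference ids into a Transformer block
    # by adding a learned entity embedding to the hidden states. This is an
    # illustration only, not the GPT2E architecture.
    import torch
    import torch.nn as nn

    class EntityAwareBlock(nn.Module):
        def __init__(self, hidden_size=768, num_heads=12, num_entities=1000):
            super().__init__()
            # entity id 0 is reserved for tokens with no entity annotation
            self.entity_emb = nn.Embedding(num_entities, hidden_size, padding_idx=0)
            self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(hidden_size)

        def forward(self, hidden, entity_ids):
            # hidden: (batch, seq, hidden); entity_ids: (batch, seq) coreference-chain ids
            h = hidden + self.entity_emb(entity_ids)   # inject entity information
            attn_out, _ = self.attn(h, h, h)           # standard self-attention
            return self.norm(h + attn_out)             # residual + layer norm

    # Toy usage: entities 3 and 7 are mentioned in a 5-token sequence.
    block = EntityAwareBlock()
    out = block(torch.randn(1, 5, 768), torch.tensor([[0, 3, 3, 0, 7]]))
    print(out.shape)  # torch.Size([1, 5, 768])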