Constrained BERT BiLSTM CRF for understanding multi-sentence entity-seeking questions

Understanding Complex Multi-sentence Entity-seeking Questions

2019

We present the novel task of understanding multi-sentence entity-seeking questions (MSEQs), i.e., questions that may be expressed in multiple sentences and that expect one or more entities as an answer. We formulate the problem of understanding MSEQs as a semantic labeling task over an open representation that makes minimal assumptions about schema- or ontology-specific semantic vocabulary. At the core of our model we use a BiDiLSTM (bi-directional LSTM) CRF; to overcome the challenges of operating with little training data, we supplement it with hand-designed features as well as hard and soft constraints spanning multiple sentences. We find that this yields a 6-7 pt gain over a vanilla BiDiLSTM CRF. We demonstrate the strengths of our work on the novel task of answering real-world entity-seeking questions from the tourism domain. The use of our labels helps answer 53% more questions with 42% more accuracy compared to baselines.
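The hard constraints described above can be folded directly into CRF decoding by disallowing label transitions at Viterbi time. The following is a minimal sketch of that idea, not the paper's implementation: the label set, the single BIO-style constraint, and all emission/transition scores are made-up illustrations.

```python
import math

# Hypothetical label set for entity-seeking questions.
LABELS = ["O", "B-ENT", "I-ENT"]

def allowed(prev, cur):
    """Hard constraint: I-ENT may only continue an entity span."""
    if cur == "I-ENT":
        return prev in ("B-ENT", "I-ENT")
    return True

def constrained_viterbi(emissions, transitions):
    """emissions: list of {label: score} per token;
    transitions: {(prev, cur): score}. Returns the best legal path."""
    n = len(emissions)
    best = [{} for _ in range(n)]
    back = [{} for _ in range(n)]
    # Treat the start as label "O" so spans cannot begin with I-ENT.
    for lab in LABELS:
        best[0][lab] = emissions[0][lab] if allowed("O", lab) else -math.inf
    for t in range(1, n):
        for cur in LABELS:
            cands = [
                (best[t - 1][prev] + transitions[(prev, cur)] + emissions[t][cur], prev)
                for prev in LABELS
                if allowed(prev, cur)
            ]
            score, prev = max(cands) if cands else (-math.inf, "O")
            best[t][cur], back[t][cur] = score, prev
    # Backtrack from the best final label.
    lab = max(best[-1], key=best[-1].get)
    path = [lab]
    for t in range(n - 1, 0, -1):
        lab = back[t][lab]
        path.append(lab)
    return list(reversed(path))
```

Soft constraints would instead add a penalty to the transition score rather than ruling the transition out entirely.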

Evaluating Entity Models on the TREC Question Answering Task

2010

We propose entity models, a representation of the language used to describe a named entity (person, organization, or location). The model is purely statistical and constructed from snippets of text surrounding mentions of an entity. We evaluate the effectiveness of entity models for fact-based question answering. The results obtained on question answering are promising, indicating that entity models contain useful information that would aid textual data mining and other related tasks.

CLEF 2009 Question Answering Experiments at Tokyo Institute of Technology

2015

In this paper we describe the experiments carried out at Tokyo Institute of Technology for the CLEF 2009 Question Answering on Speech Transcriptions (QAST) task, where we participated in the English track. We apply a non-linguistic, data-driven approach to Question Answering (QA). Relevant sentences are first retrieved from the supplied corpus, using a language model based sentence retrieval module. Our probabilistic answer extraction module then pinpoints exact answers in these sentences. In this year's QAST task the question set contains both factoid and non-factoid questions, where the non-factoid questions ask for definitions of given named entities. We do not make any adjustments of our factoid QA system to account for non-factoid questions. Moreover, we are presented with the challenge of searching for the right answer in a relatively small corpus. Our system is built to take advantage of redundant information in large corpora, however, in this task such redundancy is not ava...

RECURRENT CONDITIONAL RANDOM FIELD FOR LANGUAGE UNDERSTANDING

Recurrent neural networks (RNNs) have recently produced record-setting performance in language modeling and word-labeling tasks. In the word-labeling task, the RNN is used analogously to the more traditional conditional random field (CRF) to assign a label to each word in an input sequence, and has been shown to significantly outperform CRFs. In contrast to CRFs, RNNs operate in an online fashion to assign labels as soon as a word is seen, rather than after seeing the whole word sequence. In this paper, we show that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model; specifically, the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding. We term the resulting model a "recurrent conditional random field" and demonstrate its effectiveness on the ATIS travel domain dataset and a variety of web-search language understanding datasets.

QuesBELM: A BERT based Ensemble Language Model for Natural Questions

2020 5th International Conference on Computing, Communication and Security (ICCCS), 2020

A core goal in artificial intelligence is to build systems that can read the web, and then answer complex questions related to random searches about any topic. These question-answering (QA) systems could have a big impact on the way that we access information. In this paper, we addressed the task of question-answering (QA) systems on Google's Natural Questions (NQ) dataset, which contains real user questions issued to Google search and the answers found from Wikipedia by annotators. In our work, we systematically compare the performance of powerful variant models of Transformer architectures (BERT-base, BERT-large-WWM, and ALBERT-XXL) over the Natural Questions dataset. We also propose a state-of-the-art BERT-based ensemble language model, QuesBELM. QuesBELM leverages the power of existing BERT variants combined together to build a more accurate stacking ensemble model for question answering (QA) system. The model integrates top-K predictions from single language models to determine the best an...
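Integrating top-K predictions from several models can be as simple as a weighted score aggregation over candidate answers. The sketch below illustrates that general idea only; the model names, weights, and candidates are invented, and QuesBELM's actual stacking procedure may differ.

```python
from collections import defaultdict

def ensemble_answers(model_topk, model_weights):
    """model_topk: {model_name: [(answer, score), ...]} with each model's
    top-K candidates; model_weights: per-model weight (default 1.0).
    Returns answers ranked by weighted total score."""
    totals = defaultdict(float)
    for model, candidates in model_topk.items():
        w = model_weights.get(model, 1.0)
        for answer, score in candidates:
            totals[answer] += w * score
    return sorted(totals, key=totals.get, reverse=True)
```

A learned stacking ensemble would replace the fixed weights with a meta-model trained on held-out predictions.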

Distilling Task-Specific Knowledge from BERT into Simple Neural Networks

In the natural language processing literature, neural networks are becoming increasingly deep and complex. The recent poster child of this trend is the deep language representation model, which includes BERT, ELMo, and GPT. These developments have led to the conviction that previous-generation, shallower neural networks for language understanding are obsolete. In this paper, however, we demonstrate that rudimentary, lightweight neural networks can still be made competitive without architecture changes, external training data, or additional input features. We propose to distill knowledge from BERT, a state-of-the-art language representation model, into a single-layer BiLSTM, as well as its siamese counterpart for sentence-pair tasks. Across multiple datasets in paraphrasing, natural language inference, and sentiment classification, we achieve comparable results with ELMo, while using roughly 100 times fewer parameters and 15 times less inference time.

Bidirectional long-short term memory and conditional random field for tourism named entity recognition

IAES International Journal of Artificial Intelligence (IJ-AI)

The common thing to do when planning a trip is to search for a tourist destination. This process is often done using search engines and reading articles on the internet. However, it takes much time to search for such information, as to obtain relevant information, we have to read several available articles. Named entity recognition (NER) can detect named entities in a text to help users find the desired information. This study aims to create a NER model that will help to detect tourist attractions in an article. The articles used for the dataset are English articles obtained from the internet. We built our NER model using bidirectional long-short term memory (BiLSTM) and conditional random fields (CRF), with Word2Vec as a feature. Our proposed model achieved the best results among all scenarios tested, with an average F1-score of 75.25%.

Named entity recognition for question answering

2006

Current text-based question answering (QA) systems usually contain a named entity recogniser (NER) as a core component. Named entity recognition has traditionally been developed as a component for information extraction systems, and current techniques are focused on this end use. However, no formal assessment has been done on the characteristics of a NER within the task of question answering. In this paper we present a NER that aims at higher recall by allowing multiple entity labels per string. The NER is embedded in a question answering system and the overall QA system performance is compared to that of one with a traditional variation of the NER that only allows single entity labels. It is shown that the added noise introduced by the additional labels is offset by the higher recall gained, thereby giving the QA system a better chance to find the answer.

Drexel at TREC 2007: Question Answering

The TREC Question Answering Track presented several distinct challenges to participants in 2007. Participants were asked to create a system which discovers the answers to factoid and list questions about people, entities, organizations and events, given both blog and newswire text data sources. In addition, participants were asked to expose interesting information nuggets which exist in the data collection, which were not uncovered by the factoid or list questions. This year is the first time the Intelligent Information Processing group at Drexel has participated in the TREC Question Answering Track. As such, our goal was the development of a Question Answering system framework to which future enhancements could be made, and the construction of simple components to populate the framework. The results of our system this year were not significant; our primary accomplishment was the establishment of a baseline system which can be improved upon in 2008 and going forward.