Using BERT Encoding and Sentence-Level Language Model for Sentence Ordering

A New Sentence Ordering Method using BERT Pretrained Model

2020

Building systems capable of natural language understanding (NLU) has been one of the oldest areas of AI. An essential component of NLU is detecting the logical succession of events contained in a text. The task of sentence ordering is proposed to learn this succession of events, with applications in downstream AI tasks. The performance of previous works employing statistical methods is poor, while neural network-based approaches require large corpora for model learning. In this paper, we propose a method for sentence ordering which needs neither a training phase nor, consequently, a large corpus for learning. To this end, we generate sentence embeddings using the pre-trained BERT model and measure sentence similarity using the cosine similarity score. We suggest this score as an indicator of the level of coherence of sequential events. We finally sort the sentences through a brute-force search to maximize the overall similarity of the sequenced sentences. Our proposed method outperformed othe...
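A minimal sketch of the pipeline this abstract describes (not the authors' released code): embed is a stand-in for any pre-trained BERT-style sentence encoder returning a vector, and the brute-force search tries every permutation, so it is only feasible for short paragraphs.

```python
from itertools import permutations

import numpy as np


def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def order_sentences(sentences, embed):
    """Return the ordering of `sentences` that maximizes adjacent similarity."""
    vecs = [embed(s) for s in sentences]
    best_order, best_score = None, float("-inf")
    for perm in permutations(range(len(sentences))):  # O(n!) brute force
        score = sum(cosine(vecs[i], vecs[j]) for i, j in zip(perm, perm[1:]))
        if score > best_score:
            best_order, best_score = perm, score
    return [sentences[i] for i in best_order]
```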

A Simple yet Effective Method for Sentence Ordering

2021

Sentence ordering is the task of arranging a given bag of sentences so as to maximise the coherence of the overall text. In this work, we propose a simple yet effective training method that improves the capacity of models to capture overall text coherence based on training over pairs of sentences/segments. Experimental results show the superiority of our proposed method in in- and cross-domain settings. The utility of our method is also verified over a multi-document summarisation task.
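The abstract leaves the training recipe open; the following is a hypothetical illustration of how pairwise ordering supervision can be generated from already-ordered documents, with any text-pair classifier then trained on the resulting labels.

```python
import random


def make_pair_examples(ordered_sentences, n_samples=10, rng=random):
    """Sample sentence pairs labeled by whether they appear in document order."""
    examples = []
    for _ in range(n_samples):
        i, j = sorted(rng.sample(range(len(ordered_sentences)), 2))
        a, b = ordered_sentences[i], ordered_sentences[j]
        if rng.random() < 0.5:
            examples.append(((a, b), 1))  # pair kept in the original order
        else:
            examples.append(((b, a), 0))  # pair swapped
    return examples
```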

CA-RNN: Using Context-Aligned Recurrent Neural Networks for Modeling Sentence Similarity

Proceedings of the AAAI Conference on Artificial Intelligence

Recurrent neural networks (RNNs) have shown good performance for sentence similarity modeling in recent years. Most RNNs focus on modeling the hidden states of the current sentence, while the context information from the other sentence is not well exploited during hidden state generation. In this paper, we propose a context-aligned RNN (CA-RNN) model, which incorporates the contextual information of the aligned words in a sentence pair into the inner hidden state generation. Specifically, we first perform word alignment detection to identify the aligned words in the two sentences. Then, we present a context alignment gating mechanism and embed it into our model to automatically absorb the aligned words' context for the hidden state update. Experiments on three benchmark datasets, namely TREC-QA and WikiQA for answer selection and MSRP for paraphrase identification, show the great advantages of our proposed model. In particular, we achieve the new state-of-the-art...
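The exact gating equations are specific to the paper; the NumPy schematic below only illustrates the general idea of interpolating a hidden state with the aligned word's context under a learned gate. All weight names (W_g, U_g, b_g) are illustrative, not the paper's formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def context_aligned_update(h_t, c_aligned, W_g, U_g, b_g):
    """Blend the current hidden state with the aligned word's context vector."""
    g = sigmoid(W_g @ h_t + U_g @ c_aligned + b_g)  # context alignment gate
    return g * c_aligned + (1.0 - g) * h_t
```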

BERT4SO: Neural Sentence Ordering by Fine-tuning BERT

2021

Sentence ordering aims to arrange the sentences of a given text in the correct order. Recent work frames it as a ranking problem and applies deep neural networks to it. In this work, we propose a new method, named BERT4SO, by fine-tuning BERT for sentence ordering. We concatenate all sentences and compute their representations by using multiple special tokens and carefully designed segment (interval) embeddings. The tokens across multiple sentences can attend to each other, which greatly enhances their interactions. We also propose a margin-based listwise ranking loss based on ListMLE to facilitate the optimization process. Experimental results on five benchmark datasets demonstrate the effectiveness of our proposed method.
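For reference, here is a compact NumPy sketch of the plain ListMLE loss that the proposed margin-based variant builds on (the margin term itself is omitted); scores holds the model's scores for the sentences arranged in their gold order.

```python
import numpy as np


def list_mle_loss(scores: np.ndarray) -> float:
    """Negative log-likelihood of the gold order under a Plackett-Luce model."""
    loss = 0.0
    for i in range(len(scores)):
        rest = scores[i:]  # candidates not yet placed
        lse = rest.max() + np.log(np.exp(rest - rest.max()).sum())  # log-sum-exp
        loss += lse - scores[i]  # -log P(pick sentence i next)
    return float(loss)
```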

Searching for grammaticality: Propagating dependencies in the viterbi algorithm

2005

In many text-to-text generation scenarios (for instance, summarisation), we encounter human-authored sentences that could be composed by recycling portions of related sentences to form new sentences. In this paper, we couch the generation of such sentences as a search problem. We investigate a statistical sentence generation method which recombines words to form new sentences. We propose an extension to the Viterbi algorithm designed to improve the grammaticality of generated sentences. Within a statistical framework, the extension favours those partially generated strings with a probable dependency tree structure. Our preliminary evaluations show that our approach generates less fragmented text than a bigram baseline.
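The paper's contribution is the dependency-propagating extension; the sketch below shows only the bigram Viterbi-style baseline search it starts from, with bigram_logp assumed to be a dict mapping (previous word, word) pairs to log-probabilities.

```python
def viterbi_bigram(vocab, length, bigram_logp, start="<s>", end="</s>"):
    """Best word sequence of a fixed length under a bigram language model."""
    # best[w] = (log-probability, words) of the best partial string ending in w
    best = {start: (0.0, [])}
    for _ in range(length):
        nxt = {}
        for prev, (logp, words) in best.items():
            for w in vocab:
                s = logp + bigram_logp.get((prev, w), float("-inf"))
                if w not in nxt or s > nxt[w][0]:
                    nxt[w] = (s, words + [w])
        best = nxt
    # close each candidate with the end-of-sentence transition, keep the best
    logp, words = max(
        (lp + bigram_logp.get((w, end), float("-inf")), ws)
        for w, (lp, ws) in best.items()
    )
    return words, logp
```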

Exploration on Effectiveness and Efficiency of Similar Sentence Matching

Polibits, 2013

Similar sentence matching is an essential issue for many applications, such as text summarization, image extraction, social media retrieval, question answering, and so on. A number of studies have investigated this issue in recent years. Most such techniques focus on effectiveness, while only a few focus on efficiency. In this paper, we address both effectiveness and efficiency in sentence similarity matching. For a given sentence collection, we determine how to effectively and efficiently identify the top-k semantically similar sentences to a query. To achieve this goal, we first study several representative sentence similarity measurement strategies, from which we deliberately choose the optimal ones through cross validation and dynamic weight tuning. The experimental evaluation demonstrates the effectiveness of our strategy. Moreover, on the efficiency side, we introduce several optimization techniques to improve the performance of the similarity computation. The trade-off between effectiveness and efficiency is further explored through extensive experiments.
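As a rough illustration of the weighted-combination strategy described above, here is a minimal top-k matcher; the component similarity measures and their weights are placeholders that would be selected by cross validation.

```python
import heapq


def top_k_similar(query, sentences, measures, weights, k=5):
    """Rank sentences by a weighted combination of similarity measures."""
    def score(s):
        return sum(w * m(query, s) for m, w in zip(measures, weights))
    return heapq.nlargest(k, sentences, key=score)
```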

Application of Neural Networks for Sentence Similarity

This paper presents a sentence similarity measure for matching two sentences using convolutional neural networks. The model also checks the degree of similarity between the matched sentences. A self-learning model is designed to detect the similarity, and the degree of similarity, between two matched sentences, and also to identify dissimilar sentences. Matching two potential sentences is central to many natural language applications. It generalizes the conventional notion of similarity or relevance, since it aims to model the correspondence between "linguistic objects" of different nature at different levels of abstraction. The model applies max-pooling in every two-unit window. The pooling has a twofold effect: it shrinks the size of the representation by half, quickly absorbing differences in sentence length, and it filters out undesirable compositions of words.
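The two-unit max-pooling step described above, sketched in NumPy: each pass halves the length of the representation, which is what quickly absorbs differences in sentence length.

```python
import numpy as np


def maxpool_pairs(x: np.ndarray) -> np.ndarray:
    """Max-pool a (length, dim) representation over non-overlapping pairs."""
    if x.shape[0] % 2:  # pad odd-length inputs with -inf so padding never wins
        x = np.vstack([x, np.full((1, x.shape[1]), -np.inf)])
    return x.reshape(-1, 2, x.shape[1]).max(axis=1)
```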

Improving Sentence Retrieval Using Sequence Similarity

Applied Sciences, 2020

Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or novelty detection. Since it is similar to document retrieval but with a smaller unit of retrieval, methods for document retrieval, like term frequency-inverse document frequency (TF-IDF), BM25, and language modeling-based methods, are also used for sentence retrieval. The effect of partial word matching on sentence retrieval is an issue that has not been analyzed. We think that there is substantial potential for the improvement of sentence retrieval methods if we consider this approach. We adapted TF-ISF, BM25, and language modeling-based methods to test the partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference...
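One way to realize partial term matching is sketched below, using difflib's character-level ratio as a stand-in for the sequence similarity measure used in the paper; the resulting score can then be plugged into a TF-ISF- or BM25-style weighting in place of exact term counts.

```python
from difflib import SequenceMatcher


def partial_match_score(term: str, sentence_words) -> float:
    """Best similarity of `term` to any word in the sentence, in [0, 1]."""
    return max(
        (SequenceMatcher(None, term, w).ratio() for w in sentence_words),
        default=0.0,
    )
```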

A context-aware study for sentence ordering

Telecommunication Systems, 2011

This paper proposes a context-aware method for sentence ordering in the multi-document summarization task, which combines a support vector machine (SVM) and the Grey Model (GM). Multi-document summarization focuses on how to extract the main information of a document set; this paper aims to improve the coherence of the summary based on the context of the document set. First, the method trains the SVM on the sentences of each source document and predicts the sentence sequence of the summary as a primary dataset. Second, the Grey Model processes the primary dataset, and from this analysis we obtain the final sequence of summary sentences. Experiments on 100 summaries show that this method provides much higher precision than a probabilistic model on the sentence ordering task.
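How the SVM and grey-model stages are combined is specific to the paper; for context, here is a standalone NumPy sketch of the classical GM(1,1) grey forecasting model itself.

```python
import numpy as np


def gm11_forecast(x0: np.ndarray, steps: int = 1) -> np.ndarray:
    """Fit GM(1,1) to the series x0 and forecast `steps` future values."""
    x1 = np.cumsum(x0)                           # accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                # background values
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]  # grey coefficients
    n = len(x0)
    k = np.arange(n, n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x1_prev = (x0[0] - b / a) * np.exp(-a * (k - 1)) + b / a
    return x1_hat - x1_prev                      # de-accumulate to forecasts
```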

Analysis of sentence embedding models using prediction tasks in natural language processing

IBM Journal of Research and Development, 2017

The tremendous success of word embeddings in improving the ability of computers to perform natural language tasks has shifted research on language representation from word representation to sentence representation. This shift introduced a plethora of methods for learning vector representations of sentences, many of them based on compositional methods over word embeddings. These vectors are used as features for subsequent machine learning tasks or for pretraining in the context of deep learning. However, not much is known about the properties that are encoded in these sentence representations and about the language information they encapsulate. Recent studies analyze the encoded representations and the kind of information they capture. In this paper, we analyze results from a previous study on the ability of models to encode basic properties such as content, order, and length. Our analysis led to new insights, such as the effect of word frequency or word distance on the ability to encode content and order.
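A sketch of the kind of prediction (probing) task analyzed above: training a simple classifier to recover a surface property, here binned sentence length, from fixed sentence embeddings. The use of scikit-learn's LogisticRegression as the probe is an assumption for illustration, not the study's exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def probe_length(embeddings: np.ndarray, sentences, n_bins: int = 4) -> float:
    """Probe accuracy for recovering binned sentence length from embeddings."""
    lengths = np.array([len(s.split()) for s in sentences])
    bins = np.quantile(lengths, np.linspace(0, 1, n_bins + 1)[1:-1])
    labels = np.digitize(lengths, bins)            # binned length labels
    probe = LogisticRegression(max_iter=1000).fit(embeddings, labels)
    return probe.score(embeddings, labels)         # in-sample probe accuracy
```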