DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, 2016

In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora. We propose several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time. Our work shows that many of our proposed models contribute to further improvement in performance. We also propose a new dataset consisting of multi-sentence summaries, and establish performance benchmarks for further research.
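For readers unfamiliar with the attentional encoder-decoder setup this abstract builds on, the following minimal PyTorch sketch shows the general pattern; the module layout, dimensions, and GRU choice are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnSeq2Seq(nn.Module):
    """Minimal attentional encoder-decoder sketch (illustrative, not the paper's exact model)."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.attn = nn.Linear(hid_dim, 2 * hid_dim)      # scores decoder state against encoder states
        self.out = nn.Linear(hid_dim + 2 * hid_dim, vocab_size)

    def forward(self, src, tgt):
        enc_states, _ = self.encoder(self.embed(src))    # (B, S, 2H)
        h = torch.zeros(src.size(0), self.decoder.hidden_size, device=src.device)
        logits = []
        for t in range(tgt.size(1)):
            # attention: weight each source position by compatibility with the decoder state
            scores = torch.bmm(enc_states, self.attn(h).unsqueeze(2)).squeeze(2)            # (B, S)
            context = torch.bmm(F.softmax(scores, dim=1).unsqueeze(1), enc_states).squeeze(1)
            h = self.decoder(torch.cat([self.embed(tgt[:, t]), context], dim=1), h)
            logits.append(self.out(torch.cat([h, context], dim=1)))
        return torch.stack(logits, dim=1)                # (B, T, vocab)
```

Each decoding step re-weights the encoder states against the current decoder state; extensions such as keyword modeling or hierarchical attention modify how those weights are computed.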

Improving Neural Abstractive Text Summarization with Prior Knowledge (Position Paper)

International Conference of the Italian Association for Artificial Intelligence, 2016

Abstractive text summarization is a complex task whose goal is to generate a concise version of a text without necessarily reusing the sentences from the original source, but still preserving the meaning and the key contents. In this position paper we address this issue by modeling the problem as sequence-to-sequence learning and exploiting Recurrent Neural Networks (RNN). Moreover, we discuss the idea of combining RNNs and probabilistic models in a unified way in order to incorporate prior knowledge, such as linguistic features. We believe that this approach can obtain better performance than the state-of-the-art models for generating well-formed summaries.

Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

2021

Text summarization is an essential task to help readers capture salient information from documents, news, interviews, and meetings. However, most state-of-the-art pretrained language models are unable to efficiently process long text commonly seen in the summarization problem domain. In this paper, we propose Summ^N, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context lengths of typical pretrained LMs. Summ^N first generates the coarse summary in multiple stages and then produces the final fine-grained summary based on them. The framework can process input text of arbitrary length by adjusting the number of stages, while keeping the LM context size fixed. Moreover, it can deal with both documents and dialogues, and can be used on top of any underlying backbone abstractive summarization model. Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods by improving ROUGE scores on three lo...
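The stage-wise, coarse-to-fine idea can be sketched in a few lines; the whitespace chunking, the token budget, and the `summarize` callable below are assumptions standing in for any backbone model, not Summ^N's exact procedure.

```python
def chunk(text, max_tokens):
    """Split text into pieces that fit the backbone's context window (naive whitespace tokenization)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def multi_stage_summarize(text, summarize, max_tokens=1024):
    """Repeatedly summarize chunks and concatenate until the text fits in one context window.
    `summarize` is any text-to-text backbone (hypothetical callable)."""
    while len(text.split()) > max_tokens:
        # coarse stage(s): summarize each chunk independently, then join the partial summaries
        text = " ".join(summarize(piece) for piece in chunk(text, max_tokens))
    # fine stage: one final pass over text that now fits the context window
    return summarize(text)
```

The number of coarse stages grows automatically with input length, which is how arbitrarily long inputs can be handled with a fixed-context model.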

Assessing the Efficacy of LSTM, Transformer, and RNN Architectures in Text Summarization

International Conference on Applied Engineering and Natural Sciences

The need for efficient and effective techniques for automatic text summarization has become increasingly critical with the exponential growth of textual data in different domains. Summarizing long texts into short summaries facilitates a quick understanding of the key information contained in the documents. In this paper, we evaluate various architectures for automatic text summarization using the TEDx dataset, a valuable resource consisting of a large collection of TED talks with rich and informative speech transcripts. Our research focuses on evaluating the performance of Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Recurrent Neural Network (RNN) and Transformer architectures for automatic text summarization. We measure the accuracy of each model by comparing the generated summaries with human-written summaries. The findings show that the Transformer model achieves the highest accuracy, followed closely by the GRU model. However, LSTM and RNN exhibit relatively lower ac...

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Abstractive Sentence Summarization generates a shorter version of a given sentence while attempting to preserve its meaning. We introduce a conditional recurrent neural network (RNN) which generates a summary of an input sentence. The conditioning is provided by a novel convolutional attention-based encoder which ensures that the decoder focuses on the appropriate input words at each step of generation. Our model relies only on learned features and is easy to train in an end-to-end fashion on large data sets. Our experiments show that the model significantly outperforms the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.
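In practice, "easy to train end-to-end" means teacher-forced training against a cross-entropy loss; a hedged sketch follows, reusing the AttnSeq2Seq interface from the earlier sketch (the pad_id and optimizer choice are hypothetical, not the authors' recipe).

```python
import torch.nn.functional as F

def train_step(model, optimizer, src, tgt, pad_id=0):
    """One teacher-forced training step for an encoder-decoder summarizer (illustrative only)."""
    optimizer.zero_grad()
    logits = model(src, tgt[:, :-1])                 # predict each next word of the summary
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tgt[:, 1:].reshape(-1), ignore_index=pad_id)
    loss.backward()
    optimizer.step()
    return loss.item()
```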

VAE-PGN based Abstractive Model in Multi-stage Architecture for Text Summarization

Proceedings of the 12th International Conference on Natural Language Generation, 2019

This paper describes our submission to the TL;DR challenge. Neural abstractive summarization models have been successful in generating fluent and consistent summaries with advancements like the copy (Pointer-generator) and coverage mechanisms. However, these models suffer from their extractive nature as they learn to copy words from the source text. In this paper, we propose a novel abstractive model based on Variational Autoencoder (VAE) to address this issue. We also propose a Unified Summarization Framework for the generation of summaries. Our model eliminates non-critical information at a sentence-level with an extractive summarization module and generates the summary word by word using an abstractive summarization module. To implement our framework, we combine submodules with state-of-the-art techniques including Pointer-Generator Network (PGN) and BERT while also using our new VAE-PGN abstractive model. We evaluate our model on the benchmark Reddit corpus as part of the TL;DR challenge and show that our model outperforms the baseline in ROUGE score while generating diverse summaries.
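Stripped of the VAE and BERT components, the extract-then-abstract framework described here reduces to filtering low-salience sentences before abstractive generation. A rough sketch, where `score_sentence` and `generate` are hypothetical stand-ins for the extractive and abstractive modules:

```python
def summarize_two_stage(document, score_sentence, generate, keep_ratio=0.3):
    """Hypothetical extract-then-abstract pipeline: drop low-salience sentences,
    then let an abstractive model rewrite what remains."""
    sentences = document.split(". ")
    ranked = sorted(sentences, key=score_sentence, reverse=True)
    kept = set(ranked[: max(1, int(len(sentences) * keep_ratio))])
    # preserve the original order of the retained sentences before abstracting
    condensed = ". ".join(s for s in sentences if s in kept)
    return generate(condensed)
```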

SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents

2017

We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art. Our model has the additional advantage of being very interpretable, since it allows visualization of its predictions broken up by abstract features such as information content, salience and novelty. Another novel contribution of our work is abstractive training of our extractive model that can train on human generated reference summaries alone, eliminating the need for sentence-level extractive labels.
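A sketch of how such an interpretable sentence-level scorer can decompose into content, salience, and novelty terms is shown below; the bilinear forms, dimensions, and running-summary update are assumptions for illustration rather than SummaRuNNer's exact equations.

```python
import torch
import torch.nn as nn

class SentenceScorer(nn.Module):
    """Sketch of an interpretable extractive scorer: each sentence's inclusion probability
    decomposes into content, salience, and novelty terms (illustrative assumptions)."""
    def __init__(self, hid_dim=200):
        super().__init__()
        self.content = nn.Linear(hid_dim, 1)               # how informative is the sentence itself
        self.salience = nn.Bilinear(hid_dim, hid_dim, 1)   # agreement with the document representation
        self.novelty = nn.Bilinear(hid_dim, hid_dim, 1)    # redundancy w.r.t. the summary so far
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, sent_vecs, doc_vec):
        """sent_vecs: (num_sentences, hid_dim), doc_vec: (1, hid_dim)"""
        summary_so_far = torch.zeros_like(doc_vec)
        probs = []
        for s in sent_vecs:
            s = s.unsqueeze(0)                              # (1, hid_dim)
            score = (self.content(s) + self.salience(s, doc_vec)
                     - self.novelty(s, torch.tanh(summary_so_far)) + self.bias)
            p = torch.sigmoid(score)
            probs.append(p)
            summary_so_far = summary_so_far + p * s         # probability-weighted running summary
        return torch.cat(probs).squeeze(1)                  # (num_sentences,)
```

Because each term is a separate score, the predictions can be visualized per sentence, which is the interpretability advantage the abstract highlights.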

Transfer Learning for Abstractive Summarization at Controllable Budgets

ArXiv, 2020

Summarizing a document within an allocated budget while maintaining its major concepts is a challenging task. If the budget can take any arbitrary value and is not known beforehand, the task becomes even more difficult. Most of the existing methods for abstractive summarization, including state-of-the-art neural networks, are data-intensive. If the number of available training samples becomes limited, they fail to construct high-quality summaries. We propose MLS, an end-to-end framework to generate abstractive summaries with limited training data at arbitrary compression budgets. MLS employs a pair of supervised sequence-to-sequence networks. The first network, called the MFS-Net, constructs a minimal feasible summary by identifying the key concepts of the input document. The second network, called the Pointer-Magnifier, then generates the final summary from the minimal feasible summary by leveraging an interpretable multi-headed attention model. Experiments on two cross-domain datasets ...

Deep Learning Based Abstractive Text Summarization: Approaches, Datasets, Evaluation Measures, and Challenges

Mathematical Problems in Engineering

In recent years, the volume of textual data has rapidly increased, which has generated a valuable resource for extracting and analysing information. To retrieve useful knowledge within a reasonable time period, this information must be summarised. This paper reviews recent approaches for abstractive text summarisation using deep learning models. In addition, existing datasets for training and validating these approaches are reviewed, and their features and limitations are presented. The Gigaword dataset is commonly employed for single-sentence summary approaches, while the Cable News Network (CNN)/Daily Mail dataset is commonly employed for multisentence summary approaches. Furthermore, the measures that are utilised to evaluate the quality of summarisation are investigated, and Recall-Oriented Understudy for Gisting Evaluation (ROUGE-1), ROUGE-2, and ROUGE-L are determined to be the most commonly applied metrics. The challenges that are encountered during the summarisation process ...
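As a quick reference for the metrics named above, ROUGE-N measures recall over n-gram overlap between a candidate and a reference summary. A simplified illustration (real implementations add stemming, multiple references, and F-measure variants):

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Recall-oriented n-gram overlap (ROUGE-N), simplified for illustration."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())          # clipped n-gram matches
    return overlap / max(sum(ref.values()), 1)    # recall: matches / n-grams in the reference

# Example: rouge_n("the cat sat on the mat", "a cat sat on a mat") == 4/6
```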

A Neural Attention Model for Abstractive Sentence Summarization

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
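Models of this kind generate the summary one word at a time conditioned on the input and are typically decoded with beam search. A generic sketch, where `log_probs_fn` is a hypothetical stand-in for the model's next-token distribution:

```python
def beam_search(log_probs_fn, start_token, end_token, beam_size=4, max_len=20):
    """Generic beam search over a conditional next-token distribution.
    `log_probs_fn(prefix)` returns {token: log_prob}; the interface is an assumption,
    not the paper's decoder."""
    beams = [([start_token], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, lp in log_probs_fn(prefix).items():
                seq = prefix + [token]
                if token == end_token:
                    finished.append((seq, score + lp))   # hypothesis complete
                else:
                    candidates.append((seq, score + lp))
        if not candidates:
            break
        # keep only the highest-scoring partial hypotheses
        beams = sorted(candidates, key=lambda x: x[1], reverse=True)[:beam_size]
    finished.extend(beams)
    return max(finished, key=lambda x: x[1])[0]
```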