Query Focused Abstractive Summarization via Incorporating Query Relevance and Transfer Learning with Transformer Models

Domain Adaptation with Pre-trained Transformers for Query-Focused Abstractive Text Summarization

Computational Linguistics, 2022

The Query-Focused Text Summarization (QFTS) task aims to build systems that generate a summary of one or more text documents based on a given query. A key challenge in addressing this task is the lack of large labeled data for training the summarization model. In this article, we address this challenge by exploring a series of domain adaptation techniques. Given the recent success of pre-trained transformer models in a wide range of natural language processing tasks, we utilize such models to generate abstractive summaries for the QFTS task in both single-document and multi-document scenarios. For domain adaptation, we apply a variety of techniques using pre-trained transformer-based summarization models, including transfer learning, weakly supervised learning, and distant supervision. Extensive experiments on six datasets show that our proposed approach is very effective in generating abstractive summaries for the QFTS task while setting a new state-of-the-art result in several datasets.
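
The following is a minimal sketch of the query-focused setup described above, not the authors' domain-adaptation pipeline: it assumes an off-the-shelf pre-trained seq2seq summarizer (here facebook/bart-large-cnn) and conditions it on the query by simply prepending the query to the document; the model name and the separator token are assumptions.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # Assumed off-the-shelf summarizer; the paper adapts such models further
    # with transfer learning, weak supervision, and distant supervision.
    MODEL_NAME = "facebook/bart-large-cnn"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def query_focused_summary(query: str, document: str, max_len: int = 128) -> str:
        # Condition the generator on the query by concatenating query and document.
        text = f"{query} </s> {document}"
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
        ids = model.generate(**inputs, num_beams=4, max_length=max_len)
        return tokenizer.decode(ids[0], skip_special_tokens=True)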

Transfer Learning for Abstractive Summarization at Controllable Budgets

ArXiv, 2020

Summarizing a document within an allocated budget while maintaining its major concepts is a challenging task. If the budget can take any arbitrary value and is not known beforehand, it becomes even more difficult. Most existing methods for abstractive summarization, including state-of-the-art neural networks, are data-intensive. If the number of available training samples is limited, they fail to construct high-quality summaries. We propose MLS, an end-to-end framework that generates abstractive summaries with limited training data at arbitrary compression budgets. MLS employs a pair of supervised sequence-to-sequence networks. The first network, called MFS-Net, constructs a minimal feasible summary by identifying the key concepts of the input document. The second network, called Pointer-Magnifier, then generates the final summary from the minimal feasible summary by leveraging an interpretable multi-headed attention model. Experiments on two cross-domain datasets ...

Neural Attention Model for Abstractive Text Summarization Using Linguistic Feature Space

IEEE Access

Summarization generates a brief and concise summary that conveys the main idea of the source text. There are two forms of summarization: abstractive and extractive. Extractive summarization chooses important sentences from the text to form a summary, whereas abstractive summarization paraphrases the text in a more human-like way by adding novel words or phrases. For a human annotator, producing a summary of a document is time-consuming and expensive because it requires reading the long document and composing a short summary. An automatic feature-rich model for text summarization is proposed that can reduce this labor and produce a quick summary by using both extractive and abstractive approaches. A feature-rich extractor highlights the important sentences in the text, and linguistic characteristics are used to enhance the results. The extracted summary is then fed to an abstractor that further refines it using features such as named entity tags, part-of-speech tags, and term weights. Furthermore, a loss function is introduced to normalize the inconsistency between word-level and sentence-level attentions. The proposed two-staged network achieved a ROUGE score of 37.76% on the benchmark CNN/DailyMail dataset, outperforming earlier work. Human evaluation is also conducted to measure the comprehensiveness, conciseness, and informativeness of the generated summary.
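
Below is a hedged sketch of the kind of word/sentence attention-consistency loss mentioned above; the exact formulation in the paper may differ. The idea is to reward decoding steps where the strongest word-level attentions fall inside sentences the extractor also scores highly.

    import torch

    def attention_inconsistency_loss(word_attn, sent_attn, word_to_sent, k=3):
        # word_attn:    (T,) word-level attention weights at one decoding step
        # sent_attn:    (S,) sentence-level importance scores from the extractor
        # word_to_sent: (T,) long tensor mapping each word to its sentence index
        combined = word_attn * sent_attn[word_to_sent]         # agreement per word
        top_k = torch.topk(combined, k=min(k, combined.numel())).values
        return -torch.log(top_k.mean() + 1e-10)                # low when attentions agree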

SETA- Extractive to Abstractive Summarization with a Similarity-Based Attentional Encoder-Decoder Model

ECTI Transactions on Computer and Information Technology, 2024

Summarizing the information provided within tables of scientific documents has always been a problem. A system that can summarize the vital information a table encapsulates can give readers a quick and straightforward way to comprehend the contents of the document. To train such systems, we need data, and finding a quality dataset is tricky. To mitigate this challenge, we developed a high-quality corpus that contains both extractive and abstractive summaries derived from tables, using a rule-based approach. This dataset was validated using a combination of automated and manual metrics. Subsequently, we developed a novel Encoder-Decoder framework, along with attention, to generate abstractive summaries from extractive ones. This model works on a mix of extractive summaries and inter-sentential similarity embeddings and learns to map them to corresponding abstractive summaries. On experimentation, we discovered that our model addresses the saliency factor of summarization, an aspect overlooked by previous works. Further experiments show that our model develops coherent abstractive summaries, validated by high BLEU and ROUGE scores.

Deep Learning Based Abstractive Text Summarization: Approaches, Datasets, Evaluation Measures, and Challenges

Mathematical Problems in Engineering

In recent years, the volume of textual data has rapidly increased, which has generated a valuable resource for extracting and analysing information. To retrieve useful knowledge within a reasonable time period, this information must be summarised. This paper reviews recent approaches for abstractive text summarisation using deep learning models. In addition, existing datasets for training and validating these approaches are reviewed, and their features and limitations are presented. The Gigaword dataset is commonly employed for single-sentence summary approaches, while the Cable News Network (CNN)/Daily Mail dataset is commonly employed for multisentence summary approaches. Furthermore, the measures that are utilised to evaluate the quality of summarisation are investigated, and Recall-Oriented Understudy for Gisting Evaluation 1 (ROUGE1), ROUGE2, and ROUGE-L are determined to be the most commonly applied metrics. The challenges that are encountered during the summarisation process ...
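
Since ROUGE-1, ROUGE-2, and ROUGE-L are singled out as the most commonly applied metrics, here is a small example of computing them with the rouge-score package; the reference and candidate strings are placeholders.

    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    reference = "the cat sat on the mat"        # gold summary (placeholder)
    candidate = "a cat was sitting on the mat"  # system summary (placeholder)

    for name, score in scorer.score(reference, candidate).items():
        print(f"{name}: P={score.precision:.3f} R={score.recall:.3f} F={score.fmeasure:.3f}")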

DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

ArXiv, 2021

Transformer-based models have achieved state-of-the-art performance on short text summarization. However, they still struggle with long-input summarization. In this paper, we present a new approach for long-input summarization: Dynamic Latent Extraction for Abstractive Summarization. We jointly train an extractor with an abstractor and treat the extracted text snippets as the latent variable. We propose extractive oracles to provide the extractor with a strong learning signal. We introduce a consistency loss, which encourages the extractor to approximate the averaged dynamic weights predicted by the generator. We conduct extensive tests on two long-input summarization datasets, GovReport (document) and QMSum (dialogue). Our model significantly outperforms the current state-of-the-art, including a 6.21 ROUGE-2 improvement on GovReport and a 2.13 ROUGE-1 improvement on QMSum. Further analysis shows that the dynamic weights make our generation process highly interpretable. Our code will b...
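
A minimal sketch of a consistency loss in the spirit described above, assuming the generator's dynamic weights over the K extracted snippets are available at each decoding step; this is an illustration, not the released DYLE code.

    import torch
    import torch.nn.functional as F

    def consistency_loss(extractor_scores, generator_weights):
        # extractor_scores:  (K,) unnormalized extractor scores over K snippets
        # generator_weights: (T, K) dynamic snippet weights at each decoding step
        target = generator_weights.mean(dim=0)           # average over decoding steps
        log_pred = F.log_softmax(extractor_scores, dim=-1)
        # Encourage the extractor's distribution to match the averaged weights.
        return F.kl_div(log_pred, target, reduction="sum")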

Scaling Up Query-Focused Summarization to Meet Open-Domain Question Answering

arXiv (Cornell University), 2021

Query-focused summarization (QFS) requires generating a textual summary for a given query from a set of relevant documents. In practice, however, such relevant documents are not readily available and must first be retrieved from a document collection. We therefore show how to extend this task to a more realistic setting, in which it resembles open-domain question answering, with the answer being a summary of the top-retrieved documents. To address this extended task, we combine passage retrieval with text generation to produce the summary of the retrieved passages given the input query. We present the first evaluation results on the proposed task and show that a few samples are sufficient to fine-tune a large generative model with retrieved passages.
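
A hedged sketch of the retrieve-then-generate setup: rank passages by dense similarity to the query and keep the top hits for the generator. The retriever model name is an assumption; the paper's actual retriever and generator may differ.

    from sentence_transformers import SentenceTransformer, util

    retriever = SentenceTransformer("all-MiniLM-L6-v2")  # assumed dense retriever

    def retrieve_passages(query, passages, top_k=5):
        q_emb = retriever.encode(query, convert_to_tensor=True)
        p_emb = retriever.encode(passages, convert_to_tensor=True)
        hits = util.semantic_search(q_emb, p_emb, top_k=top_k)[0]
        return [passages[h["corpus_id"]] for h in hits]

    # The retrieved passages, concatenated with the query, are then passed to a
    # generative summarizer (e.g., a fine-tuned seq2seq model as sketched earlier).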

Generating Topic-Oriented Summaries Using Neural Attention

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Summarizing a document requires identifying its important parts with the objective of giving the reader a quick overview. However, a long article can span several topics, and a single summary cannot do justice to all of them. Further, the interests of readers can vary, and the notion of importance can change across them. Existing summarization algorithms generate a single summary and are not capable of generating multiple summaries tuned to the interests of the readers. In this paper, we propose an attention-based RNN framework to generate multiple summaries of a single document, each tuned to a different topic of interest. Our method outperforms existing baselines, and our results suggest that the attention of generative networks can be successfully biased to look at sentences relevant to a topic and effectively used to generate topic-tuned summaries.
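
One way to read "biasing the attention toward a topic" is to add a topic-relevance term to the attention logits before the softmax; the sketch below illustrates that idea only and is not the paper's architecture.

    import torch

    def topic_biased_attention(attn_logits, topic_relevance, bias_weight=1.0):
        # attn_logits:     (T,) raw attention scores over source positions
        # topic_relevance: (T,) per-position relevance to the chosen topic, e.g.
        #                  cosine similarity between each sentence and a topic vector
        return torch.softmax(attn_logits + bias_weight * topic_relevance, dim=-1)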

Neural sentence fusion for diversity driven abstractive multi-document summarization

Computer Speech & Language, 2019

The lack of multi-document models and the inaccuracy of representing multiple long documents in a fixed-size vector inspired us to tackle abstractive multi-document summarization. There is also a lack of good human-authored multi-document datasets for training encoder-decoder models. To overcome this, we designed complementary models for two different tasks: sentence clustering and neural sentence fusion. In this work, we minimize the risk of producing incorrect facts by encoding a related set of sentences as the input to the encoder. We applied our complementary models to build a full abstractive multi-document summarization system that simultaneously considers importance, coverage, and diversity under a desired length limit. We conduct extensive experiments for all the proposed models, which bring significant improvements over state-of-the-art methods across different evaluation metrics.
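
A hedged sketch of the sentence-clustering stage only (the neural fusion model is a separate network): encode sentences from all documents and group related ones so each cluster can later be fused into a single summary sentence. The encoder and cluster count are assumptions.

    from sklearn.cluster import KMeans
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed sentence encoder

    def cluster_sentences(sentences, n_clusters=5):
        embeddings = encoder.encode(sentences)
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)
        clusters = {}
        for sentence, label in zip(sentences, labels):
            clusters.setdefault(int(label), []).append(sentence)
        return list(clusters.values())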

Summarizing ETDs with deep learning

2019

Inspired by the millions of Electronic Theses and Dissertations (ETDs) openly available online, we describe a novel use of ETDs as data for text summarization. We use a large corpus of ETDs to evaluate techniques for generating abstractive summaries with deep learning. Using an extensive ETD collection of over 30,000 doctoral dissertations and master's theses, we examine the quality of state-of-the-art deep learning summarization technologies when applied to an ETD corpus. Deep learning requires a large set of training data to produce satisfactory results. Finding suitable training data is especially difficult due to the widespread use of domain-specific jargon in ETDs, coupled with the wide-ranging breadth of subject matter contained in an ETD corpus. To overcome this significant limitation, we demonstrate the potential of transfer learning on automatic summarization of ETD chapters. We apply several combinations of deep learning models and training data to the ETD chapter summ...