XAI-Based Reinforcement Learning Approach for Text Summarization of Social IoT-Based Content
Related papers
Recent Approaches for Text Summarization Using Machine Learning & LSTM
Journal of Big Data, 2021
Data is growing rapidly in every domain, including social media, news, education, and banking, and most of it takes the form of text. Much of this text contains only a little valuable information buried in large amounts of unwanted content. To extract this valuable information from huge text documents, we need a summarizer that can process a document automatically and condense it, particularly the textual content of a new document, without losing any vital information. Summarization can be extractive or abstractive. Extractive summarization selects high-ranking sentences, scored using sentence- and word-level features, and concatenates them to produce a summary. Abstractive summarization is based on understanding the key ideas in the given text and then expressing those ideas in natural language. Abstractive summarization is an active problem area for natural language processing (NLP), machine learning (ML), and neural networks (NN). In this paper, the foremost techniques for automatic text summarization are described, existing methods are reviewed, and their effectiveness and limitations are discussed. A novel approach based on neural networks and LSTM is then presented; in this machine learning approach, the underlying architecture is called the encoder-decoder.
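As a concrete illustration of the extractive approach described above, the following is a minimal sketch (not the paper's method): sentences are scored by normalized word frequency and the top-ranked ones are kept in their original order.

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=2):
    """Rank sentences by the frequency of their words and keep the top n."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))

    # Score each sentence as the sum of its word frequencies,
    # normalized by length so long sentences are not favored.
    def score(s):
        tokens = re.findall(r'\w+', s.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Preserve the original order of the selected sentences.
    return ' '.join(s for s in sentences if s in ranked)
```

Real systems replace the frequency score with richer sentence and word features, but the select-and-concatenate structure is the same.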
Summarizing Online Conversations: A Machine Learning Approach
Summarization has emerged as an increasingly useful approach to tackling the problem of information overload. Extracting information from online conversations can have considerable commercial and educational value. However, the majority of this information exists as noisy, unstructured text, making traditional document summarization techniques difficult to apply. In this paper, we propose a novel approach to the problem of conversation summarization. We develop an automatic text summarizer that extracts sentences from the conversation to form a summary. Our approach consists of three phases. In the first phase, we prepare the dataset by correcting spellings and segmenting the text. In the second phase, we represent each sentence by a set of predefined features that capture the statistical, linguistic, and sentimental aspects of the conversation along with its dialogue structure. Finally, in the third phase, we use a machine learning algorithm to train the summarizer on the set of feature vectors. Experiments on conversations from the technical domain show that our system significantly outperforms the baselines on ROUGE F-scores.
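The second phase above, representing each sentence by a feature vector, might look like the following sketch; the specific features (length, relative position, average term frequency, a question cue) are illustrative stand-ins, not the paper's exact feature set.

```python
import re
from collections import Counter

def sentence_features(sentence, position, total, freq):
    """Map one sentence to a small feature vector capturing statistical
    and dialogue-structure cues (illustrative, not the paper's features)."""
    tokens = re.findall(r'\w+', sentence.lower())
    return {
        "length": len(tokens),
        "rel_position": position / max(total - 1, 1),
        "avg_term_freq": sum(freq[t] for t in tokens) / (len(tokens) or 1),
        "is_question": int(sentence.rstrip().endswith("?")),
    }

def featurize_conversation(sentences):
    """Build the feature vectors a classifier would be trained on."""
    freq = Counter(t for s in sentences for t in re.findall(r'\w+', s.lower()))
    return [sentence_features(s, i, len(sentences), freq)
            for i, s in enumerate(sentences)]
```

A supervised learner (e.g. logistic regression or an SVM) would then be trained on these vectors against binary in-summary labels.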
2019
This master's thesis opens with a description of several text summarization methods based on machine learning approaches inspired by reinforcement learning. While Maximum Likelihood Estimation (MLE) approaches often work well for text summarization, they tend to suffer from poor generalization. We show that techniques which expose the model to more opportunities to learn from data tend to generalize better and generate summaries with less lead bias. Our experiments show that, out of the box, these new models do not perform significantly better than MLE when evaluated with ROUGE, but they do possess interesting properties that could be used to assemble more sophisticated and better-performing summarization systems. The main theme of the thesis is getting machine learning models to generalize better using ideas from reinforcement learning. We develop a new labeling scheme inspired by Reward Augmented Maximum Likelihood (RAML) methods developed originally for the machine trans...
Abstractive Text Summarization using Deep Learning
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2023
The number of text records has increased dramatically in recent years, and social media platforms, including websites and mobile apps, generate huge amounts of unstructured data such as blogs, discussion-forum posts, technical guides, and more. This data, which reflects human behavior and intuitive thinking, contains many records that are difficult to manage because of their volume and variety. At the same time, the demand for summarizing textual content keeps increasing. Text summarization analyzes unstructured text and converts it into meaningful, condensed information for evaluation, producing the necessary amount of useful content. This study describes a deep learning method for effectively summarizing textual content, giving the reader a condensed and focused version of the original text.
Reinforced Abstractive Text Summarization With Semantic Added Reward
IEEE Access, 2021
Text summarization is an important task in natural language processing (NLP). Neural summarization models summarize information by understanding and rewriting documents through the encoder-decoder structure. Recent studies have used reinforcement learning (RL)-based training to overcome the bias that cross-entropy-based learning can introduce and the problem of not optimizing directly for the evaluation metric. However, the ROUGE metric, based only on n-gram matching, is not a perfect reward. The purpose of this study is to improve summary quality by proposing new reward functions for RL-based text summarization. We propose ROUGE-SIM and ROUGE-WMD, modified versions of the ROUGE function. ROUGE-SIM credits semantically similar words, in contrast to ROUGE-L; ROUGE-WMD adds semantic similarity to ROUGE-L, where the semantic similarity between articles and summary text is computed using the Word Mover's Distance (WMD) methodology. Models trained with the two proposed reward functions achieved higher ROUGE-1, ROUGE-2, and ROUGE-L scores than a model using plain ROUGE-L as the reward. Our two models, ROUGE-SIM and ROUGE-WMD, scored 0.418 and 0.406 on ROUGE-L, respectively, for the Gigaword dataset. The two reward functions also outperformed ROUGE-L in terms of abstractiveness and grammaticality.
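To make the reward-shaping idea concrete, here is a hedged sketch of an LCS-based ROUGE-L F-score blended with a pluggable semantic-similarity term; the mixing weight `alpha` and the `sim` callback (which in a ROUGE-WMD-style setup would be derived from Word Mover's Distance over word embeddings) are assumptions for illustration, not the paper's exact formulation.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l_f(candidate, reference, beta=1.2):
    """ROUGE-L F-score over whitespace tokens."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    p, rec = lcs / len(c), lcs / len(r)
    return (1 + beta**2) * p * rec / (rec + beta**2 * p)

def semantic_reward(candidate, reference, sim, alpha=0.5):
    """Blend ROUGE-L with a semantic-similarity score, in the spirit of the
    semantic-added rewards above; `sim` is a stand-in for e.g. a WMD-based score."""
    return alpha * rouge_l_f(candidate, reference) + (1 - alpha) * sim(candidate, reference)
```

Because the blended score still lies in [0, 1], it can be dropped into an RL training loop wherever plain ROUGE-L was used as the reward.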
Short Updates- Machine Learning Based News Summarizer
Journal of Advanced College of Engineering and Management
Automated Text Summarization is becoming important due to the vast amount of data being generated. Manual processing of documents is tedious, mostly due to the absence of standards. Therefore, there is a need for a mechanism to reduce text size, structure it, and make it readable for users. Natural Language Processing (NLP) is critical for analyzing large amounts of unstructured, text-heavy data. This project aims to address concerns with extractive and abstractive text summarization by introducing a new neural network model that deals with repetitive and incoherent phrases in longer documents. The model incorporates a novel Seq2Seq architecture that enhances the standard attentional model with an intra-attention mechanism. Additionally, a new training method that combines supervised word prediction and reinforcement learning is employed. The model utilizes a hybrid pointer-generator network, which distinguishes it from the standard encoder-decoder model. This approach produces high...
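The mixed training objective mentioned above, combining supervised word prediction with a self-critical policy-gradient term, can be sketched as follows; the scalar form and the mixing weight `gamma` are illustrative assumptions rather than the project's exact implementation.

```python
import math

def mixed_loss(log_probs, sampled_reward, baseline_reward, gamma=0.9):
    """Blend the RL (self-critical policy-gradient) loss with the
    maximum-likelihood loss: L = gamma * L_rl + (1 - gamma) * L_ml.
    log_probs: per-token log-probabilities of the sampled sequence;
    baseline_reward: e.g. the ROUGE score of the greedy decode."""
    l_ml = -sum(log_probs)                        # negative log-likelihood
    advantage = sampled_reward - baseline_reward  # sample vs. greedy baseline
    l_rl = -advantage * sum(log_probs)            # REINFORCE with baseline
    return gamma * l_rl + (1 - gamma) * l_ml
```

Minimizing this loss raises the likelihood of sampled summaries whose reward beats the greedy baseline, while the MLE term keeps the output fluent.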
Unsupervised Sentence Enhancement for Automatic Summarization
We present sentence enhancement as a novel technique for text-to-text generation in abstractive summarization. Compared to extraction or previous approaches to sentence fusion, sentence enhancement increases the range of possible summary sentences by allowing the combination of dependency subtrees from any sentence from the source text. Our experiments indicate that our approach yields summary sentences that are competitive with a sentence fusion baseline in terms of content quality, but better in terms of grammaticality, and that the benefit of sentence enhancement relies crucially on an event coreference resolution algorithm using distributional semantics. We also consider how text-to-text generation approaches to summarization can be extended beyond the source text by examining how human summary writers incorporate source-text-external elements into their summary sentences.
Abstractive Text Summarization Based on Deep Learning and Semantic Content Generalization
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019
This work proposes a novel framework for enhancing abstractive text summarization based on the combination of deep learning techniques along with semantic data transformations. Initially, a theoretical model for semantic-based text generalization is introduced and used in conjunction with a deep encoder-decoder architecture in order to produce a summary in generalized form. Subsequently, a methodology is proposed which transforms the aforementioned generalized summary into human-readable form, retaining at the same time important informational aspects of the original text and addressing the problem of out-of-vocabulary or rare words. The overall approach is evaluated on two popular datasets with encouraging results.
A comprehensive summary informativeness evaluation for RST-based summarization methods
2009
Motivated by governmental, commercial, and academic interests, the field of automatic text summarization has seen a growing number of studies and products, leading to a large number of summarization methods. In this paper, we present a comprehensive comparative evaluation of the main automatic text summarization methods based on Rhetorical Structure Theory (RST), which are claimed to be among the best. Additionally, we test machine learning techniques trained on RST features. We also compare our results to superficial summarizers, which belong to a paradigm with severe limitations, and to hybrid methods combining RST and superficial approaches. Our results show that all RST methods have similar overall performance and that they outperform the superficial methods. In terms of precision, the method we propose is the best, while it competes with the others on coverage. Machine learning techniques achieved high accuracy in the classification of text segments worth of being in t...
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
2017
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art. Our model has the additional advantage of being very interpretable, since it allows visualization of its predictions broken up by abstract features such as information content, salience and novelty. Another novel contribution of our work is abstractive training of our extractive model that can train on human generated reference summaries alone, eliminating the need for sentence-level extractive labels.
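The interpretable decomposition described above can be sketched with scalar stand-ins for SummaRuNNer's learned vector parameters; the weights and representations here are illustrative assumptions, not the paper's trained model.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def select_prob(h_j, d, s_j, w_c=1.0, w_s=1.0, w_n=1.0, b=0.0):
    """Probability that sentence j is selected, combining interpretable terms:
    information content, salience w.r.t. the document representation d, and a
    redundancy penalty w.r.t. the summary-so-far representation s_j.
    Scalars stand in for the paper's learned vector/bilinear parameters."""
    content = w_c * h_j                    # how informative the sentence is
    salience = w_s * h_j * d               # agreement with the whole document
    redundancy = w_n * h_j * math.tanh(s_j)  # overlap with the running summary
    return sigmoid(content + salience - redundancy + b)
```

Because each term is exposed separately, the score can be visualized per sentence, which is the interpretability property the abstract highlights.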