Interpreting Text Classifiers by Learning Context-sensitive Influence of Words
Related papers
On the Lack of Robust Interpretability of Neural Text Classifiers
2021
With the ever-increasing complexity of neural language models, practitioners have turned to methods for understanding the predictions of these models. One of the most well-adopted approaches for model interpretability is feature-based interpretability, i.e., ranking the features in terms of their impact on model predictions. Several prior studies have focused on assessing the fidelity of feature-based interpretability methods, i.e., measuring the impact of dropping the top-ranked features on the model output. However, relatively little work has been conducted on quantifying the robustness of interpretations. In this work, we assess the robustness of interpretations of neural text classifiers, specifically, those based on pretrained Transformer encoders, using two randomization tests. The first compares the interpretations of two models that are identical except for their initializations. The second measures whether the interpretations differ between a model with trained parameters an...
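A minimal sketch of the first randomization test, assuming hypothetical per-token attribution scores are already available for two identically trained models that differ only in their random seed: the agreement between the two rankings can be summarized with a rank correlation.

```python
# Sketch of an interpretation-robustness check via rank correlation.
# Assumes `attr_a` and `attr_b` are hypothetical per-token attribution scores
# for the same input, produced by two models that differ only in initialization.
from scipy.stats import spearmanr

def interpretation_agreement(attr_a, attr_b):
    """Spearman correlation between two attribution rankings (1.0 = identical order)."""
    rho, _ = spearmanr(attr_a, attr_b)
    return rho

attr_a = [0.42, 0.05, 0.31, 0.88, 0.10]   # model A: token importances
attr_b = [0.40, 0.12, 0.25, 0.70, 0.02]   # model B: same tokens, different seed
print(interpretation_agreement(attr_a, attr_b))
```

A low correlation between the two rankings would indicate that the interpretations are not robust to the choice of initialization, which is the concern the paper raises.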
Hierarchical Interpretation of Neural Text Classification
2022
Recent years have witnessed increasing interest in developing interpretable models in Natural Language Processing (NLP). Most existing models aim at identifying input features such as words or phrases important for model predictions. Neural models developed in NLP, however, often compose word semantics in a hierarchical manner. Interpretation based on words or phrases alone thus cannot faithfully explain model decisions. This paper proposes a novel Hierarchical INTerpretable neural text classifier, called Hint, which can automatically generate explanations of model predictions in the form of label-associated topics in a hierarchical manner. Model interpretation is no longer at the word level, but built on topics as the basic semantic unit. Experimental results on both review datasets and news datasets show that our proposed approach achieves text classification results on par with existing state-of-the-art text classifiers, and generates interpretations more faithful to model predictions an...
Explaining Sentiment Classification
Interspeech 2019
This paper presents a novel 1-D sentiment classifier trained on the benchmark IMDB dataset. The classifier is a 1-D convolutional neural network with repeated convolution and max pooling layers. The main contribution of this work is the demonstration of a deconvolution technique for 1-D convolutional neural networks that is agnostic to specific architecture types. This deconvolution technique enables text classification to be explained, a feature that is important for NLP-based decision support systems, as well as being an invaluable diagnostic tool.
Interpretation of Sentiment Analysis with Human-in-the-Loop
Human-in-the-Loop has been receiving special attention from the data science and machine learning community. It is essential to realize the advantages of human feedback and the pressing need for manual annotation to improve machine learning performance. Recent advancements in natural language processing (NLP) and machine learning have created unique challenges and opportunities for digital humanities research. In particular, there are ample opportunities for NLP and machine learning researchers to analyze data from literary texts and use these complex source texts to broaden our understanding of human sentiment using the human-in-the-loop approach. This paper presents our understanding of how human annotators differ from machine annotators in sentiment analysis tasks and how these differences can contribute to designing systems for the "human in the loop" sentiment analysis in complex, unstructured texts. We further explore the challenges and benefits of the human-machine collaboration for sentiment analysis using a case study in Greek tragedy and address some open questions about collaborative annotation for sentiments in literary texts. We focus primarily on (i) an analysis of the challenges in sentiment analysis tasks for humans and machines, and (ii) whether consistent annotation results are generated from multiple human annotators and multiple machine annotators. For human annotators, we have used a survey-based approach with about 60 college students. We have selected six popular sentiment analysis tools for machine annotators, including VADER, CoreNLP's sentiment annotator, TextBlob, LIME, Glove+LSTM, and RoBERTa. We have conducted a qualitative and quantitative evaluation with the human-in-the-loop approach and confirmed our observations on sentiment tasks using the Greek tragedy case study.
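As a hedged illustration of how two of the listed machine annotators can disagree on the same passage, the snippet below scores a sentence with VADER (via NLTK) and TextBlob; it assumes both packages are installed and the `vader_lexicon` resource has been downloaded, and the sample sentence is illustrative rather than drawn from the paper's corpus.

```python
# Compare two off-the-shelf sentiment annotators on the same text.
# Assumes: pip install nltk textblob, and nltk.download('vader_lexicon') has been run.
from nltk.sentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

text = "O light! This is the last time I shall gaze on thee."

vader = SentimentIntensityAnalyzer()
vader_score = vader.polarity_scores(text)["compound"]   # compound score in [-1, 1]
textblob_score = TextBlob(text).sentiment.polarity      # polarity in [-1, 1]

print(f"VADER:    {vader_score:+.3f}")
print(f"TextBlob: {textblob_score:+.3f}")
```

Disagreements of this kind between machine annotators, and between machines and human annotators, are exactly what the human-in-the-loop evaluation in the paper examines.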
ShufText: A Simple Black Box Approach to Evaluate the Fragility of Text Classification Models
2021
Text classification is the most basic natural language processing task. It has a wide range of applications ranging from sentiment analysis to topic classification. Recently, deep learning approaches based on CNN, LSTM, and Transformers have been the de facto approach for text classification. In this work, we highlight a common issue associated with these approaches. We show that these systems are over-reliant on the important words present in the text that are useful for classification. With limited training data and a discriminative training strategy, these approaches tend to ignore the semantic meaning of the sentence and rather just focus on keywords or important n-grams. We propose a simple black-box technique, ShufText, to expose the shortcomings of the model and identify its over-reliance on keywords. This involves randomly shuffling the words in a sentence and evaluating the classification accuracy. We see that on common text classification datasets there is very l...
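The shuffling test itself is straightforward to reproduce; below is a minimal sketch, assuming a generic `classifier.predict` that maps a list of strings to labels (the classifier interface is an assumption, not part of the paper).

```python
import random

def shuffled(sentence, seed=None):
    """Return the sentence with its words in random order (the ShufText perturbation)."""
    words = sentence.split()
    rng = random.Random(seed)
    rng.shuffle(words)
    return " ".join(words)

def shuffle_accuracy(classifier, sentences, labels, seed=0):
    """Fraction of examples still classified correctly after word shuffling."""
    perturbed = [shuffled(s, seed) for s in sentences]
    preds = classifier.predict(perturbed)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# A small gap between accuracy on the original sentences and shuffle_accuracy(...)
# suggests the model relies on keywords rather than sentence structure.
```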
IEEE Access, 2019
Sentiment Analysis (SA) is focused on mining opinion (identification and classification) from unstructured text data such as product reviews or microblogs. It is widely used for brand reviews, political campaigns, marketing analysis, or gaining feedback from customers. One of the prominent approaches for SA is supervised machine learning (SML), which learns from datasets with defined class labels in a training set. While the results are promising, especially for in-domain sentiments, there is no guarantee the model provides the same performance on real-time data due to the diversity of new data. In addition, previous studies suggest that SML results decrease when applied to cross-domain datasets because new features appear in different domains. So far, studies in SA emphasise improving the sentiment result, whereas there is little discussion of how to detect degradation in the performance of the proposed model. Therefore, we provide a method known as Contextual Analysis (CA), a mechanism that builds relationships between words and sources in a tree structure identified as the Hierarchical Knowledge Tree (HKT). The Tree Similarity Index (TSI) and Tree Differences Index (TDI), formulas derived from the tree structure, are then proposed to measure similarity as well as changes between the training and actual datasets. Regression analysis of the datasets reveals a highly significant positive relationship between TSI and SML accuracy. The resulting prediction model indicated an estimation error within 2.75 to 3.94 and 2.30 to 3.51 for average absolute differences. Moreover, this method can also cluster sentiment words into positive and negative without using any linguistic resources, while capturing changes in sentiment words when a new dataset is applied. INDEX TERMS: Text analytics, sentiment analysis, contextual analysis, supervised machine learning.
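The TSI and TDI formulas are defined over the Hierarchical Knowledge Tree and are not reproduced in the abstract; as a purely illustrative stand-in for the general idea of measuring train/new-data drift, the sketch below uses a simple vocabulary-overlap (Jaccard) score rather than the paper's tree-based indices. Function and variable names are hypothetical.

```python
# Illustrative only: a crude vocabulary-overlap score between a training corpus
# and newly arriving data, standing in for the paper's tree-based TSI/TDI.
def vocab_overlap(train_texts, new_texts):
    train_vocab = {w.lower() for t in train_texts for w in t.split()}
    new_vocab = {w.lower() for t in new_texts for w in t.split()}
    return len(train_vocab & new_vocab) / len(train_vocab | new_vocab)

# A falling overlap score as new reviews arrive would be one (rough) signal
# that the supervised model's performance may be degrading on the new domain.
```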
Using "Annotator Rationales" to Improve Machine Learning for Text Categorization
2007
We propose a new framework for supervised machine learning. Our goal is to learn from smaller amounts of supervised training data, by collecting a richer kind of training data: annotations with "rationales." When annotating an example, the human teacher will also highlight evidence supporting this annotation, thereby teaching the machine learner why the example belongs to the category. We provide some rationale-annotated data and present a learning method that exploits the rationales during training to boost performance significantly on a sample task, namely sentiment classification of movie reviews. We hypothesize that in some situations, providing rationales is a more fruitful use of an annotator's time than annotating more examples.
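One way to exploit such rationales during training, in the spirit of this framework, is to build "contrast" examples by deleting the highlighted evidence and asking the learner to be less confident on them than on the originals; the sketch below only shows the contrast-construction step, and the data format and masking choice are assumptions rather than the paper's exact procedure.

```python
# Sketch: turn a rationale-annotated example into a weaker "contrast" example.
# Each example is (text, label, rationale_spans), where rationale_spans are
# (start, end) character offsets highlighted by the annotator (assumed format).
def make_contrast(text, rationale_spans):
    """Remove the rationale evidence, leaving a contrast version of the text."""
    kept, prev = [], 0
    for start, end in sorted(rationale_spans):
        kept.append(text[prev:start])
        prev = end
    kept.append(text[prev:])
    return "".join(kept)

text = "the acting was superb and the plot gripping"
spans = [(15, 21), (35, 43)]          # character spans of "superb" and "gripping"
print(make_contrast(text, spans))     # original text with the rationale words removed
```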
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
The state-of-the-art natural language processing models have raised the bar for excellent performance on a variety of tasks in recent years. However, concerns are rising over their primitive sensitivity to distribution biases that reside in the training and testing data. This issue hugely impacts the performance of the models when exposed to out-of-distribution and counterfactual data. The root cause seems to be that many machine learning models are prone to learn the shortcuts, modelling simple correlations rather than more fundamental and general relationships. As a result, such text classifiers tend to perform poorly when a human makes minor modifications to the data, which raises questions regarding their robustness. In this paper, we employ a rule-based architecture called Tsetlin Machine (TM) that learns both simple and complex correlations by ANDing features and their negations. As such, it generates explainable AND-rules using negated and non-negated reasoning. Here, we expl...
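To make the rule format concrete, here is a hedged sketch of how a single explainable AND-rule with negated literals could be evaluated over bag-of-words features; the clause shown is illustrative and not one learned by the paper's Tsetlin Machine.

```python
# Evaluate an illustrative AND-rule with negation over bag-of-words features.
# A clause fires only if all positive literals are present and all negated
# literals are absent; the example clause below is hypothetical.
def clause_fires(tokens, positive_literals, negated_literals):
    present = set(tokens)
    return all(w in present for w in positive_literals) and \
           all(w not in present for w in negated_literals)

tokens = "the film was not good at all".split()
# e.g. a rule voting for the negative class: "not" AND "good" AND NOT "great"
print(clause_fires(tokens, positive_literals={"not", "good"}, negated_literals={"great"}))
```

Because each clause is a readable conjunction of words and negated words, the votes it casts can be inspected directly, which is the basis of the interpretability claim.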