Maryam Saleki | Fordham University

Papers by Maryam Saleki

Vio-Lens: A Novel Dataset of Annotated Social Network Posts Leading to Different Forms of Communal Violence and its Evaluation

This paper presents a computational approach for creating a dataset on communal violence in the context of Bangladesh and West Bengal, India, together with a benchmark evaluation. In recent years, social media has been used as a weapon by factions of different religions and backgrounds to incite hatred, resulting in physical communal violence and causing death and destruction. To prevent such abusive use of online platforms, we propose a framework for classifying online posts using an adaptive question-based approach. We collected more than 168,000 YouTube comments from a set of manually selected videos known for inciting violence in Bangladesh and West Bengal. Applying first unsupervised and later semi-supervised topic modeling methods to these unstructured data, we discovered the major word clusters and interpreted the related topics of peace and violence. The topic words were then used to select 20,142 posts related to peace and violence, of which we annotated a total of 6,046. Finally, we applied different modeling techniques based on linguistic features and sentence transformers to benchmark the labeled dataset, with the best-performing model reaching a macro F1 score of approximately 71%.
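
As a rough illustration of the benchmarking step described above, the sketch below embeds annotated posts with a sentence transformer and trains a linear classifier, reporting the macro F1 metric used in the paper. The model checkpoint, label scheme, and placeholder posts are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sentence-transformer baseline for a violence/peace dataset.
# Replace the placeholder posts and labels with the annotated Vio-Lens data.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

posts = [
    "placeholder comment 1", "placeholder comment 2", "placeholder comment 3",
    "placeholder comment 4", "placeholder comment 5", "placeholder comment 6",
]
labels = [0, 1, 0, 1, 0, 1]  # assumed binary scheme: 0 = non-violence, 1 = violence

# Assumed multilingual encoder (the comments are largely Bengali); any sentence
# transformer checkpoint could be substituted here.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
X = encoder.encode(posts, show_progress_bar=False)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.5, stratify=labels, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("macro F1:", f1_score(y_te, clf.predict(X_te), average="macro"))
```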

Risk-calibrated evidential classifiers

Not all Mistakes are Equal

In many tasks, classifiers play a fundamental role in the way an agent behaves. Most rational agents collect sensor data from the environment, classify it, and act based on that classification. Recently, deep neural networks (DNNs) have become the dominant approach to developing classifiers because of their excellent performance. When training and evaluating DNNs, it is normally assumed that the costs of all misclassification errors are equal. However, this is unlikely to be true in practice. Incorrect classification predictions can cause an agent to take inappropriate actions, and the costs of those actions can be asymmetric, vary from agent to agent, and depend on context. In this paper, we discuss the importance of considering risk and uncertainty quantification together to reduce the cost agents incur from misclassifications made by deep classifiers.
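
The point that not all mistakes cost the same can be made concrete with a small decision rule. The sketch below compares the usual most-probable-class choice with a minimum-expected-cost choice under an asymmetric cost matrix; the class meanings and cost values are invented for illustration.

```python
import numpy as np

# Illustrative cost matrix C, where C[i, j] is the cost the agent incurs for
# predicting class j when the true class is i. Here class 0 means "safe to
# proceed" and classes 1 and 2 are hazards, so wrongly predicting "safe" is
# assumed to be far more costly than the reverse mistake.
C = np.array([
    [0.0, 1.0, 1.0],
    [8.0, 0.0, 1.0],
    [8.0, 1.0, 0.0],
])

p = np.array([0.40, 0.35, 0.25])   # a classifier's predicted class probabilities

expected_cost = p @ C              # expected cost of committing to each class
argmax_choice = int(np.argmax(p))                 # cost-blind decision -> class 0
min_risk_choice = int(np.argmin(expected_cost))   # cost-aware decision -> class 1

print(expected_cost, argmax_choice, min_risk_choice)
```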

Misclassification Risk and Uncertainty Quantification in Deep Classifiers

2021 IEEE Winter Conference on Applications of Computer Vision (WACV)

In this paper, we propose risk-calibrated evidential deep classifiers to reduce the costs associated with classification errors. We use two main approaches. The first is to develop methods that quantify the uncertainty of a classifier's predictions and reduce the likelihood of acting on erroneous predictions. The second is a novel way of training the classifier so that erroneous classifications are biased towards less risky categories. We combine these two approaches in a principled way. In doing so, we extend evidential deep learning with pignistic probabilities, which are used to quantify the uncertainty of classification predictions and to model rational decision making under uncertainty. We evaluate the performance of our approach on several image classification tasks and demonstrate that it allows us to (i) incorporate misclassification costs while training deep classifiers, (ii) accurately quantify the uncertainty of classification predictions, and (iii) simultaneously learn how to make classification decisions that minimize the expected cost of classification errors.
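
The sketch below illustrates the generic ingredients the abstract names: per-class evidence mapped to a Dirichlet, belief masses and a vacuity (uncertainty) mass read off from it, pignistic probabilities, and a minimum-expected-cost decision. It follows the standard subjective-logic/evidential-deep-learning mapping and is not necessarily the exact loss or training procedure used in the paper; the cost matrix is invented for illustration.

```python
import numpy as np

def pignistic_from_evidence(evidence):
    """Map non-negative per-class evidence to belief masses, an uncertainty
    (vacuity) mass, and pignistic probabilities, using the standard mapping
    alpha_k = e_k + 1, S = sum(alpha), b_k = e_k / S, u = K / S."""
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.size
    alpha = evidence + 1.0
    S = alpha.sum()
    belief = evidence / S
    uncertainty = K / S
    # The pignistic transform spreads the uncertainty mass uniformly over the
    # K classes; here it coincides with the Dirichlet mean alpha / S.
    pignistic = belief + uncertainty / K
    return belief, uncertainty, pignistic

# Weak, conflicting evidence -> high uncertainty; the decision is then made by
# minimizing expected cost under the pignistic distribution (cost matrix invented).
belief, u, p = pignistic_from_evidence([2.0, 0.5, 0.1])
C = np.array([[0, 1, 1], [5, 0, 1], [5, 1, 0]], dtype=float)
decision = int(np.argmin(p @ C))
print(u, p, decision)
```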

Uncertainty-Aware Deep Classifiers Using Generative Models

Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions. Some recent approaches quantify classification uncertainty directly by training the model to output high uncertainty for data samples close to class boundaries or from outside the training distribution. These approaches use an auxiliary data set during training to represent out-of-distribution samples. However, selecting or creating such an auxiliary data set is non-trivial, especially for high-dimensional data such as images. In this work, we develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty, distinguishing decision-boundary and out-of-distribution regions of the feature space. To this end, variational autoencoders and generative adversarial networks are incorporated to automatically generate out-of-distribution exemplars for training. Through extensive analysis, we demonstrate that the proposed ...
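
As a small illustration of the two uncertainty types mentioned above, the sketch below splits a Dirichlet output into vacuity (little total evidence, high for out-of-distribution inputs) and the entropy of the expected class distribution (high near decision boundaries). This is a generic evidential decomposition for illustration, not the paper's exact definitions.

```python
import numpy as np

def dirichlet_uncertainties(alpha):
    """Return (vacuity, predictive entropy) for a Dirichlet output alpha.
    Vacuity is high when total evidence is low (epistemic, out-of-distribution);
    entropy is high when evidence is split between classes (decision boundary)."""
    alpha = np.asarray(alpha, dtype=float)
    K = alpha.size
    S = alpha.sum()
    vacuity = K / S
    p = alpha / S                               # expected class probabilities
    entropy = -np.sum(p * np.log(p + 1e-12))
    return vacuity, entropy

print(dirichlet_uncertainties([30.0, 1.0, 1.0]))   # confident in-distribution sample: both low
print(dirichlet_uncertainties([12.0, 12.0, 1.0]))  # near a decision boundary: low vacuity, high entropy
print(dirichlet_uncertainties([1.1, 1.1, 1.1]))    # out-of-distribution: high vacuity
```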
