STEM: Unsupervised STructural EMbedding for Stance Detection (original) (raw)
Related papers
Emergent: a novel data-set for stance classification
We present Emergent, a novel data-set derived from a digital journalism project for rumour debunking. The data-set contains 300 rumoured claims and 2,595 associated news articles, collected and labelled by journalists with an estimation of their veracity (true, false or unverified). Each associated article is summarized into a headline and labelled to indicate whether its stance is for, against, or observing the claim, where observing indicates that the article merely repeats the claim. Thus, Emergent provides a real-world data source for a variety of natural language processing tasks in the context of fact-checking. Further to presenting the dataset, we address the task of determining the article headline stance with respect to the claim. For this purpose we use a logistic regression classifier and develop features that examine the headline and its agreement with the claim. The accuracy achieved was 73% which is 26% higher than the one achieved by the Excitement Open Platform (Magnini et al., 2014).
Automatic Stance Detection Using End-to-End Memory Networks
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
We present a novel end-to-end memory network for stance detection, which jointly (i) predicts whether a document agrees, disagrees, discusses or is unrelated with respect to a given target claim, and also (ii) extracts snippets of evidence for that prediction. The network operates at the paragraph level and integrates convolutional and recurrent neural networks, as well as a similarity matrix as part of the overall architecture. The experimental evaluation on the Fake News Challenge dataset shows state-of-the-art performance.
The Fake News Challenge: Stance Detection using Traditional Machine Learning Approaches
2018
Fake news has caused sensation lately, and this term is the Collins Dictionary Word of the Year 2017. As the news are disseminated very fast in the era of social networks, an automated fact checking tool becomes a requirement. However, a fully automated tool that judges a claim to be true or false is always limited in functionality, accuracy and understandability. Thus, an alternative suggestion is to collaborate a number of analysis tools in one platform which help human fact checkers and normal users produce better judging based on many aspects. A stance detection tool is a first stage of an online challenge that aims to detect fake news. The goal is to determine the relative perspective of a news article towards its title. In this paper, we tackle the challenge of stance detection by utilizing traditional machine learning algorithms along with problem specific feature engineering. Our results show that these models outperform the best outcomes of the participating solutions which mainly use deep learning models.
Combining Neural, Statistical and External Features for Fake News Stance Identification
Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW '18, 2018
Identifying the veracity of a news article is an interesting problem while automating this process can be a challenging task. Detection of a news article as fake is still an open question as it is contingent on many factors which the current state-of-the-art models fail to incorporate. In this paper, we explore a subtask to fake news identification, and that is stance detection. Given a news article, the task is to determine the relevance of the body and its claim. We present a novel idea that combines the neural, statistical and external features to provide an efficient solution to this problem. We compute the neural embedding from the deep recurrent model, statistical features from the weighted n-gram bag-of-words model and hand crafted external features with the help of feature engineering heuristics. Finally, using deep neural layer all the features are combined, thereby classifying the headline-body news pair as agree, disagree, discuss, or unrelated. Through extensive experiments, we find that the proposed model outperforms all the state-of-the-art techniques including the submissions to the fake news challenge.
Stance Detection in Fake News A Combined Feature Representation
Proceedings of the First Workshop on Fact Extraction and VERification (FEVER), 2018
With the uncontrolled increasing of fake news and rumors over the Web, different approaches have been proposed to address the problem. In this paper, we present an approach that combines lexical, word embeddings and n-gram features to detect the stance in fake news. Our approach has been tested on the Fake News Challenge (FNC-1) dataset. Given a news title-article pair, the FNC-1 task aims at determining the relevance of the article and the title. Our proposed approach has achieved an accurate result (59.6 % Macro F1) that is close to the state-of-the-art result with 0.013 difference using a simple feature representation. Furthermore, we have investigated the importance of different lexicons in the detection of the classification labels.
Mama Edha at SemEval-2017 Task 8: Stance Classification with CNN and Rules
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 2017
For the competition SemEval-2017 we investigated the possibility of performing stance classification (support, deny, query or comment) for messages in Twitter conversation threads related to rumours. Stance classification is interesting since it can provide a basis for rumour veracity assessment. Our ensemble classification approach of combining convolutional neural networks with both automatic rule mining and manually written rules achieved a final accuracy of 74.9% on the competition's test data set for Task 8A. To improve classification we also experimented with data relabeling and using the grammatical structure of the tweet contents for classification.
2020
This paper presents our submission to the SardiStance 2020 shared task, describing the architecture used for Task A and Task B. While our submission for Task A did not exceed the baseline, retraining our model using all the training tweets, showed promising results leading to (f-avg 0.601) using bidirectional LSTM with BERT multilingual embedding for Task A. For our submission for Task B, we ranked 6th (f-avg 0.709). With further investigation, our best experimented settings increased performance from (f-avg 0.573) to (f-avg 0.733) with same architecture and parameter settings and after only incorporating social interaction features -- highlighting the impact of social interaction on the model's performance.
Stance detection with BERT embeddings for credibility analysis of information on social media
PeerJ Computer Science
The evolution of electronic media is a mixed blessing. Due to the easy access, low cost, and faster reach of the information, people search out and devour news from online social networks. In contrast, the increasing acceptance of social media reporting leads to the spread of fake news. This is a minacious problem that causes disputes and endangers the societal stability and harmony. Fake news spread has gained attention from researchers due to its vicious nature. proliferation of misinformation in all media, from the internet to cable news, paid advertising and local news outlets, has made it essential for people to identify the misinformation and sort through the facts. Researchers are trying to analyze the credibility of information and curtail false information on such platforms. Credibility is the believability of the piece of information at hand. Analyzing the credibility of fake news is challenging due to the intent of its creation and the polychromatic nature of the news. In...
Stance Classification through Proximity-based Community Detection
Proceedings of the 29th on Hypertext and Social Media, 2018
Numerous domains have interests in studying the viewpoints expressed online, be it for marketing, cybersecurity, or research purposes with the rise of computational social sciences. Current stance detection models are usually grounded on the specificities of some social platforms. This rigidity is unfortunate since it does not allow the integration of the multitude of signals informing effective stance detection. We propose the SCSD model, or Sequential Community-based Stance Detection model, a semi-supervised ensemble algorithm which considers these signals by modeling them as a multi-layer graph representing proximities between profiles. We use a handful of seed profiles, for whom we know the stance, to classify the rest of the profiles by exploiting like-minded communities. These communities represent profiles close enough to assume they share a similar stance on a given subject. Using datasets from two different social platforms, containing two to five stances, we show that by combining several types of proximity we can achieve excellent results. Moreover, we compare the proximities to find those which convey useful information in term of stance detection.
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
This paper presents our submission to the SardiStance 2020 shared task, describing the architecture used for Task A and Task B. While our submission for Task A did not exceed the baseline, retraining our model using all the training tweets, showed promising results leading to (favg 0.601) using bidirectional LSTM with BERT multilingual embedding for Task A. For our submission for Task B, we ranked 6th (f-avg 0.709). With further investigation, our best experimented settings increased performance from (f-avg 0.573) to (f-avg 0.733) with same architecture and parameter settings and after only incorporating social interaction features-highlighting the impact of social interaction on the model's performance.