Sattar Hashemi - Profile on Academia.edu

Papers by Sattar Hashemi

International Journal of Information Science and Management, 2020

The vast majority of data mining algorithms have been designed to work on centralized data; unfortunately, almost all of today's data sets are distributed both geographically and conceptually. Due to privacy concerns and computation cost, centralizing distributed data sets before analyzing them is undoubtedly impractical. In this paper, we present a framework for clustering distributed data which takes into account privacy and computation cost. To do that, we remove uncertain instances and send only the labels of the other instances to the central location. To remove the uncertain instances, we develop a new instance weighting method based on fuzzy and rough set theory. The results achieved on well-known data sets verify the effectiveness of the proposed method compared to previous works.

Many data mining applications, ranging from spam filtering to intrusion detection, are faced with active adversaries. The adversary deliberately manipulates data in order to reduce the classifier's accuracy, so in all these applications initially successful classifiers degrade easily. In this paper we model the interaction between the adversary and the classifier as a two-person sequential non-cooperative Stackelberg game and analyze the payoff when there is a leader and a follower. We then proceed to model the interaction as an optimization problem and solve it with an evolutionary strategy. Our experimental results are promising, since they show that our approach improves the accuracy of spam detection on several real-world data sets.

Iranian Journal of Science and Technology-Transactions of Electrical Engineering, 2015

Transfer learning allows knowledge to be transferred from the source (training dataset) to the target (test dataset) domain. Feature selection for transfer learning (f-MMD) is a simple and effective transfer learning method, which tackles the domain shift problem. f-MMD performs well on small-sized datasets, but it suffers from two major issues: i) the computational efficiency and predictive performance of f-MMD are challenged by application domains with a large number of examples and features, and ii) f-MMD considers the domain shift problem in a fully unsupervised manner. In this paper, we propose a new approach to break down the large initial set of samples into a number of small-sized random subsets, called samplesets. Moreover, we present a feature weighting and instance clustering approach, which categorizes the original feature samplesets into variant and invariant features. In the domain shift problem, invariant features have a vital role in transferring knowledge across domain...

Since dealing with high-dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data in order to simplify the analysis in various applications such as text categorization, signal processing, image retrieval and gene expression, among many others. Among feature reduction techniques, feature selection is one of the most popular because it preserves the original meaning of the features. However, most current feature selection methods do not perform well on imbalanced data sets, which are pervasive in real-world applications. In this paper, we propose a new unsupervised feature selection method designed for imbalanced data sets, which removes redundant features from the original feature space based on the distribution of features. To show the effectiveness of the proposed method, popular feature selection methods have...

International Journal of Machine Learning and Cybernetics

Ransomware, malware designed to encrypt data for ransom payments, is a threat to fog layer nodes, as such nodes typically contain a considerable amount of sensitive data. The capability to efficiently hunt abnormalities relating to ransomware activities is crucial in the timely detection of ransomware. In this paper, we present our Deep Ransomware Threat Hunting and Intelligence System (DRTHIS) to distinguish ransomware from goodware and identify their families. Specifically, DRTHIS utilizes Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), two deep learning techniques, for classification using the softmax algorithm. We then use 220 Locky, 220 Cerber and 220 TeslaCrypt ransomware samples, and 219 goodware samples, to train DRTHIS. Findings from our evaluations demonstrate that the proposed system achieves an F-measure of 99.6% with a true positive rate of 97.2% in the classification of ransomware instances. Additionally, we demonstrate that DRTHIS is capable of detect...

IEEE Access

This research proposes a novel unsupervised domain adaptation algorithm for cross-domain visual recognition. The Distance Correlation-based Domain Adaptation (DCDA) algorithm is built on a correlation measure called distance correlation. DCDA exploits both the statistical and geometrical properties of the data while embedding the instances of both domains into a latent feature space. Unlike many algorithms proposed in the literature that utilize the source domain labels to learn pseudo labels, DCDA further exploits the available information in the source domain labels to discover an appropriate projection operator. The proposed DCDA algorithm is easy to implement and has a closed-form solution. Our experiments and analyses over a wide variety of benchmark domain adaptation data sets indicate that DCDA achieves significantly better results than other state-of-the-art approaches in the unsupervised domain adaptation and deep learning literature. Index Terms: Transfer learning, unsupervised domain adaptation (UDA), distance correlation (dCor), maximum mean discrepancy (MMD).

TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES

Learning invariant features across domains is of vital importance to unsupervised domain adaptation, where classifiers trained on the training examples (source domain) need to adapt to a different set of test examples (target domain) in which no labeled examples are available. In this paper, we propose a novel approach to find the invariant features in the original space and transfer the knowledge across domains. We extract invariant features of the input data by a kernel-based feature weighting approach, which exploits distribution difference and instance clustering to find the desired features. The proposed method is called the kernel-based feature weighting (KFW) approach and uses the maximum mean discrepancy to measure the difference between domains. KFW uses condensed clusters in the reduced domains, that is, the domains that do not contain variant features, to enhance the classification performance. Simultaneous use of feature weighting and instance clustering increases the adaptation and classification performance. Our approach automatically discovers the invariant features across domains and employs them to bridge the source and target domains. We demonstrate the effectiveness of our approach on artificial and real-world datasets. Empirical results show that the proposed method outperforms other state-of-the-art methods on the standard transfer learning benchmark datasets.
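As a rough illustration of how the maximum mean discrepancy can flag features whose distribution shifts between domains, the sketch below uses the simplest case, a linear kernel, where the squared MMD reduces to the squared distance between the source and target feature means. This is not the KFW method itself: the linear kernel, the helper `per_feature_mean_gap`, and the toy data are all assumptions made for the example.

```python
from statistics import fmean


def linear_mmd_squared(source, target):
    """Squared MMD under a linear kernel: the squared Euclidean
    distance between the source and target feature means."""
    n_features = len(source[0])
    total = 0.0
    for j in range(n_features):
        gap = fmean(row[j] for row in source) - fmean(row[j] for row in target)
        total += gap * gap
    return total


def per_feature_mean_gap(source, target):
    """Hypothetical helper: squared mean gap of each feature on its own.
    A large gap hints at a variant feature, a small gap at an invariant one."""
    n_features = len(source[0])
    return [
        (fmean(row[j] for row in source) - fmean(row[j] for row in target)) ** 2
        for j in range(n_features)
    ]


# Toy data (assumed): feature 0 has the same mean in both domains,
# feature 1 is shifted in the target domain.
source = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
target = [[0.0, 6.0], [1.0, 7.0], [2.0, 8.0]]

gaps = per_feature_mean_gap(source, target)
mmd2 = linear_mmd_squared(source, target)
```

On this toy data the whole discrepancy comes from feature 1, so a weighting scheme in this spirit would down-weight that feature and keep feature 0. A nonlinear kernel would also pick up differences beyond the mean, at a higher computational cost.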

IEEE Intelligent Systems, 2017

Prediction markets are well-established tools for aggregating information from diverse sources into accurate forecasts. Their success has been demonstrated in a wide range of applications, including presidential campaigns, sporting events and economic outcomes. Recently, they have been introduced to the machine-learning community in the form of Artificial Prediction Markets, whereby algorithms trade contracts reflecting their levels of confidence. To date, those markets have mostly been studied in the context of offline classification, with quite promising results. We extend those markets to enable their use in online regression, and introduce: (i) adaptive trading strategies informed by individual trading history; and (ii) the ability of participants to revise their predictions by reflecting upon the wisdom of the crowd, which is manifested in the collective performance of the market. We empirically evaluate our model using multiple UCI data sets, and show that it outperforms several well-established techniques from the literature on online regression.

Journal of Machine Intelligence, 2016

The classifier chains method was recently introduced in multi-label classification as a technique with high predictive performance that aims to exploit label dependencies while keeping the computational complexity at a desirable level. In this paper, we present a method for ordering the chain, called Ordered Classifier Chains (OCC), and show that the sequence of labels in the chain plays an important role in the predictive performance of the corresponding multi-label classifiers. OCC makes use of the correlation of every class label with the features and arranges the class labels in descending order. Once the ordering of labels is determined, the features along with every label are fed to a binary classifier. In the classifier chain model, the feature space of every binary classifier is extended with the new order of labels. To determine the association of each sample with the set of class labels, the sample is given to all of the classifiers. Empirical evaluations on an extensive range of multi-label datasets reveal that OCC manages to improve the classification performance compared to existing approaches.
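The ordering step can be pictured with a small sketch. The abstract does not say how the label-feature correlations are aggregated, so the scoring rule used here (mean absolute Pearson correlation of a label with all features) is an assumption, and `pearson`, `order_labels`, and the toy data are hypothetical names and values chosen for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def order_labels(X, Y):
    """Return label indices sorted by descending mean |correlation| with the features.
    The scoring rule is an assumption made for this sketch."""
    n_features, n_labels = len(X[0]), len(Y[0])
    scores = []
    for l in range(n_labels):
        label = [row[l] for row in Y]
        corr = [abs(pearson([row[f] for row in X], label)) for f in range(n_features)]
        scores.append(sum(corr) / n_features)
    return sorted(range(n_labels), key=lambda l: scores[l], reverse=True)


# Toy data (assumed): label 1 rises with the single feature,
# label 0 is uncorrelated with it.
X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[1, 0], [0, 0], [0, 1], [1, 1]]
chain_order = order_labels(X, Y)
```

With the order fixed, a chain would train one binary classifier per label in that sequence, each seeing the original features plus the labels placed earlier in the chain.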

International Journal of Intelligent Systems and Applications, 2016

Recent research has shown that the hidden Markov model (HMM) is a compelling option for malware detection. However, some advanced metamorphic malware is able to evade the traditional HMM-based methods. The proposed approach provides a two-layer technique to overcome these challenges. Malware contains various sequences of opcodes, some of which are more important and help detect the malware, while the rest cause interference. The important sequences of opcodes are extracted by eliminating partial sequences, because partial sequences of opcodes have more similarities to benign files. In this method, the sliding window technique is used to extract the sequences. In this paper, HMMs are trained using the important sequences of opcodes, which leads to better results. In comparison to previous methods, the results demonstrate that the proposed method is more accurate in metamorphic malware detection and faster at classification.
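The sliding window step can be sketched in a few lines. The window size, the opcode trace, and the function name `opcode_windows` are assumptions for illustration; the sketch covers only sequence extraction and says nothing about how the paper selects the important sequences or trains the HMMs.

```python
from collections import Counter


def opcode_windows(opcodes, size):
    """Slide a fixed-size window over an opcode list, one step at a time.
    Returns an empty list when the trace is shorter than the window."""
    return [tuple(opcodes[i:i + size]) for i in range(len(opcodes) - size + 1)]


# Hypothetical opcode trace; a real one would come from a disassembler.
trace = ["mov", "push", "call", "mov", "push", "call", "ret"]
windows = opcode_windows(trace, 3)
counts = Counter(windows)
```

Counting the windows, as `counts` does here, is one simple way to compare how often a sequence occurs in malware versus benign files before deciding which sequences to keep.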

Text summarization is an important field in the area of natural language processing and text mining. This paper proposes an extraction-based model which uses graph-based and information-theoretic concepts for multi-document summarization. Our method constructs a directed weighted graph from the original text by adding a vertex for each sentence, and computes a weighted edge between sentences based on distortion measures. We propose a combination of these two models by representing the input as a graph and using distortion measures as the weight function. Finally, a ranking algorithm is applied to identify the most important sentences to be included in the summary. By defining a proper distortion measure and ranking algorithm, this model achieves promising results on DUC2002, a well-known real-world data set. The results and ROUGE-1 scores of our model are fairly close to those of other successful models.

Artificial Intelligence Research, 2013

Data streams have recently been extensively explored due to their emergence in a great number of applications such as sensor networks, web click streams and network flows. The vast majority of research in data stream mining is devoted to supervised learning, whereas in real-world practice the labels of data are rarely available to the learning algorithms. Hence, clustering, as the most important form of unsupervised learning, has been the focus of many researchers in the data stream community. Clustering paradigms basically place similar objects together and separate dissimilar ones into different clusters. In this paper, we propose a Statistical framework for data Stream Clustering, abbreviated as StatisStreamClust, that makes use of two components to find clusters in a data stream. The first component is specially designed to detect concept change, where the underlying data distributions change from time to time. Upon detection of concept change by the first component, the second component is triggered to update the whole clustering model. StatisStreamClust brings great benefits to data stream clustering, including no sensitivity to the number of clusters and dimensions, reasonable complexity together with desirable performance, and no need to determine the window size a priori. To explore the advantages of our approach, many experiments with different settings and specifications are conducted. The obtained results are very promising.

An enriched game-theoretic framework for multi-objective clustering

Applied Soft Computing, 2013

The framework of multi-objective clustering can serve as a competent technique in present-day problems ranging from decision making processes to machine learning and pattern recognition. Multi-objective clustering basically aims at placing similar objects into the same groups based on some conflicting objectives, which substantially supports the use of game theory to come to a resolution. Based on these understandings, this paper suggests Enriched Game Theory K-means, called EGTKMeans, as a novel multi-objective clustering technique based on the notion of game theory. EGTKMeans is specially designed to optimize two intrinsically conflicting objectives, namely compaction and equi-partitioning. The key contributions of the proposed approach are threefold. First, it formulates an elegant and novel payoff definition which considers both objectives with equal priority. The presented payoff function incorporates a desirable fairness into the final clustering results. Second, EGTKMeans performs better by utilizing the advantages of mixed strategies as well as those of pure ones, considering the existence of a mixed Nash equilibrium in every game. Last but not least, EGTKMeans approaches the optimal solution in a very promising manner by optimizing both objectives simultaneously. The experimental results suggest that the proposed approach significantly outperforms other rival methods across real-world and synthetic data sets with reasonable time complexity.

IEEE Transactions on Pattern Analysis …, 2002

In this article, we describe an unsupervised feature selection algorithm suitable for data sets, large in both dimension and size. The method is based on measuring similarity between features whereby redundancy therein is removed. This does not need any search and, ...

International Conference on Knowledge Discovery and Information Retrieval, 2012

Discovering teams of experts in social networks has been receiving increasing attention recently. These teams are often formed when a given specific task should be accomplished through the collaboration and communication of a small number of connected experts with the minimum communication cost. In this study we propose a game-theoretic framework to find the top-k teams satisfying such conditions. The importance of finding the top-k teams becomes clear when the experts of the best discovered team do not have an incentive to work together for any reason, and hence we must refer to the next teams found. Finally, the local Nash equilibrium corresponding to the game is reached when all of the teams are formed. The experimental results on the DBLP co-authorship graph show the effectiveness and efficiency of the proposed method.

Advances in Social …, 2012

Discovering communities in popular social networks like Facebook has been receiving significant attention recently. In this paper, inspired by real life, we address the community detection problem with a framework based on the Information Diffusion Model and Game Theory. In this approach, we consider each node of the social network as a selfish agent which interacts with its neighbors and tries to maximize its total utility (i.e. received information). Finally, the community structure of the graph is revealed after reaching the local Nash equilibrium of the game. Experimental results on benchmark social media datasets and on synthetic and real-world graphs demonstrate that our method is superior to other state-of-the-art methods.

AI COMMUNICATION JOURNAL, 2013

Identifying communities in social networks has been receiving increasing attention recently. However, the overlapping concept has received little attention in the literature, although it is observed in almost all social networks. In this study, we propose a framework based on game theory and the structural equivalence concept to address the detection of overlapping communities in social networks. Specifically, we consider the underlying graph as a hypothetical social networking website and regard each vertex of this graph as an agent acting in this multiagent environment. Since each agent may belong to several communities simultaneously, we are able to find the overlapping community structure of social networks. A rigorous proof of the existence of a Nash equilibrium in this game is provided, which shows that the method always reaches a final solution. Experimental results on benchmark and real-world graphs show the superiority of our approach over other state-of-the-art methods.

Today’s security threats like malware are more sophisticated and targeted than ever, and they are growing at an unprecedented rate. To deal with them, various approaches have been introduced. One of them is signature-based detection, which is an effective and widely used method for detecting malware; however, it has a substantial problem in detecting new instances. In other words, it is useful only from the second attack of a given malware onward. Due to the rapid proliferation of malware and the considerable human effort needed to extract signatures, this approach is a tedious solution; thus, an intelligent malware detection system is required to deal with new malware threats. Most intelligent detection systems utilise data mining techniques in order to distinguish malware from benign programs. One of the pivotal phases of these systems is extracting features from malware samples and benign ones in order to build at least one learning model. This phase is called “Malware Analysis” which play...

I-IncLOF: Improved Incremental Local Outlier Detection for Data Streams

Outlier mining in data streams is an important and active research issue in anomaly detection. Most existing methods are more suitable for static data, since the algorithm has all data available at the time of detection. But as data streams evolve over time, traditional methods cannot perform well on them; therefore, evaluating an object as an outlier when it arrives, although meaningful, can often lead to a wrong decision because of the dynamic nature of the data stream. In this paper an Improved Incremental LOF algorithm is ...