Jin-Hee Cho - Profile on Academia.edu

Papers by Jin-Hee Cho

arXiv (Cornell University), Feb 18, 2023

Due to the various and serious adverse impacts of spreading fake news, it is often assumed that only people with malicious intent would propagate fake news. However, social science studies show that this is not necessarily true. Distinguishing the types of fake news spreaders based on their intent is critical because it guides how to intervene, with different approaches, to mitigate the spread of fake news. To this end, we propose an intent classification framework that can best identify the correct intent of fake news. We leverage deep reinforcement learning (DRL) that can optimize the structural representation of each tweet by removing noisy words from the input sequence when appending an actor to the long short-term memory (LSTM) intent classifier. A policy gradient DRL model (e.g., REINFORCE) can lead the actor to a higher delayed reward. We also devise a new uncertainty-aware immediate reward using a subjective opinion that can explicitly deal with multidimensional uncertainty for effective decision-making. Via 600K training episodes from a fake news tweets dataset with an annotated intent class, we evaluate the performance of the uncertainty-aware reward in DRL. Evaluation results demonstrate that our proposed framework efficiently reduces the number of selected words while maintaining a high multi-class accuracy of 95%.

A Survey on Uncertainty Reasoning and Quantification in Belief Theory and its Application to Deep Learning

IEEE Access

Moving target defense (MTD) is a proactive cybersecurity defense technique that constantly changes potentially vulnerable points of attack to confuse attackers, making it difficult for them to infer the system configuration and nullifying their reconnaissance activities against a victim system. We consider an MTD strategy for a software-defined networking (SDN) environment where every SDN switch is controlled by a central SDN controller. As MTD may incur excessive usage of network/system resources for cybersecurity purposes, we propose to perform the MTD operations adaptively according to a security risk assessment based on a Bayesian attack graph (BAG) analysis. For accurate BAG analysis, we model random and weakest-first attack behaviors and incorporate the derived analytical models into the BAG analysis. Using the BAG analysis result, we formulate a knapsack problem to determine the optimal set of vulnerabilities to be reconfigured under a constraint on SDN reconfiguration overhead. The experimental results show that the proposed MTD strategy outperforms the full MTD and random MTD counterparts in terms of the maximum/average attack success probabilities and the number of SDN reconfiguration updates. Index Terms: Moving target defense, Bayesian attack graph, software-defined networking.
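The knapsack formulation mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the objective or the weights, so this assumes a standard 0/1 knapsack in which each vulnerability has a hypothetical risk score (value) and a hypothetical integer reconfiguration cost (weight). The names `select_vulnerabilities`, `risk`, `cost`, and `budget` are illustrative and not taken from the paper.

```python
from typing import List, Tuple


def select_vulnerabilities(
    risk: List[float], cost: List[int], budget: int
) -> Tuple[float, List[int]]:
    """Solve a 0/1 knapsack by dynamic programming.

    risk[i] : hypothetical risk score covered by reconfiguring vulnerability i
    cost[i] : hypothetical integer reconfiguration overhead of vulnerability i
    budget  : total reconfiguration overhead allowed
    Returns (best total risk covered, sorted indices of chosen vulnerabilities).
    """
    n = len(risk)
    # best[i][b] = maximum risk covered using the first i items with overhead <= b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip item i-1
            if cost[i - 1] <= b:
                take = best[i - 1][b - cost[i - 1]] + risk[i - 1]
                if take > best[i][b]:
                    best[i][b] = take  # option 2: take item i-1

    # Walk the table backwards to recover which items were taken.
    chosen = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= cost[i - 1]
    return best[n][budget], sorted(chosen)
```

For example, with risks `[0.6, 0.5, 0.4]`, costs `[3, 2, 2]`, and a budget of 4, the sketch picks the second and third vulnerabilities, since together they cover more risk than the first one alone.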

ACM Transactions on Cyber-Physical Systems

Machine learning (ML)-based intrusion detection system (IDS) approaches have been widely applied and have advanced state-of-the-art system security and defense mechanisms. In smart grid computing environments, security threats have increased significantly because shared networks are commonly used, along with their associated vulnerabilities. However, compared to other network environments, ML-based IDS research in the smart grid is relatively unexplored, although the smart grid environment faces serious security threats due to its unique environmental vulnerabilities. In this article, we conducted an extensive survey on ML-based IDS in smart grids based on the following key aspects: (1) The applications of the ML-based IDS in transmission and distribution side power components of a smart power grid by addressing its security vulnerabilities; (2) dataset generation process and its usage in applying ML-based IDSs in the smart grid; (3) a wide range of ML-based IDSs used by the s...

2019 22th International Conference on Information Fusion (FUSION)

This work proposes an opinion inference algorithm for large graph network data using subjective, uncertain opinions. In the graph network data, an opinion is associated with an edge between two nodes, where an edge indicates a known opinion and the absence of an edge indicates an unknown opinion about their relationship. Examples include predictions of road traffic conditions (i.e., an edge indicates a road between two intersections and an opinion represents congested or non-congested) or trust relationships (i.e., an edge refers to a trust relationship between two users where an opinion indicates a user's trust in another user). To derive an unknown opinion between two nodes, we identify a set of best paths in the graph network data that can maximize decision performance (e.g., prediction accuracy). To solve this problem, we formulate each opinion using Subjective Logic (SL) and leverage a policy-based deep reinforcement learning (DRL) technique. We propose three DRL-based schemes combining SL and DRL where a reward is given based on a different type of uncertainty: vacuity, dissonance, or monosonance. Via extensive simulation experiments, we investigate which type of uncertainty is the more critical factor in improving decision performance when a different uncertainty type is used as the reward in DRL. We validated that the proposed DRL-based schemes outperform their counterparts in terms of belief errors, prediction accuracy, and computation time on both semi-synthetic and real-world datasets.

2021 IEEE Global Communications Conference (GLOBECOM)

Proposed a game theoretic opinion framework to handle disinformation with various types of opinion models. Formulated a user's uncertain opinion by a belief model, called Subjective Logic (SL), to analyze the dynamics of subjective and uncertain opinions. Designed attackers' various deception tactics of disinformation propagation by SL. Demonstrated optimal strategy choices by players along with their underlying reasons. Investigated how each opinion model contributes to combating disinformation propagation.

arXiv (Cornell University), Nov 19, 2021

The deployment of monoculture software stacks means that even a single exploit against a single vulnerability can cause devastating damage. Inspired by the resilience benefit of biological diversity, the concept of software diversity has been proposed in the security domain. Although it is intuitive that software diversity may enhance security, its effectiveness has not been quantitatively investigated. Currently, no theoretical or empirical study has measured the security effectiveness of network diversity. In this paper, we take a first step towards ultimately tackling the problem. We propose a systematic framework that can model and quantify the security effectiveness of network diversity. We conduct simulations to demonstrate the usefulness of the framework. In contrast to the intuitive belief, we show that diversity does not necessarily improve security from a whole-network perspective. The root cause of this phenomenon is that the degree of vulnerability in diversified software implementations plays a critical role in determining the security effectiveness of software diversity.

arXiv (Cornell University), Jan 21, 2021

Defensive deception is a promising approach for cyber defense. Via defensive deception, the defender can anticipate attacker actions; it can mislead or lure the attacker, or hide real resources. Although defensive deception is increasingly popular in the research community, there has not been a systematic investigation of its key components, the underlying principles, and its tradeoffs in various problem settings. This survey paper focuses on defensive deception research centered on game theory and machine learning, since these are prominent families of artificial intelligence approaches that are widely employed in defensive deception. This paper brings forth insights, lessons, and limitations from prior work. It closes with an outline of some research directions to tackle major gaps in current defensive deception research.

2018 IEEE International Conference on Big Data (Big Data), 2018

Subjective Logic (SL) is one of the well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators for fusing multiple opinions. Due to its simplicity and applicability, SL has been widely applied to a variety of decision-making tasks in cybersecurity, opinion models, trust models, and/or social network analysis. However, SL and its variants have exposed three main limitations in predicting uncertain opinions in real-world dynamic network data: (1) a lack of scalability to deal with a large-scale network; (2) limited capability to handle heterogeneous topological and temporal dependencies among node-level opinions; and (3) a high sensitivity to conflicting evidence that may generate counterintuitive opinions derived from the evidence. In this work, we proposed a novel deep learning (DL)-based dynamic opinion inference model in which node-level opinions are still formalized based on SL, meaning that an opinion has a dimension of uncertainty in addition to belief and disbelief in a binomial opinion (i.e., agree or disagree). The proposed DL-based dynamic opinion inference model overcomes the above three limitations by integrating the following techniques: (1) state-of-the-art DL techniques, such as the Graph Convolutional Network (GCN) and the Gated Recurrent Units (GRU), for modeling the topological and temporal heterogeneous dependency information of a given dynamic network; (2) modeling conflicting opinions based on robust statistics; and (3) a highly scalable inference algorithm to predict dynamic, uncertain opinions in linear computation time. We validated that our proposed DL-based algorithm (i.e., GCN-GRU-opinion model) outperforms its counterparts via extensive comparative performance analysis based on four real-world datasets.

2018 IEEE International Conference on Data Mining (ICDM), 2018

Subjective Logic (SL) is one of the well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators for fusing multiple opinions. Due to its simplicity and applicability, SL has been widely applied to a variety of decision-making tasks in cybersecurity, opinion models, and/or trust or social network analysis. However, SL has a scalability issue when dealing with large-scale network data. In addition, SL has shown bounded prediction accuracy due to its inherent parametric nature, treating heterogeneous data and network structure homogeneously based on the assumption of a Bayesian network. In this work, we take one step further to deal with uncertain opinions for unknown opinion inference. We propose a deep learning (DL)-based opinion inference model in which node-level opinions are still formalized based on SL. The proposed DL-based opinion inference model handles node-level opinions explicitly in a large-scale network using graph convolutional network (GCN) and variational autoencoder (VAE) techniques. We adopted the GCN and VAE due to their powerful learning capabilities in dealing with large-scale network data without parametric fusion operators and/or a Bayesian network assumption. This work is the first that leverages the merits of both DL (i.e., GCN and VAE) and a belief model (i.e., SL), where each node-level opinion is modeled by the formalism of SL while GCN and VAE are used to achieve non-parametric learning with low complexity. By mapping the node-level opinions modeled by the GCN to their equivalent Beta PDFs (probability density functions), we develop a network-driven VAE to maximize prediction accuracy of unknown opinions while significantly reducing algorithmic complexity. We validate our proposed DL-based algorithm using real-world datasets via extensive simulation experiments for comparative performance analysis.
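The mapping between an SL opinion and its equivalent Beta PDF mentioned above can be sketched as follows. This is a minimal illustration of the standard Subjective Logic bijection for a binomial opinion, assuming the conventional non-informative prior weight W = 2; it is not the paper's GCN or VAE model, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

W = 2.0  # non-informative prior weight (conventional Subjective Logic default; an assumption here)


@dataclass(frozen=True)
class BinomialOpinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def __post_init__(self) -> None:
        # A valid opinion distributes unit mass over belief, disbelief, and uncertainty.
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief + disbelief + uncertainty must equal 1")

    def to_beta(self) -> Tuple[float, float]:
        """Return (alpha, beta) of the equivalent Beta PDF."""
        if self.uncertainty <= 0.0:
            raise ValueError("an opinion with zero uncertainty has no finite Beta parameters")
        alpha = W * self.belief / self.uncertainty + W * self.base_rate
        beta = W * self.disbelief / self.uncertainty + W * (1.0 - self.base_rate)
        return alpha, beta

    def expected_probability(self) -> float:
        """Projected probability b + a * u, which equals the Beta mean alpha / (alpha + beta)."""
        return self.belief + self.base_rate * self.uncertainty
```

With belief 0.6, disbelief 0.2, and uncertainty 0.2, the sketch yields Beta parameters close to (7, 3), whose mean 0.7 matches the projected probability of the opinion.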

MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), 2018

In recent years, belief models, such as subjective logic (SL) and collective subjective logic (CSL), have been developed to model an opinion consisting of belief, disbelief, and uncertainty. However, these belief models are designed based on either predefined operators (e.g., discounting and consensus operators) or distribution assumptions (e.g., Markov random fields or MRFs) that are incapable of capturing the heterogeneity of the uncertainty information in large-scale network data. In this paper, we propose a general framework to model and infer heterogeneous uncertainty information in network data based on state-of-the-art graph convolutional neural networks (GCN). This work is the first that employs a GCN to model the heterogeneous probability density function (PDF) of node-level variables. We then project this PDF into a subspace of PDFs defined based on node-level opinions via knowledge distillation, which provides an effective prediction of the unknown opinions of some nodes based on the observed opinions of the other nodes. Through extensive simulation experiments, we show that our proposed approach performs better than SL and CSL in predicting unknown opinions when using two road traffic datasets for the validation of the tested algorithms.

2019 IEEE International Conference on Big Data (Big Data), 2019

Inference of unknown opinions with uncertain, adversarial (e.g., incorrect or conflicting) evidence in large datasets is not a trivial task. Without proper handling, such evidence can easily mislead decision making in data mining tasks. In this work, we propose a highly scalable probabilistic opinion inference model, namely Adversarial Collective Opinion Inference (Adv-COI), which infers unknown opinions with high scalability and robustness in the presence of uncertain, adversarial evidence by enhancing Collective Subjective Logic (CSL), which was developed by combining SL and Probabilistic Soft Logic (PSL). The key idea behind Adv-COI is to learn a model that is robust against uncertain, adversarial evidence, which is formulated as a min-max problem. We validate that Adv-COI outperforms baseline models and its competitive counterparts under possible adversarial attacks on the logic-rule based structured data and white and black box adversarial attacks under both clean and perturbed semi-synthetic and real-world datasets in three real-world applications. The results show that Adv-COI generates the lowest mean absolute error in the expected truth probability while producing the lowest running time among all.

ArXiv, 2019

Traditional deep neural nets (NNs) have shown state-of-the-art performance in classification tasks in various applications. However, NNs have not considered any types of uncertainty associated with the class probabilities to minimize risk due to misclassification under uncertainty in real life. Unlike Bayesian neural nets, which indirectly infer uncertainty through weight uncertainties, evidential neural networks (ENNs) have been recently proposed to support explicit modeling of the uncertainty of class probabilities. An ENN treats the predictions of an NN as subjective opinions and learns the function by collecting the evidence leading to these opinions by a deterministic NN from data. However, an ENN is trained as a black box without explicitly considering different types of inherent data uncertainty, such as vacuity (uncertainty due to a lack of evidence) or dissonance (uncertainty due to conflicting evidence). This paper presents a new approach, called a regularized ENN, tha...
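The two uncertainty types named above, vacuity and dissonance, can be computed from per-class evidence as in the following minimal sketch. It assumes the common evidential convention that the Dirichlet strength is the total evidence plus the number of classes, and it uses the relative-balance definition of dissonance from the Subjective Logic literature. The function name and inputs are hypothetical, and this is not the regularized ENN itself.

```python
from typing import Sequence, Tuple


def vacuity_and_dissonance(evidence: Sequence[float]) -> Tuple[float, float]:
    """Compute (vacuity, dissonance) from non-negative per-class evidence.

    Assumes belief_k = e_k / S and vacuity = K / S with S = sum(e) + K,
    i.e. a Dirichlet with a uniform prior of one pseudo-count per class.
    """
    k = len(evidence)
    strength = sum(evidence) + k
    belief = [e / strength for e in evidence]
    vacuity = k / strength

    def balance(bj: float, bi: float) -> float:
        # Relative balance: 1 when two belief masses are equal, 0 when either is zero.
        if bj == 0.0 or bi == 0.0:
            return 0.0
        return 1.0 - abs(bj - bi) / (bj + bi)

    dissonance = 0.0
    for i, bi in enumerate(belief):
        others = [bj for j, bj in enumerate(belief) if j != i]
        denom = sum(others)
        if denom > 0.0:
            dissonance += bi * sum(bj * balance(bj, bi) for bj in others) / denom
    return vacuity, dissonance
```

With no evidence at all, vacuity is 1 and dissonance is 0. With strong but evenly split evidence such as `[10, 10]`, vacuity is low and dissonance is high, which is the distinction the abstract draws between a lack of evidence and conflicting evidence.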

ArXiv, 2021

Defensive deception techniques have emerged as a promising proactive defense mechanism to mislead an attacker and thereby achieve attack failure. However, most game-theoretic defensive deception approaches have assumed that players maintain consistent views under uncertainty. They do not consider players’ possible, subjective beliefs formed due to asymmetric information given to them. In this work, we formulate a hypergame between an attacker and a defender where they can interpret the same game differently and accordingly choose their best strategy based on their respective beliefs. This gives a chance for defensive deception strategies to manipulate an attacker’s belief, which is the key to the attacker’s decision making. We consider advanced persistent threat (APT) attacks, which perform multiple attacks in the stages of the cyber kill chain where both the attacker and the defender aim to select optimal strategies based on their beliefs. Through extensive simulation experiments, ...

ArXiv, 2020

Thanks to graph neural networks (GNNs), semi-supervised node classification has shown state-of-the-art performance on graph data. However, GNNs have not considered the different types of uncertainties associated with class probabilities to minimize the risk of increasing misclassification under uncertainty in real life. In this work, we propose a multi-source uncertainty framework using a GNN that reflects various types of predictive uncertainties in both deep learning and belief/evidence theory domains for node classification predictions. By collecting evidence from the given labels of training nodes, the Graph-based Kernel Dirichlet distribution Estimation (GKDE) method is designed for accurately predicting node-level Dirichlet distributions and detecting out-of-distribution (OOD) nodes. We validated that our proposed model outperforms state-of-the-art counterparts in terms of misclassification detection and OOD detection based on six real network datasets. We fou...

2021 IEEE International Conference on Web Services (ICWS), 2021

Firstly, I would like to express my sincere gratitude to my advisor, Dr. Jin-Hee Cho, for her guidance throughout this research. She has been really helpful all the way, guiding me patiently through each and every problem I faced, and nudging me in the right direction. I could not have made the progress I made, without her support and guidance, and I have nothing but exceptionally good things to say about her. I would also like to extend my thanks to Dr. Chang-Tien Lu for his time and encouragement for going through the crucial part of my thesis and providing some valuable inputs. I express my honest appreciation to Dr. Terrence J. Moore for his dedication and important comments for the revision of this document as well as inputs for future work directions. Finally, I would like to thank my family and friends for all the moral support that they have provided over the course of this research.

IEEE Access, 2021

Centrality metrics have been studied in network science research. They have been used in various networks, such as communication, social, biological, geographic, or contact networks, across different disciplines. In particular, centrality metrics have been used to study and analyze targeted attack behaviors and to investigate their effect on network resilience. Although a rich volume of centrality metrics has been developed since the 1940s, only some centrality metrics (e.g., degree, betweenness, or clustering coefficient) have been commonly in use. This paper aims to introduce various existing centrality metrics and discusses their applicability in various networks. In addition, we conducted an extensive simulation study to demonstrate and analyze network resilience under targeted attacks using the surveyed centrality metrics on four real network topologies. We also discussed the algorithmic complexity of the centrality metrics surveyed in this work. Through the extensive experiments and discussions of the surveyed centrality metrics, we encourage their use in solving various computing and engineering problems in networks.
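A centrality-based targeted attack of the kind described above can be sketched in a few lines. This is an illustrative example only, assuming degree centrality as the attack criterion and the size of the largest connected component as a simple resilience measure; the function names are hypothetical and the graph is represented as a plain adjacency dictionary of integer node identifiers.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]


def largest_component_size(adj: Graph) -> int:
    """Size of the largest connected component, found by breadth-first search."""
    seen: Set[int] = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def remove_node(adj: Graph, node: int) -> Graph:
    """Return a copy of the graph without the given node and its incident edges."""
    return {n: nbrs - {node} for n, nbrs in adj.items() if n != node}


def degree_attack(adj: Graph, removals: int) -> List[int]:
    """Repeatedly remove the highest-degree node; record the largest component size."""
    sizes = [largest_component_size(adj)]
    for _ in range(removals):
        if not adj:
            break
        # Ties are broken by the smallest node id so the result is deterministic.
        target = max(sorted(adj), key=lambda n: len(adj[n]))
        adj = remove_node(adj, target)
        sizes.append(largest_component_size(adj))
    return sizes
```

On a star graph, removing the single hub disconnects every leaf, so the largest component collapses to one node after a single removal, whereas a path graph degrades more gradually.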

ACM Transactions on Internet Technology, 2022

Resource-constrained Internet-of-Things (IoT) devices are highly likely to be compromised by attackers, because strong security protections may not be suitable to be deployed. This requires an alternative approach to protect vulnerable components in IoT networks. In this article, we propose an integrated defense technique to achieve intrusion prevention by leveraging cyberdeception (i.e., a decoy system) and moving target defense (i.e., network topology shuffling). We evaluate the effectiveness and efficiency of our proposed technique analytically based on a graphical security model in a software-defined networking (SDN)-based IoT network. We develop four strategies (i.e., fixed/random and adaptive/hybrid) to address “when” to perform network topology shuffling and three strategies (i.e., genetic algorithm/decoy attack path-based optimization/random) to address “how” to perform network topology shuffling on a decoy-populated IoT network, and we analyze which strategy can best achiev...

Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020

Moving target defense (MTD) has been developed as an emerging technology to enhance system/network security by randomly and continuously changing the attack surface. Despite the significant progress of recent efforts in analyzing the security effectiveness of MTD mechanisms, critical gaps still exist in terms of the impact of running MTD mechanisms on system performance and dependability, exposing a critical design tradeoff between security and performance. To investigate the tradeoff, we propose performability models for evaluating services hosted in software-defined networks with a time-based MTD mechanism deployed. We developed analytical models for evaluating key performability metrics: response time, throughput, availability, host utilization, the number of requests lost, and cost (i.e., energy consumption plus profits lost due to dropped jobs). Our results showed that using the time-based MTD mechanism can (1) improve service response time and host utilization; (2) introduce a higher number of requests lost and higher overall cost; and (3) reduce service availability while still handling most of the jobs without much performance degradation. CCS Concepts: Networks → Network performance modeling; Computing methodologies → Model development and analysis; Security and privacy → Network security.

IEEE Access, 2021

The recent development of autonomous driving technologies has led to the proliferation of research on sensors and electronic equipment inside a vehicle. To deal with security concerns of in-vehicle networks, various deep learning (DL) and reinforcement learning (RL) approaches have been developed to enhance in-vehicle security. However, the DL/RL agents are vulnerable to adversarial perturbation, where an attacker can perform a manipulation attack to interfere with the agent's operation. In this work, we aim to develop two key mechanisms to build secure in-vehicle networks: (1) an RL-based proactive defense mechanism to achieve the multiple objectives of minimizing system security vulnerabilities while maximizing service availability; and (2) a resilient RL method that allows an agent to operate in the presence of adversarial disturbances that neutralize the system security. To this end, we propose DESOLATER (Drl-based rESOurce aLlocation And mTd dEployment fRamework), a multi-agent deep reinforcement learning (mDRL)-based network slicing technique that can help determine two key network management decisions: (1) link bandwidth allocation to meet quality-of-service (QoS) requirements; and (2) the frequency of triggering IP shuffling as a proactive defense mechanism so as not to hinder service availability, by maintaining normal system operations. We also introduce an anomaly detection mechanism with a memory-based RL technique to enhance the resiliency of the RL agents in a partially observable environment in which adversarial attackers manipulate observation information. Through extensive simulation experiments, we validate that the proposed robust mDRL algorithm can help the deployed proactive security mechanism achieve both security and network performance improvement in the presence of adversarial attacks.
