Zhaoyan Xu - Academia.edu

Papers by Zhaoyan Xu

Proceedings of the 2018 Workshop on Security in Softwarized Networks: Prospects and Challenges, 2018

Software-defined networking (SDN) and Network Function Virtualization (NFV) have inspired security researchers to devise new security applications for these new network technologies. However, since SDN and NFV are designed primarily for operating a network, they focus only on providing features related to network control. Therefore, it is challenging to implement complex security functions such as packet payload inspection. Several studies have addressed this challenge through an SDN data plane extension, but they suffered from problems with performance and control interfaces. In this paper, we introduce a new data plane architecture, HEX, which leverages existing data plane architectures for SDN to enable network security applications in an SDN environment efficiently and effectively. HEX provides security services as a set of OpenFlow actions, ensuring high performance, and a function for handling multiple SDN actions with a simple control command. We implemented a DoS detector and Deep Packet Inspection (DPI) as the prototype features of HEX using the NetFPGA-1G-CML, and our evaluation results demonstrate that HEX can provide security services at line rate.

2018 27th International Conference on Computer Communication and Networks (ICCCN), 2018

Some fundamental reasons why our networked systems are still vulnerable to network attacks are that (1) they are more open than necessary; (2) they are homogeneous, i.e., the same way to exploit a vulnerability on one machine is easily applicable to many other machines (which is a particularly severe issue in cloud computing environments, where virtual machine images are heavily reused/cloned); (3) current networked services are merely static targets, i.e., they are easily predictable and do not change. While network authentication and access control mechanisms such as firewalls and VPNs can help reduce the openness (mostly at the network perimeter level), they do not help much with the latter two factors. To bridge the gap and greatly complement existing network authentication/access control mechanisms, we propose CloudRand, a new framework to make networked systems/services in the cloud heterogeneous (every host has a different networking interface) and moving targets (such interfaces keep changing and they are unpredictable to untrusted entities). Inspired by the previous work on host-level (memory or instruction) Address Space Randomization (ASR), we build a lightweight solution to randomize network service interfaces. Thus, even when derived from the same image, each virtual machine can have very different network service interfaces, and they keep changing to further reduce the attack surface. CloudRand is an application-independent security service, orthogonal to existing application/network security mechanisms such as authentication, encryption, and access control.
To fit into different environments such as clouds or enterprise networks, we provide various prototype systems at different levels for flexible deployment choices, e.g., host level (kernel drivers for both Linux and Windows), network level (based on Click modular router or software-defined networking technology), virtual machine hypervisor level (based on Xen), and application level (browser plugin). Our extensive evaluation shows that this solution has low overhead, and it can significantly reduce the network attack surface and successfully defeat malware epidemic attacks.
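
The moving-target idea in the abstract above can be illustrated with a small sketch. This is not CloudRand's actual mechanism, which the abstract does not specify; it is a generic port-hopping example that assumes each host holds its own secret and derives the port a service listens on from an HMAC over the service name and the current time epoch. The names `current_port` and `epoch_of`, the port range, and the 60-second period are all hypothetical.

```python
import hashlib
import hmac


def epoch_of(timestamp: float, period_s: int = 60) -> int:
    """Map a Unix timestamp to a rotation epoch (hypothetical 60 s period)."""
    return int(timestamp // period_s)


def current_port(secret: bytes, service: str, epoch: int,
                 low: int = 20000, high: int = 60000) -> int:
    """Derive the port a service uses during one epoch.

    Assumption: trusted peers share `secret` with the host, so they can
    compute the same port; untrusted parties cannot predict it.
    """
    msg = f"{service}:{epoch}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # Fold the first 4 digest bytes into the half-open range [low, high).
    return low + int.from_bytes(digest[:4], "big") % (high - low)


if __name__ == "__main__":
    # Two hosts cloned from the same image but given different secrets
    # expose the same service on independently derived ports.
    for host_secret in (b"host-a-secret", b"host-b-secret"):
        print(host_secret, current_port(host_secret, "ssh", epoch_of(120.0)))
```

Giving each host its own secret covers the heterogeneity goal, and keying on the epoch makes the interface change over time; a real deployment would also need clock tolerance and a handover window, which this sketch omits.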

In recent years, the emerging Internet-of-Things (IoT) has led to concerns about the security of networked embedded devices. There is a strong need to develop suitable and cost-efficient methods to find vulnerabilities in IoT devices in order to address them before attackers take advantage of them. In traditional IT security, honeypots are commonly used to understand the dynamic threat landscape without exposing critical assets. In previous BlackHat conferences, conventional honeypot technology has been discussed multiple times. In this work, we focus on the adaptation of honeypots for improving the security of IoT devices, and argue why a major innovation is needed to build honeypots for them. Due to the heterogeneity of IoT devices, manually crafting low-interaction honeypots is not affordable; on the other hand, purchasing all of the physical IoT devices to build high-interaction honeypots is not affordable either. This dilemma forced us to seek an innovative way to build honeypot for I...

Malware often contains many system-resource-sensitive condition checks to avoid duplicate infection, to make sure it obtains required resources, or to infect only targeted computers. If we are able to extract the system resource constraints from malware code, and manipulate the environment state as vaccines, we would then be able to immunize a computer from infections. Towards this end, this paper provides the first systematic study and presents a prototype system, AUTOVAC, for automatically extracting the system resource constraints from malware code and generating vaccines based on the system resource conditions. Specifically, through monitoring the data propagation from system-resource-related system calls, AUTOVAC automatically identifies the environment-related state of a computer. Through analyzing the environment state, AUTOVAC automatically generates vaccines. Such vaccines can be then injected into other computers, thereby being immune from future infections from th...

Detection of Intrusions and Malware, and Vulnerability Assessment, 2019

SDN-based NFV technologies improve the dependability and resilience of networks by enabling administrators to spawn and scale up traffic management and security services in response to dynamic network conditions. However, in practice, SDN-based NFV services often suffer from poor performance and require complex configurations due to the fact that network packets must be 'detoured' to each virtualized security service, which expends bandwidth and increases network propagation delay. To address these challenges, we propose a new SDN-based data plane architecture called DPX that natively supports security services as a set of abstract security actions that are then translated to OpenFlow rule sets. The DPX action model reduces redundant processing caused by frequent packet parsing and provides administrators a simplified (and less error-prone) method for configuring security services into the network. DPX also increases the efficiency of enforcing complex security policies by introducing a novel technique called action clustering, which aggregates security actions from multiple flows into a small number of synthetic rules. We present an implementation of DPX in hardware using NetFPGA-SUME and in software using Open vSwitch. We evaluated the performance of the DPX prototype and the efficacy of its flow-table simplifications against a range of complex network policies exposed to line rates of 10 Gbps. We find that DPX imposes minimal overheads in terms of latency (≈0.65 ms in hardware and ≈1.2 ms in software on average) and throughput (≈1% of simple forwarding in hardware and ≈10% in software for non-DPI security services). This translates to an improvement of 30% over traditional NFV services on the software implementation and 40% in hardware.

Journal of Computer and Communications, 2018

In Industrial Control Systems (ICS), security issues are getting more and more attention. The number of hacking attacks per year keeps growing, and attacks on industrial control systems are numerous. The Programmable Logic Controller (PLC) is one of the main controllers of industrial processes. Since the industrial control system network is isolated from the external network, many people think that the PLC is a safe device. However, virus attacks in recent years, such as Stuxnet, have shown this idea to be wrong. In this paper, we exploit vulnerabilities of Siemens PLCs, such as the S7-200, S7-300, S7-400, and S7-1200, to carry out a series of attacks. We read the data from the PLC output, then rewrite the data and write it back to the PLC. By tampering with the written data, we cause communication chaos. When we attack the primary station, all slave devices connected to the primary station will be in a state of communication confusion. Our attack methods can cause delay or even loss of data in the communications from the Phasor Measurement Unit (PMU) to the data concentrator. Most importantly, our attack method generates little traffic and takes a short time, which makes it difficult to identify with traditional detection methods.

Computers & Security, 2016

As interest in wireless mesh networks grows, security challenges, e.g., intrusion detection, become of paramount importance. Traditional solutions for intrusion detection assign full IDS responsibilities to a few selected nodes. Recent results, however, have shown that a mesh router cannot reliably perform full IDS functions because of limited resources (i.e., processing power and memory). Cooperative IDS solutions, targeting resource-constrained wireless networks, impose high communication overhead and detection latency. To address these challenges, we propose PRIDE (PRactical Intrusion DEtection system for resource constrained wireless mesh networks), a non-cooperative real-time intrusion detection scheme that optimally distributes IDS functions to nodes along traffic paths, such that the intrusion detection rate is maximized, while resource consumption is below a given threshold. We formulate the optimal IDS function distribution as an integer linear program and propose algorithms for solving it effectively and fast (i.e., practical). We evaluate the performance of our proposed solution in a real-world, department-wide mesh network. An earlier version of this article appeared in ICICS 2013 [1], and the current article is significantly extended with new technical content.
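
To make the optimization in the abstract above concrete, here is a toy sketch. It is not the PRIDE formulation or its algorithms, which the abstract does not give; it assumes a simplified 0/1 model in which each IDS function has a memory cost and a detection value, each node on a path has a memory budget, and each function is placed on at most one node. It swaps in exhaustive search for an ILP solver so that it stays self-contained. The function name `distribute` and all numbers are hypothetical.

```python
from itertools import product


def distribute(modules, budgets):
    """Place IDS functions on path nodes to maximize total detection value.

    modules: dict mapping function name -> (memory_cost, detection_value)
    budgets: list of per-node memory budgets, in path order
    Returns (best_value, assignment), where assignment maps each function
    name to a node index, or None if the function is not deployed.
    Exhaustive search: only suitable for tiny instances.
    """
    names = list(modules)
    choices = [None] + list(range(len(budgets)))
    best_value = 0
    best_assign = {name: None for name in names}
    for combo in product(choices, repeat=len(names)):
        load = [0] * len(budgets)
        value = 0
        for name, node in zip(names, combo):
            if node is None:
                continue
            cost, gain = modules[name]
            load[node] += cost
            value += gain
        # Keep the placement only if every node stays within its budget.
        within = all(used <= cap for used, cap in zip(load, budgets))
        if within and value > best_value:
            best_value, best_assign = value, dict(zip(names, combo))
    return best_value, best_assign


if __name__ == "__main__":
    modules = {"portscan": (3, 5), "dos": (4, 6), "http": (5, 7)}
    value, placement = distribute(modules, budgets=[5, 4])
    print(value, placement)
```

In this instance the two nodes cannot hold all three functions, so the search deploys the two most valuable ones on separate nodes and leaves the third undeployed. The search space grows exponentially with the number of functions, which is why a formulation like the one the abstract describes would be handed to an ILP solver or a dedicated algorithm instead.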

Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, 2014

Malware continues to be one of the major threats to Internet security. In the battle against cybercriminals, accurately identifying the underlying malicious server infrastructure (e.g., C&C servers for botnet command and control) is of vital importance. Most existing passive monitoring approaches cannot keep up with the highly dynamic, ever-evolving malware server infrastructure. As an effective complementary technique, active probing has recently attracted attention due to its high accuracy, efficiency, and scalability (even to the Internet level). In this paper, we propose AUTOPROBE, a novel system to automatically generate effective and efficient fingerprints of remote malicious servers. AUTOPROBE addresses two fundamental limitations of existing active probing approaches: it supports pull-based C&C protocols, used by the majority of malware, and it generates fingerprints even in the common case when C&C servers are not alive during fingerprint generation. Using real-world malware samples we show that AUTOPROBE can successfully generate accurate C&C server fingerprints through novel applications of dynamic binary analysis techniques. By conducting Internet-scale active probing, we show that AUTOPROBE can successfully uncover hundreds of malicious servers on the Internet, many of them unknown to existing blacklists. We believe AUTOPROBE is a great complement to existing defenses, and can play a unique role in the battle against cybercriminals.

2013 IEEE 33rd International Conference on Distributed Computing Systems, 2013

Computer Security - ESORICS 2014, 2014

Advanced false data injection attacks in targeted malware intrusions are becoming an emerging severe threat to the Supervisory Control And Data Acquisition (SCADA) system. Several intrusion detection schemes have been proposed previously [1, 2]. However, designing an effective real-time detection system for a resource-constrained device is still an open problem for the research community. In this paper, we propose a new relation-graph-based detection scheme to defeat false data injection attacks at the SCADA system, even when injected data may seemingly fall within a valid/normal range. To balance effectiveness and efficiency, we design a novel detection model, alternation vectors with state relation graph. Furthermore, we propose a new inference algorithm to infer the injection point(s), i.e., the attack origin, in the system. We evaluate SRID with a real-world power plant simulator. The experiment results show that SRID can detect various false data injection attacks with a low false positive rate of 0.0125%. Meanwhile, SRID can dramatically reduce the search space of attack origins and accurately locate most attack origins.

Lecture Notes in Computer Science, 2012

Through injecting dynamic script codes into compromised websites, attackers have widely launched search poisoning attacks to achieve their malicious goals, such as spreading spam or scams, distributing malware and launching drive-by download attacks. While most current related work focuses on measuring or detecting specific search poisoning attacks in the crawled dataset, it is also meaningful to design an effective approach to find more compromised websites on the Internet that have been utilized by attackers to launch search poisoning attacks, because those compromised websites essentially become an important component in the search poisoning attack chain. In this paper, we present an active and efficient approach, named PoisonAmplifier, to find compromised websites through tracking down search poisoning attacks. Particularly, starting from a small seed set of known compromised websites that are utilized to launch search poisoning attacks, PoisonAmplifier can recursively find more compromised websites by analyzing poisoned webpages' special terms and links, and exploring compromised websites' vulnerabilities. Through our 1-month evaluation, PoisonAmplifier can quickly collect around 75K unique compromised websites by starting from 252 verified compromised websites within the first 7 days and continue to find 827 new compromised websites on a daily basis thereafter.

Proceedings of the 2012 ACM conference on Computer and communications security, 2012

Inspired by biological vaccines, we explore the possibility of developing similar vaccines for malware immunization. We provide the first systematic study in this direction and present a prototype system, AGAMI, for automatic generation of vaccines for malware immunization. With a novel use of several dynamic malware analysis techniques, we show that it is possible to extract a lightweight vaccine from current malware, and that after such a vaccine is injected into clean machines, they can be immune from future infection by the same malware family. We evaluate AGAMI on a large set of real-world malware samples and successfully extract working vaccines for many families such as Conficker and Zeus. We believe it is an appealing complementary technique to existing malware defense solutions.

2012 Proceedings IEEE INFOCOM, 2012

Although a lot of approaches have been proposed to detect bots at host or network level, they still have shortcomings. In this paper, we propose EFFORT, a new host-network cooperated detection framework attempting to overcome shortcomings of both approaches while still keeping both advantages, i.e., effectiveness and efficiency. We propose a multi-module approach to correlate information from different host- and network-level aspects and design a multi-layered architecture to efficiently coordinate modules to perform heavy monitoring only when necessary.

The persistent evolution of malware intrusion brings great challenges to the current anti-malware industry. First, traditional signature-based detection and prevention schemes produce overgrown signature databases for each end-host user, and each user has to install the AV tool and tolerate its consumption of a huge amount of resources for pairwise matching. On the other side of malware analysis, emerging malware can detect its running environment and determine whether it should infect the host or not. Hence, traditional dynamic malware analysis can no longer find the desired malicious logic if the targeted environment cannot be extracted in advance.

Lecture Notes in Computer Science, 2013

As interest in wireless mesh networks grows, security challenges, e.g., intrusion detection, become of paramount importance. Traditional solutions for intrusion detection assign full IDS responsibilities to a few selected nodes. Recent results, however, have shown that a mesh router cannot reliably perform full IDS functions because of limited resources (i.e., processing power and memory). Cooperative IDS solutions, targeting resource-constrained wireless networks, impose high communication overhead and detection latency. To address these challenges, we propose PRIDE (PRactical Intrusion DEtection in resource constrained wireless mesh networks), a non-cooperative real-time intrusion detection scheme that optimally distributes IDS functions to nodes along traffic paths, such that the detection rate is maximized, while resource consumption is below a given threshold. We formulate the optimal IDS function distribution as an integer linear program and propose algorithms for solving it accurately and fast (i.e., practical). We evaluate the performance of our proposed solution in a real-world, department-wide mesh network.

Proceedings 2014 Network and Distributed System Security Symposium, 2014

Cybercriminals use different types of geographically distributed servers to run their operations such as C&C servers for managing their malware, exploit servers to distribute the malware, payment servers for monetization, and redirectors for anonymity. Identifying the server infrastructure used by a cybercrime operation is fundamental for defenders, as it enables take-downs that can disrupt the operation and is a critical step towards identifying the criminals behind it. In this paper, we propose a novel active probing approach for detecting malicious servers and compromised hosts that listen for (and react to) incoming network requests. Our approach sends probes to remote hosts and examines their responses, determining whether the remote hosts are malicious or not. It identifies different malicious server types as well as malware that listens for incoming traffic such as P2P bots. Compared with existing defenses, our active probing approach is fast, cheap, easy to deploy, and achieves Internet scale. We have implemented our active probing approach in a tool called CyberProbe. We have used CyberProbe to identify 151 malicious servers and 7,881 P2P bots through 24 localized and Internet-wide scans. Of those servers, 75% are unknown to publicly available databases of malicious servers, indicating that CyberProbe can achieve up to 4 times better coverage than existing techniques. Our results reveal an important provider locality property: operations host an average of 3.2 servers on the same hosting provider to amortize the cost of setting up a relationship with the provider.

Computer Networks, 2013

Bots are still a serious threat to Internet security. Although a lot of approaches have been proposed to detect bots at host or network level, they still have shortcomings. Host-level approaches can detect bots with high accuracy. However, they usually impose too much overhead on the host. While network-level approaches can detect bots with less overhead, they have problems in detecting bots with encrypted, evasive C&C communication channels. In this paper, we propose EFFORT, a new host-network cooperated detection framework attempting to overcome shortcomings of both approaches while still keeping both advantages, i.e., effectiveness and efficiency. Based on intrinsic characteristics of bots, we propose a multi-module approach to correlate information from different host- and network-level aspects and design a multi-layered architecture to efficiently coordinate modules to perform heavy monitoring only when necessary. We have implemented our proposed system and evaluated it on real-world benign and malicious programs running on several diverse real-life office and home machines for several days. The final results show that our system can detect all 17 real-world bots (e.g., Waledac, Storm) with low false positives (0.68%) and with minimal overhead. We believe EFFORT raises a higher bar and this host-network cooperated design represents a timely effort and a right direction in the malware battle.

Computer Security - ESORICS 2014, 2014

Most existing malicious Android app detection approaches rely on manually selected detection heuristics, features, and models. In this paper, we describe a new, complementary system, called DroidMiner, which uses static analysis to automatically mine malicious program logic from known Android malware, abstracts this logic into a sequence of threat modalities, and then seeks out these threat modality patterns in other unknown (or newly published) Android apps. We formalize a two-level behavioral graph representation used to capture Android app program logic, and design new techniques to identify and label elements of the graph that capture malicious behavioral patterns (or malicious modalities). After the automatic learning of these malicious behavioral models, DroidMiner can scan a new Android app to (i) determine whether it contains malicious modalities, (ii) diagnose the malware family to which it is most closely associated, and (iii) provide further evidence as to why the app is considered to be malicious by including a concise description of identified malicious behaviors. We evaluate DroidMiner using 2,466 malicious apps, identified from a corpus of over 67,000 third-party market Android apps, plus an additional set of over 10,000 official market Android apps. Using this set of real-world apps, we demonstrate that DroidMiner achieves a 95.3% detection rate, with only a 0.4% false positive rate. We further evaluate DroidMiner's ability to classify malicious apps under their proper family labels, and measure its label accuracy at 92%.

Lecture Notes in Computer Science, 2014

A critical challenge when combating malware threats is how to efficiently and effectively identify the targeted victim's environment, given an unknown malware sample. Unfortunately, existing malware analysis techniques either use a limited, fixed set of analysis environments (not effective) or employ expensive, time-consuming multi-path exploration (not efficient), making them not well-suited to solve this challenge. As such, this paper proposes a new dynamic analysis scheme to deal with this problem by applying the concept of speculative execution in this new context. Specifically, by providing multiple dynamically created, parallel, and virtual environment spaces, we speculatively execute a malware sample and adaptively switch to the right environment during the analysis. Interestingly, while our approach appears to trade space for speed, we show that it can actually use less memory space and achieve much higher speed than existing schemes. We have implemented a prototype system, GOLDENEYE, and evaluated it with a large real-world malware dataset. The experimental results show that GOLDENEYE outperforms existing solutions and can effectively and efficiently expose malware's targeted environment, thereby speeding up the analysis in the critical battle against the emerging targeted malware threat.

Proceedings of the 2012 ACM conference on Computer and communications security - CCS '12, 2012

We propose a new, active scheme for fast and reliable detection of P2P malware by exploiting the enemies' strength against them. Our new scheme works in two phases: host-level dynamic binary analysis to automatically extract built-in remotely-accessible/controllable mechanisms (referred to as Malware Control Birthmarks or MCB) in P2P malware, followed by network-level informed probing for detection. Our new design demonstrates a novel combination of the strengths from host-based and network-based approaches. Compared with existing detection solutions, it is fast, reliable, and scalable in its detection scope. Furthermore, it is applicable to more than just P2P malware, and more broadly to any malware that opens a service port for network communications (e.g., many Trojans/backdoors). We develop a prototype system, PeerPress, and evaluate it on many representative real-world P2P malware (including Storm, Conficker, and more recent Sality). The results show that it can effectively detect the existence of malware when MCBs are extracted, and the detection occurs in an early stage during which other tools (e.g., BotHunter) typically do not have sufficient information to detect. We further discuss its limitations and implications, and we believe it is a great complement to existing passive detection solutions.
