Yennun Huang | Academia Sinica

Papers by Yennun Huang

Message from General Co-Chairs

A histogram statistical method for the detection of localized faults in deep groove ball bearing

MATEC web of conferences, 2017

This study uses a histogram statistical method to establish a fault diagnosis strategy for deep groove ball bearings. First, statistical indicators are used to extract the fault characteristics buried in the vibration signal, and the histogram is used to define the characteristic area for fault diagnosis. The results show that indicators 1, 3, and 6 have better statistical differences. Based on this, the accuracy of pattern recognition for all test data is 100%. Finally, ball damage was statistically significant, and the results showed high correlation (56~73%). The correlation for the inner race damage model was 49~57%, and for the healthy model it was 52%. Because the inner race damage model and the healthy model are statistically similar, their correlation is relatively high. Future work will focus on mining more representative indicators to enhance the relevance of abnormal characteristics.

DPARM: Differentially Private Association Rules Mining

IEEE Access, 2020

Association analysis is critical in data analysis performed to find all co-occurrence relationships (i.e., frequent itemsets or confident association rules) from the transactional dataset. An association rule can improve the ability of users to discover patterns and develop corresponding strategies. The data analysis process can be summarized as a set of queries, where each query is a real-valued function of the dataset. However, unless restrictions and protections are implemented, accessing the dataset to answer the queries may lead to the disclosure of the private information of individuals. In this paper, we propose an original differentially private association rules mining (DPARM) algorithm, which uses multiple support thresholds to reduce the number of candidate itemsets while reflecting the real nature of the items and uses random truncation and uniform partition to reduce the dimensionality of the dataset. Both of these elaborated approaches can aid in reducing the sensitivity of the queries, and this dramatically reduces the scale of the required noise and improves the utility of the mining results. We significantly stabilize the noise scale by adaptively allocating the privacy levels and bound the overall privacy loss. Through a series of experiments, we prove that our DPARM algorithm outperforms the literature in the accuracy of data mining while satisfying differential privacy. To the best of our knowledge, our work is the first DPARM algorithm to adopt multiple support thresholds while using a set of elaborated approaches to bound the overall privacy loss of the mining process.

Index terms: Privacy-preserving data analysis, differential privacy, association analysis, association rules mining, frequent itemset mining.

Using distributed resource management in heterogeneous telecomputing platforms

Proceedings of IEEE International Computer Performance and Dependability Symposium

Programmability, reliability, and scalability are system requirements that are essential to retaining or establishing the high ground in the new world order for telecommunications solutions. For telephony systems to be successful in the next century, they must deliver “fifth-generation software flexibility” that enables interoperability while satisfying these 3 key requirements to give customers the applications they need, when they need

FedSGDCOVID: Federated SGD COVID-19 Detection under Local Differential Privacy Using Chest X-ray Images and Symptom Information

Sensors

Coronavirus (COVID-19) has created an unprecedented global crisis because of its detrimental effect on the global economy and health. COVID-19 cases have been rapidly increasing, with no sign of stopping. As a result, test kits and accurate detection models are in short supply. Early identification of COVID-19 patients will help decrease the infection rate. Thus, developing an automatic algorithm that enables the early detection of COVID-19 is essential. Moreover, patient data are sensitive, and they must be protected to prevent malicious attackers from revealing information through model updates and reconstruction. In this study, we presented a higher privacy-preserving federated learning system for COVID-19 detection without sharing data among data owners. First, we constructed a federated learning system using chest X-ray images and symptom information. The purpose is to develop a decentralized model across multiple hospitals without sharing data. We found that adding the spatial...

Stock Price Movement Prediction Using Sentiment Analysis and CandleStick Chart Representation

Sensors

Determining the price movement of stocks is a challenging problem to solve because of factors such as industry performance, economic variables, investor sentiment, company news, company performance, and social media sentiment. People can predict the price movement of stocks by applying machine learning algorithms on information contained in historical data, stock candlestick-chart data, and social-media data. However, it is hard to predict stock movement based on a single classifier. In this study, we proposed a multichannel collaborative network by incorporating candlestick-chart and social-media data for stock trend predictions. We first extracted the social media sentiment features using the Natural Language Toolkit and sentiment analysis data from Twitter. We then transformed the stock’s historical time series data into a candlestick chart to elucidate patterns in the stock’s movement. Finally, we integrated the stock’s sentiment features and its candlestick chart to predict the...

An Optimization-Based Orchestrator for Resource Access and Operation Management in Sliced 5G Core Networks

Sensors, 2021

Network slicing is a promising technology with which network operators can deploy services in slices with heterogeneous quality of service (QoS) requirements. However, an orchestrator for network operation with efficient slice resource provisioning algorithms is essential. This work takes the perspective of an Internet service provider (ISP) to design an orchestrator that analyzes the critical influencing factors, namely access control, scheduling, and resource migration, to systematically evolve a sustainable network. The scalability and flexibility of resources are jointly considered. The resource management problem is formulated as a mixed-integer programming (MIP) problem. A solution approach based on Lagrangian relaxation (LR) is proposed for the orchestrator to make decisions that satisfy the high QoS applications. It can investigate the resources required for access control within a cost-efficient resource pool and consider allocating or migrating resources efficiently in each network slice. For high ...
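As a rough illustration of the Lagrangian relaxation idea named in the abstract above, the following sketch relaxes a single capacity constraint of a toy admission problem and updates the multiplier with a subgradient step. The problem (a 0/1 knapsack), the function name `lagrangian_relaxation`, the step-size rule 1/k, and the sample data are all assumptions made for illustration; this is not the paper's MIP model or its algorithm.

```python
from typing import List, Tuple


def lagrangian_relaxation(
    values: List[float],
    weights: List[float],
    capacity: float,
    iterations: int = 200,
) -> Tuple[List[int], float, float]:
    """Toy LR for: maximize sum(v*x) subject to sum(w*x) <= capacity, x binary.

    Returns (best feasible choice, its value, best dual upper bound).
    """
    lam = 0.0                      # multiplier for the relaxed capacity constraint
    best_upper = float("inf")      # tightest dual (upper) bound seen so far
    best_choice = [0] * len(values)
    best_value = 0.0               # the empty selection is always feasible

    for k in range(1, iterations + 1):
        # With the constraint priced at lam, the relaxed problem separates:
        # admit a request only if its reduced profit is positive.
        choice = [1 if v - lam * w > 0 else 0 for v, w in zip(values, weights)]
        used = sum(w * x for w, x in zip(weights, choice))
        dual = sum((v - lam * w) * x
                   for v, w, x in zip(values, weights, choice)) + lam * capacity
        best_upper = min(best_upper, dual)

        # Keep the best selection that also satisfies the original constraint.
        if used <= capacity:
            value = sum(v * x for v, x in zip(values, choice))
            if value > best_value:
                best_value, best_choice = value, choice

        # Subgradient step with a diminishing step size (assumed rule: 1/k).
        lam = max(0.0, lam + (1.0 / k) * (used - capacity))

    return best_choice, best_value, best_upper
```

The gap between the returned feasible value and the dual bound indicates how far the heuristic solution can be from the optimum of the toy problem.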

Optimization-Based Resource Management Algorithms with Considerations of Client Satisfaction and High Availability in Elastic 5G Network Slices

Sensors, 2021

A combined edge and core cloud computing environment is a novel solution in 5G network slices. The clients’ high availability requirement is a challenge because it limits the possible admission control in front of the edge cloud. This work proposes an orchestrator with a mathematical programming model from a global viewpoint to solve resource management problems and satisfy the clients’ high availability requirements. The proposed Lagrangian relaxation-based approach is adopted to solve the problems at a near-optimal level for increasing the system revenue. A promising and straightforward resource management approach and several experimental cases are used to evaluate the efficiency and effectiveness. Preliminary results are presented as performance evaluations to verify the proposed approach’s suitability for edge and core cloud computing environments. The proposed orchestrator significantly enables the network slicing services and efficiently enhances the clients’ satisfaction of...

Evaluating the Risk of Data Disclosure Using Noise Estimation for Differential Privacy

2017 IEEE 22nd Pacific Rim International Symposium on Dependable Computing (PRDC), 2017

Differential privacy is a recent notion of data privacy protection whose guarantee holds even when an attacker has arbitrary background knowledge in advance. Consequently, it is viewed as a reliable protection mechanism for sensitive information. Differential privacy introduces Laplace noise to hide the true value in a dataset while preserving statistical properties. However, the large amount of Laplace noise added into a dataset is typically defined by the discursive scale parameter of the Laplace distribution. The privacy parameter ε in differential privacy has a theoretical interpretation, but its implication for the risk of data disclosure (called RoD for short) in practice has not yet been studied. Moreover, choosing an appropriate value for ε is not an easy task since it impacts the level of privacy in a dataset significantly. In this paper, we define and evaluate the RoD in a dataset with either numerical or binary attributes for numerical or counting queries with multiple attri...
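The role of ε in the Laplace noise described above can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism with scale b = sensitivity / ε and a counting query of sensitivity 1; the function names are hypothetical, and this is not the paper's RoD estimation procedure.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale b of the Laplace noise: a smaller epsilon means more noise."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Answer a counting query (assumed sensitivity 1) with Laplace noise."""
    return true_count + laplace_noise(laplace_scale(1.0, epsilon), rng)
```

For example, `private_count(100, 0.1, random.Random())` perturbs the count with noise of scale 10, whereas ε = 1 gives scale 1, which is the privacy and utility trade-off that makes choosing ε difficult.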

Data-Driven Approach for Evaluating Risk of Disclosure and Utility in Differentially Private Data Release

2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), 2017

Differential privacy (DP) is a popular technique for protecting individual privacy and at the same time releasing data for public use. However, very few research efforts are devoted to the balance between the corresponding risk of data disclosure (RoD) and data utility. In this paper, we propose data-driven approaches for differentially private data release to evaluate RoD, and offer algorithms to evaluate whether the differentially private synthetic dataset has sufficient privacy. In addition to privacy, the utility of the synthetic dataset is an important metric for differentially private data release. Thus, we also propose a data-driven algorithm via curve fitting to measure and predict the error of the statistical result incurred by random noise added to the original dataset. Finally, we present an algorithm for choosing an appropriate privacy budget ε that balances privacy and utility.

De-Identification Technique for IoT Wireless Sensor Network Privacy Protection

Jurnal Ilmu Komputer dan Informasi, 2017

As the IoT ecosystem becomes more mature, hardware and software vendors are trying to create new value by connecting all kinds of devices together via IoT. IoT devices are usually equipped with sensors to collect data, and the data collected are transmitted over the air via different kinds of wireless connection. To extract the value of the data collected, the data owner may choose to seek third-party help with data analysis, or even release the data to the public for more insight. In this scenario it is important to protect the released data from privacy leakage. Here we propose that differential privacy, as a de-identification technique, can be a useful approach to add privacy protection to the data released, as well as to prevent the collected data from being intercepted and decoded during over-the-air transmission. A way to increase the accuracy of the count queries performed on the edge cases in a synthetic database is also presented in this research.

Self-Healing Spyware: Detection, and Remediation

IEEE Transactions on Reliability, 2007

Spyware has become a significant threat to most Internet users as it introduces serious privacy disclosure, and potential security breach to the systems. It has not only utilized critical areas of the computer system to survive reboots, but also grown resilient against current anti-spyware tools; they are capable of self-healing themselves against deletion. Because existing anti-spyware tools are stateless in the sense that they do not remember or monitor the spyware programs that were deleted, they fail to remove self-healing spyware from the system completely. This paper proposes a stateful approach that is based on characterizing spyware invasion as a trust information flow problem, and implements STARS (Stateful Threat-Aware Removal System), which is a tool that at run time monitors critical system behaviors, and ensures that removed spyware programs do not reinstall themselves, to enforce information flow policy in the system. If a reinstallation (self-healing) is detected, STARS infers the source of such activities, and discovers additional "suspicious" programs. Experimental results show that STARS is effective in removing self-healing spyware programs that resist removal by existing anti-spyware tools.

Capacity analysis of MediaGrid: a P2P IPTV platform for fiber to the node (FTTN) networks

IEEE Journal on Selected Areas in Communications, 2007

This paper studies the conditions under which P2P sharing can increase the capacity of IPTV services over FTTN networks. For a typical FTTN network, our study shows a) P2P sharing is not beneficial when the total traffic in a local video office is low; b) P2P sharing increases the load on FTTN switches and routers in local video offices; c) P2P sharing is the most beneficial when the network bottleneck is experienced in the southbound segment of a local video office (equivalently a northbound segment of an FTTN switch); and d) sharing among all FTTN serving communities is not needed when network congestion problems are solved by using some other technologies such as program pre-caching or replication. Based on the analytical results, we design and implement the MediaGrid platform for IPTV services which monitors FTTN network conditions and decides when and how to share videos among peers to maximize the service capacity. Simulations and bounds both validate the potential benefits of the MediaGrid IPTV service platform.

Scheduling-Aware Data Prefetching for Data Processing Services in Cloud

2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), 2017

Cloud computing services provide flexible computing and storage resources to process large datasets. In-memory techniques keep frequently used data in faster and more expensive storage media to improve the performance of data processing services. Data prefetching aims to move data to low-latency storage media to meet performance requirements. However, existing mechanisms do not consider how to benefit data processing applications that do not frequently access the same datasets. Another problem is how to reclaim memory resources without affecting other running applications. In this paper, we provide a Scheduling-Aware Data Prefetching (SADP) mechanism for data processing services in a cloud data center. The SADP includes data prefetching and data eviction mechanisms. It first evicts data from memory to release resources for hosting other data blocks, and then it caches the data that will be used in the near future. Finally, real-testbed experiments are perfor...

SFTopk: Secure Functional Top-k Query via Untrusted Data Storage

When is P2P Technology Beneficial for IPTV Services?

This paper studies the conditions under which peer-to-peer (P2P) technology may be beneficial in providing IPTV services over typical network architectures. It has two major contributions. First, we contrast two network models used to study the performance of such a system: a commonly used logical "Internet as a cloud" model and a "physical" model that reflects the characteristics of the underlying network. Specifically, we show that the cloud model overlooks important architectural aspects of the network and may drastically overstate the benefits of P2P technology by a factor of 3 or more. Second, we provide a cost-benefit analysis of P2P video content delivery focusing on the profit trade-offs for different pricing/incentive models rather than purely on capacity maximization. In particular, we find that under high volume of video demand, a P2P built-in incentive model performs better than any other model for both high-definition and standard-definition media, while the usage-based mod...

13th Pacific Rim International Symposium on Dependable Computing (PRDC 2007)

The choice between delta and shift operators for low-precision data representation

2017 20th Conference of Open Innovations Association (FRUCT)

Low-precision data types for embedded applications reduce power consumption and improve the price-performance ratio. Inconsistency between the specified accuracy of a designed filter or controller and an imprecise data type can be overcome using the δ-operator, an alternative to the traditional discrete-time z-operator. Although in many cases it significantly increases accuracy, sometimes it shows no advantage over the shift operator, so the problem of choosing between the delta and shift operators arises. A study of the δ-operator's applicability bounds is therefore needed to solve this problem and enable efficient practical use of the δ-operator. In this paper we introduce the concept of a δ-operator applicability criterion. A discrete system implementation technique with a choice of discrete-time operator is given for low-precision machine arithmetic.
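A small numerical sketch can show why the operator choice matters under coefficient quantization. It assumes the usual definition δ = (q − 1)/Δ, where q is the forward shift operator and Δ the sampling period, and a first-order filter with pole a = exp(−Δ/τ). The fixed-point model (rounding to a given number of fractional bits), the parameter values, and all names are assumptions for illustration; this is not the applicability criterion proposed in the paper.

```python
import math


def quantize(value: float, frac_bits: int) -> float:
    """Round to the nearest multiple of 2**-frac_bits (toy fixed-point model)."""
    step = 2.0 ** -frac_bits
    return round(value / step) * step


def shift_form_pole(a: float, frac_bits: int) -> float:
    """Pole realized when the shift-form coefficient a is stored quantized."""
    return quantize(a, frac_bits)


def delta_form_pole(a: float, delta: float, frac_bits: int) -> float:
    """Pole realized when the delta-form coefficient (a - 1)/delta is quantized.

    delta is assumed to be a power of two, so multiplying by it is exact.
    """
    c = (a - 1.0) / delta
    return 1.0 + delta * quantize(c, frac_bits)


DELTA = 2.0 ** -10                  # sampling period (assumed)
TAU = 1.0                           # filter time constant (assumed)
A_EXACT = math.exp(-DELTA / TAU)    # exact pole, very close to 1
FRAC_BITS = 8                       # low-precision coefficient word (assumed)

shift_error = abs(shift_form_pole(A_EXACT, FRAC_BITS) - A_EXACT)
delta_error = abs(delta_form_pole(A_EXACT, DELTA, FRAC_BITS) - A_EXACT)
```

In this setting the shift-form pole rounds to exactly 1, so the quantized filter degenerates into an integrator, while the delta form keeps the pole close to its designed value. With a slower sampling rate the pole moves away from 1 and the advantage shrinks, which is consistent with the abstract's remark that the δ-operator sometimes shows no benefit.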

A fuzzy condition monitoring system suitable for machine tool spindle motor considering dynamic speed characteristics

2017 International Conference on Fuzzy Theory and Its Applications (iFUZZY)

This paper presents a fuzzy state monitoring system for a machine tool spindle motor, which includes the statistical characteristics of the dynamic speed. We use statistical methods to mine the vibration signal and find the corresponding actual speed value from the spectrum. The statistical properties of each machine state, and the rules of the fuzzy inference system, are then defined from the distribution of significant indexes. The experimental results show that the proposed strategy not only effectively identifies the different states of the spindle motor but also adapts under frequency conversion.

A Signal Fusion-based ANN Algorithm for Fault Diagnosis of Rotating Machinery

2020 International Conference on System Science and Engineering (ICSSE), 2020

This research aims to propose a signal fusion-based artificial neural network algorithm for fault diagnosis of rotating machinery. Firstly, the fused signal becomes the running track, and then it is scaled up to grasp the subtle features. However, after calculating the weights, the characteristic distribution of each operating state is obtained. In this way, the fused signal has more prominent characteristics. The experimental results show that pattern recognition networks and feedforward networks have relatively stable and excellent performance. In different cases, the accuracy is maintained at 94~100 %, and the calculation cost is 1~25 seconds. In future research, more system parameters and optimization of the algorithm are considered. It is expected that the robustness of the algorithm will be improved.

Research paper thumbnail of Message from General Co-Chairs

Research paper thumbnail of A histogram statistical method for the detection of localized faults in deep groove ball bearing

MATEC web of conferences, 2017

This study aims to use the histogram statistical method to establish a deep groove ball bearing f... more This study aims to use the histogram statistical method to establish a deep groove ball bearing fault diagnosis strategy. First, statistical indicators are used to excavate the fault characteristics buried in the vibration signal, and use the histogram to define the characteristic area for fault diagnosis. The results show that the indicators 1, 3, 6 have better statistical differences. Based on this, the accuracy of pattern recognition for all test data is 100 %. Finally, the statistical significance of ball damage was significant, and the results showed high correlation (56~73 %). The correlation between inner race damage model was 49~57 % and healthy model was 52 %. As the inner race damage and health model in the statistical sense, there are some similar, so there is a relatively high correlation. In the future research work, it will be committed to mining more representative indicators to enhance the relevance of abnormal characteristics.

Research paper thumbnail of DPARM: Differentially Private Association Rules Mining

IEEE Access, 2020

Association analysis is critical in data analysis performed to find all co-occurrence relationshi... more Association analysis is critical in data analysis performed to find all co-occurrence relationships (i.e., frequent itemsets or confident association rules) from the transactional dataset. An association rule can improve the ability of users to discover patterns and develop corresponding strategies. The data analysis process can be summarized as a set of queries, where each query is a real-valued function of the dataset. However, unless restrictions and protections are implemented, accessing the dataset to answer the queries may lead to the disclosure of the private information of individuals. In this paper, we propose an original differentially private association rules mining (DPARM) algorithm, which uses multiple support thresholds to reduce the number of candidate itemsets while reflecting the real nature of the items and uses random truncation and uniform partition to reduce the dimensionality of the dataset. Both of these elaborated approaches can aid in reducing the sensitivity of the queries, and this dramatically reduces the scale of the required noise and improves the utility of the mining results. We significantly stabilize the noise scale by adaptively allocating the privacy levels and bound the overall privacy loss. Through a series of experiments, we prove that our DPARM algorithm outperforms the literature in the accuracy of data mining while satisfying differential privacy. To the best of our knowledge, our work is the first DPARM algorithm to adopt multiple support thresholds while using a set of elaborated approaches to bound the overall privacy loss of the mining process. INDEX TERMS Privacy-preserving data analysis, differential privacy, association analysis, association rules mining, frequent itemset mining.

Research paper thumbnail of Using distributed resource management in heterogeneous telecomputing platforms

Proceedings of IEEE International Computer Performance and Dependability Symposium

Programmability, reliability, and scalability are system requirements that are essential to retai... more Programmability, reliability, and scalability are system requirements that are essential to retaining or establishing the high ground in the new world order for telecommunications solutions. For telephony systems to be successful in the next century, they must deliver “fifth-generation software flexibility” that enables interoperability while satisfying these 3 key requirements to give customers the applications they need, when they need

Research paper thumbnail of FedSGDCOVID: Federated SGD COVID-19 Detection under Local Differential Privacy Using Chest X-ray Images and Symptom Information

Sensors

Coronavirus (COVID-19) has created an unprecedented global crisis because of its detrimental effe... more Coronavirus (COVID-19) has created an unprecedented global crisis because of its detrimental effect on the global economy and health. COVID-19 cases have been rapidly increasing, with no sign of stopping. As a result, test kits and accurate detection models are in short supply. Early identification of COVID-19 patients will help decrease the infection rate. Thus, developing an automatic algorithm that enables the early detection of COVID-19 is essential. Moreover, patient data are sensitive, and they must be protected to prevent malicious attackers from revealing information through model updates and reconstruction. In this study, we presented a higher privacy-preserving federated learning system for COVID-19 detection without sharing data among data owners. First, we constructed a federated learning system using chest X-ray images and symptom information. The purpose is to develop a decentralized model across multiple hospitals without sharing data. We found that adding the spatial...

Research paper thumbnail of Stock Price Movement Prediction Using Sentiment Analysis and CandleStick Chart Representation

Sensors

Determining the price movement of stocks is a challenging problem to solve because of factors suc... more Determining the price movement of stocks is a challenging problem to solve because of factors such as industry performance, economic variables, investor sentiment, company news, company performance, and social media sentiment. People can predict the price movement of stocks by applying machine learning algorithms on information contained in historical data, stock candlestick-chart data, and social-media data. However, it is hard to predict stock movement based on a single classifier. In this study, we proposed a multichannel collaborative network by incorporating candlestick-chart and social-media data for stock trend predictions. We first extracted the social media sentiment features using the Natural Language Toolkit and sentiment analysis data from Twitter. We then transformed the stock’s historical time series data into a candlestick chart to elucidate patterns in the stock’s movement. Finally, we integrated the stock’s sentiment features and its candlestick chart to predict the...

Research paper thumbnail of An Optimization-Based Orchestrator for Resource Access and Operation Management in Sliced 5G Core Networks

Sensors, 2021

Network slicing is a promising technology that network operators can deploy the services by slice... more Network slicing is a promising technology that network operators can deploy the services by slices with heterogeneous quality of service (QoS) requirements. However, an orchestrator for network operation with efficient slice resource provisioning algorithms is essential. This work stands on Internet service provider (ISP) to design an orchestrator analyzing the critical influencing factors, namely access control, scheduling, and resource migration, to systematically evolve a sustainable network. The scalability and flexibility of resources are jointly considered. The resource management problem is formulated as a mixed-integer programming (MIP) problem. A solution approach based on Lagrangian relaxation (LR) is proposed for the orchestrator to make decisions to satisfy the high QoS applications. It can investigate the resources required for access control within a cost-efficient resource pool and consider allocating or migrating resources efficiently in each network slice. For high ...

Research paper thumbnail of Optimization-Based Resource Management Algorithms with Considerations of Client Satisfaction and High Availability in Elastic 5G Network Slices

Sensors, 2021

A combined edge and core cloud computing environment is a novel solution in 5G network slices. Th... more A combined edge and core cloud computing environment is a novel solution in 5G network slices. The clients’ high availability requirement is a challenge because it limits the possible admission control in front of the edge cloud. This work proposes an orchestrator with a mathematical programming model in a global viewpoint to solve resource management problems and satisfying the clients’ high availability requirements. The proposed Lagrangian relaxation-based approach is adopted to solve the problems at a near-optimal level for increasing the system revenue. A promising and straightforward resource management approach and several experimental cases are used to evaluate the efficiency and effectiveness. Preliminary results are presented as performance evaluations to verify the proposed approach’s suitability for edge and core cloud computing environments. The proposed orchestrator significantly enables the network slicing services and efficiently enhances the clients’ satisfaction of...

Research paper thumbnail of Evaluating the Risk of Data Disclosure Using Noise Estimation for Differential Privacy

2017 IEEE 22nd Pacific Rim International Symposium on Dependable Computing (PRDC), 2017

Differential privacy is a recent notion of data privacy protection, which does not matter even wh... more Differential privacy is a recent notion of data privacy protection, which does not matter even when an attacker has arbitrary background knowledge in advance. Consequently, it is viewed as a reliable protection mechanism for sensitive information. Differential privacy introduces Laplace noise to hide the true value in a dataset while preserving statistic properties. However, the large amount of Laplace noise added into a dataset is typically defined by the discursive scale parameter of the Laplace distribution. The privacy parameter ε in differential privacy is with theoretical interpretation, but the implication on the risk of data disclosure (called RoD for short) in practice has not yet been studied. Moreover, choosing appropriate value for ε is not an easy task since it impacts the level of privacy in a dataset significantly. In this paper, we define and evaluate the RoD in a dataset with either numerical or binary attributes for numerical or counting queries with multiple attri...

Data-Driven Approach for Evaluating Risk of Disclosure and Utility in Differentially Private Data Release

2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), 2017

Differential privacy (DP) is a popular technique for protecting individual privacy and at the same time releasing data for public use. However, very few research efforts have been devoted to the balance between the corresponding risk of data disclosure (RoD) and data utility. In this paper, we propose data-driven approaches for differentially private data release to evaluate RoD, and offer algorithms to evaluate whether the differentially private synthetic dataset has sufficient privacy. In addition to privacy, the utility of the synthetic dataset is an important metric for differentially private data release. Thus, we also propose a data-driven algorithm via curve fitting to measure and predict the error of the statistical result incurred by random noise added to the original dataset. Finally, we present an algorithm for choosing an appropriate privacy budget ε that balances privacy and utility.

De-Identification Technique for IoT Wireless Sensor Network Privacy Protection

Jurnal Ilmu Komputer dan Informasi, 2017

As the IoT ecosystem becomes more and more mature, hardware and software vendors are trying to create new value by connecting all kinds of devices together via IoT. IoT devices are usually equipped with sensors to collect data, and the data collected are transmitted over the air via different kinds of wireless connections. To extract the value of the data collected, the data owner may choose to seek third-party help with data analysis, or even release the data to the public for more insight. In this scenario it is important to protect the released data from privacy leakage. Here we propose that differential privacy, as a de-identification technique, can be a useful approach to add privacy protection to the data released, as well as to prevent the collected data from being intercepted and decoded during over-the-air transmission. A way to increase the accuracy of the count queries performed on the edge cases in a synthetic database is also presented in this research.

Self-Healing Spyware: Detection, and Remediation

IEEE Transactions on Reliability, 2007

Spyware has become a significant threat to most Internet users because it introduces serious privacy disclosure and potential security breaches to their systems. Spyware programs not only use critical areas of the computer system to survive reboots but have also grown resilient against current anti-spyware tools: they are capable of healing themselves after deletion. Because existing anti-spyware tools are stateless in the sense that they do not remember or monitor the spyware programs that were deleted, they fail to remove self-healing spyware from the system completely. This paper proposes a stateful approach that is based on characterizing spyware invasion as a trust information flow problem, and implements STARS (Stateful Threat-Aware Removal System), a tool that monitors critical system behaviors at run time and ensures that removed spyware programs do not reinstall themselves, in order to enforce information flow policy in the system. If a reinstallation (self-healing) is detected, STARS infers the source of such activities and discovers additional "suspicious" programs. Experimental results show that STARS is effective in removing self-healing spyware programs that resist removal by existing anti-spyware tools.

Capacity analysis of MediaGrid: a P2P IPTV platform for fiber to the node (FTTN) networks

IEEE Journal on Selected Areas in Communications, 2007

This paper studies the conditions under which P2P sharing can increase the capacity of IPTV services over FTTN networks. For a typical FTTN network, our study shows a) P2P sharing is not beneficial when the total traffic in a local video office is low; b) P2P sharing increases the load on FTTN switches and routers in local video offices; c) P2P sharing is the most beneficial when the network bottleneck is experienced in the southbound segment of a local video office (equivalently a northbound segment of an FTTN switch); and d) sharing among all FTTN serving communities is not needed when network congestion problems are solved by using some other technologies such as program pre-caching or replication. Based on the analytical results, we design and implement the MediaGrid platform for IPTV services which monitors FTTN network conditions and decides when and how to share videos among peers to maximize the service capacity. Simulations and bounds both validate the potential benefits of the MediaGrid IPTV service platform.

Scheduling-Aware Data Prefetching for Data Processing Services in Cloud

2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA), 2017

Cloud computing services provide flexible computing and storage resources to process large datasets. In-memory techniques keep frequently used data in faster and more expensive storage media to improve the performance of data processing services. Data prefetching aims to move data to low-latency storage media to meet performance requirements. However, existing mechanisms do not consider how to benefit data processing applications that do not frequently access the same datasets. Another problem is how to reclaim memory resources without affecting other running applications. In this paper, we provide a Scheduling-Aware Data Prefetching (SADP) mechanism for data processing services in a cloud data center. SADP includes data prefetching and data eviction mechanisms. It first evicts data from memory to release resources for hosting other data blocks, and then it caches the data that will be used in the near future. Finally, real-testbed experiments are perfor...

SFTopk: Secure Functional Top-k Query via Untrusted Data Storage

When is P2P Technology Beneficial for IPTV Services?

This paper studies the conditions under which peer-to-peer (P2P) technology may be beneficial in providing IPTV services over typical network architectures. It has two major contributions. First, we contrast two network models used to study the performance of such a system: a commonly used logical "Internet as a cloud" model and a "physical" model that reflects the characteristics of the underlying network. Specifically, we show that the cloud model overlooks important architectural aspects of the network and may drastically overstate the benefits of P2P technology by a factor of 3 or more. Second, we provide a cost-benefit analysis of P2P video content delivery focusing on the profit trade-offs for different pricing/incentive models rather than purely on capacity maximization. In particular, we find that under high volume of video demand, a P2P built-in incentive model performs better than any other model for both high-definition and standard-definition media, while the usage-based mod...

13th Pacific Rim International Symposium on Dependable Computing (PRDC 2007)

The choice between delta and shift operators for low-precision data representation

2017 20th Conference of Open Innovations Association (FRUCT)

Low-precision data types for embedded applications reduce the power consumption and enhance the price-performance ratio. Inconsistency between the specified accuracy of a designed filter or controller and an imprecise data type can be overcome using the δ-operator, an alternative to the traditional discrete-time z-operator. Though in many cases it significantly increases accuracy, sometimes it shows no advantage over the shift operator, so the problem of choosing between the delta and shift operators arises. Therefore, a study of the δ-operator's applicability bounds is needed to solve this problem and enable efficient practical use of the δ-operator. In this paper we introduce the concept of a δ-operator applicability criterion. A discrete system implementation technique with a choice of discrete-time operator is given for low-precision machine arithmetic.
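For orientation, the delta operator is commonly defined in the literature as a scaled forward difference of the shift operator. The block below states that textbook definition and the resulting state-space conversion; the symbols q, Δ, A_q, B_q, A_δ, and B_δ are notation assumed here and are not taken from the paper, which may use a different parameterization.

```latex
% q: forward shift operator, q x_k = x_{k+1};  \Delta: sampling period (assumed symbol)
\delta = \frac{q - 1}{\Delta}, \qquad \delta x_k = \frac{x_{k+1} - x_k}{\Delta}

% Shift-form state equation and its delta-form equivalent
x_{k+1} = A_q x_k + B_q u_k
\quad\Longleftrightarrow\quad
\delta x_k = A_\delta x_k + B_\delta u_k,
\qquad A_\delta = \frac{A_q - I}{\Delta}, \quad B_\delta = \frac{B_q}{\Delta}
```

Under fast sampling the entries of A_q tend to cluster near those of the identity matrix, so a short word length can lose the information that distinguishes them, whereas A_δ stores the rescaled difference directly. This is the usual argument for why the delta form can help in low-precision arithmetic, although, as the abstract notes, it does not always do so.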

A fuzzy condition monitoring system suitable for machine tool spindle motor considering dynamic speed characteristics

2017 International Conference on Fuzzy Theory and Its Applications (iFUZZY)

This paper presents a fuzzy state monitoring system for a machine tool spindle motor, which includes the statistical characteristics of the dynamic speed. We use statistical methods to mine the vibration signal and find the corresponding actual speed value from the spectrum. The statistical properties of each machine state are then defined from the distribution of significant indexes, as are the rules of the fuzzy inference system. The experimental results show that the proposed strategy not only effectively identifies the different states of the spindle motor, but also adapts under frequency conversion.

A Signal Fusion-based ANN Algorithm for Fault Diagnosis of Rotating Machinery

2020 International Conference on System Science and Engineering (ICSSE), 2020

This research proposes a signal fusion-based artificial neural network algorithm for fault diagnosis of rotating machinery. First, the fused signal forms the running track, which is then scaled up to capture subtle features. Then, after the weights are calculated, the characteristic distribution of each operating state is obtained. In this way, the fused signal has more prominent characteristics. The experimental results show that pattern recognition networks and feedforward networks have relatively stable and excellent performance. In different cases, the accuracy is maintained at 94 to 100%, and the calculation cost is 1 to 25 seconds. In future research, more system parameters and optimization of the algorithm will be considered. It is expected that the robustness of the algorithm will be improved.