Zhiyuan Luo | Royal Holloway, University of London

Papers by Zhiyuan Luo

Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik

Boost AI Power: Data Augmentation Strategies with unlabelled Data and Conformal Prediction, a Case in Alternative Herbal Medicine Discrimination with Electronic Nose

arXiv (Cornell University), Feb 5, 2021

Recurrent Auto-Encoder Model for Multidimensional Time Series Representation

Online conformal prediction for classifying different types of herbal medicines with electronic nose

IET Doctoral Forum on Biomedical Engineering, Healthcare, Robotics and Artificial Intelligence 2018 (BRAIN 2018), 2018

With the recognition of herbal medicines, reliable and convenient methods for discriminating herbal medicines are needed. This paper introduces a novel method that uses an electronic nose with online conformal prediction to classify 12 different types of herbal medicines with similar appearance. The performance of different online conformal predictors, based on different training set updating strategies and varied initial training set sizes, is evaluated to investigate the effectiveness of online conformal prediction. The results show that online conformal prediction manages to classify these medicines and achieves improved accuracy and robustness with more observations, provided the reliability requirement for training set updating is strict enough. Furthermore, the validity of online conformal prediction is confirmed: as observations accumulate, the error rate of prediction gradually converges below the significance level set by users, which gives users flexible control over reliability and information about potential risk. Finally, the efficiency of online conformal prediction is discussed; users must make a trade-off between reliability and efficiency.

Speech neuromuscular decoding based on spectrogram images using conformal predictors with Bi-LSTM

Neurocomputing, 2021

The relationships between muscle movements and neural signals make it possible to decode silent speech based on neuromuscular activities. The decoding can be formulated as a supervised classification task. The electromyography (EMG) signal captured from surface articulatory muscles contains useful information that can assist in decoding speech. Spectrograms obtained from EMG hold a wealth of information relating to the decoding, but have not yet been fully explored. In addition, the decoding results are often uncertain, so it is important to quantify the prediction confidence. This paper aims to improve the decoding performance by representing time series signals as spectrograms and utilising Inductive Conformal Prediction (ICP) to provide predictions with confidence. All EMG data are recorded on six dedicated facial muscles while participants recite the displayed words subvocally. Three pre-trained convolutional models, MobileNet-V1, ResNet18 and Xception, are used to extract features from spectrograms for classification. Both bidirectional Long Short-Term Memory (Bi-LSTM) and Gated Recurrent Unit (GRU) classifiers are used for prediction. Furthermore, an ICP decoder based on Bi-LSTM is built to provide guaranteed predictions for each example at a specified confidence level. The proposed method, which combines feature extraction based on Xception with classification using Bi-LSTM, gives an accuracy of 0.87, higher than the other methods. ICP outputs confidence measurements for each example that can help users to evaluate the reliability of new predictions. Experimental results demonstrate the practical usefulness of the approach in decoding articulatory neuromuscular activity and the advantages of applying ICP.
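To make the ICP step in the abstract above concrete, here is a minimal sketch of how an inductive conformal predictor can turn class probabilities into a prediction set at a chosen significance level. It is an illustration under stated assumptions, not the paper's implementation: the probability vectors stand in for the outputs of any trained classifier (such as a Bi-LSTM), the nonconformity score "one minus the probability of the candidate label" is one common choice that the abstract does not confirm, and all function names are hypothetical.

```python
from typing import List, Sequence


def nonconformity(probs: Sequence[float], label: int) -> float:
    """Assumed score: one minus the probability assigned to `label`."""
    return 1.0 - probs[label]


def icp_p_values(cal_scores: Sequence[float],
                 test_probs: Sequence[float]) -> List[float]:
    """Return one p-value per candidate label for a single test example.

    `cal_scores` holds the nonconformity scores of the calibration
    examples, each computed with that example's true label.
    """
    n = len(cal_scores)
    p_values = []
    for label in range(len(test_probs)):
        alpha = nonconformity(test_probs, label)
        # Calibration examples at least as nonconforming as the candidate.
        at_least = sum(1 for s in cal_scores if s >= alpha)
        p_values.append((at_least + 1) / (n + 1))
    return p_values


def prediction_set(p_values: Sequence[float],
                   significance: float) -> List[int]:
    """Keep every label whose p-value exceeds the significance level."""
    return [label for label, p in enumerate(p_values) if p > significance]


# Hypothetical calibration data: (class probabilities, true label).
calibration = [([0.9, 0.1], 0), ([0.2, 0.8], 1),
               ([0.6, 0.4], 0), ([0.3, 0.7], 1)]
cal_scores = [nonconformity(p, y) for p, y in calibration]
p_vals = icp_p_values(cal_scores, [0.85, 0.15])
labels_kept = prediction_set(p_vals, significance=0.25)
```

A lower significance level yields larger, more cautious prediction sets; a higher one yields smaller sets at the cost of more errors, which is the trade-off between reliability and efficiency.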

An all-solid-state Ion-selective Electrode for Dopamine Determination

IET Doctoral Forum on Biomedical Engineering, Healthcare, Robotics and Artificial Intelligence 2018 (BRAIN 2018), 2018

Gene Selection for Cancer Classification using Wilcoxon Rank Sum Test and Support Vector Machine

2006 International Conference on Computational Intelligence and Security, 2006

Gene selection is an important problem in microarray data processing. A new gene selection method based on the Wilcoxon rank sum test and support vector machine (SVM) is proposed in this paper. First, the Wilcoxon rank sum test is used to select a subset. Then each selected gene is trained and tested separately using an SVM classifier with a linear kernel, and genes with
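The first step described above, filtering genes with the Wilcoxon rank sum test, can be sketched as follows. This is a hedged illustration only: it ranks genes by the absolute z-score of the rank sum under the normal approximation (without tie correction), and the function names, the two-class 0/1 labelling and the `top_k` cut-off are assumptions rather than details from the paper. The later SVM step is not shown because the abstract is truncated.

```python
import math
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """z-score of group_a's rank sum (normal approximation, no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / std


def select_genes(expr: Sequence[Sequence[float]],
                 labels: Sequence[int], top_k: int) -> List[int]:
    """Return indices of the top_k genes by |z|.

    expr[g][s] is the expression of gene g in sample s; labels[s] is 0 or 1.
    """
    scored = []
    for g, row in enumerate(expr):
        a = [x for x, y in zip(row, labels) if y == 0]
        b = [x for x, y in zip(row, labels) if y == 1]
        scored.append((abs(rank_sum_z(a, b)), g))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [g for _, g in scored[:top_k]]
```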

A Multi-Agent Approach for Distributed Broadband Network Management

Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis

Communications in computer and information science, 2018

Selective Mixture of Gaussians Clustering for Location Fingerprinting

One of the challenges of deploying location fingerprinting in real offices is the training database handling process, which does not scale well as the tracking space to be covered grows. However, little attention has been paid to this issue; the majority of previous work focused instead on improving the tracking accuracy. In this paper, we propose a novel idea to enhance fingerprinting's processing speed and positioning accuracy with mixture of Gaussians clustering. We identify the key difference between fingerprinting and other unsupervised problems: we know the label (the Cartesian coordinate) of the signal data in advance. This key information was largely ignored in previous work, where the fingerprinting clustering was based solely on the signal data. By exploiting this information, we tackle indoor signal multipath and shadowing with two-level clustering: signal data clustering and Cartesian coordinate clustering. We tested our approach in a real office environment with harsh indoor conditions, and concluded that our clustering scheme not only reduces the fingerprinting processing time but also improves the positioning accuracy.

Coreset-Based Data Compression for Logistic Regression

Springer eBooks, 2021

The coreset paradigm is a fundamental tool for analysing complex and large datasets. Although coresets are used as an acceleration technique for many learning problems, the algorithms used for constructing them may become computationally exhaustive in some settings. We show that this can easily happen when computing coresets for learning a logistic regression classifier. We overcome this issue with two methods: Accelerating Clustering via Sampling (ACvS) and Regressed Data Summarisation Framework (RDSF). The former is an acceleration procedure based on a simple theoretical observation on using Uniform Random Sampling for clustering problems; the latter is a coreset-based data-summarising framework that builds on ACvS and extends it by using a regression algorithm as part of the construction. We tested both procedures on five public datasets, and observed that computing the coreset and learning from it is 11 times faster than learning directly from the full input data in the worst case, and 34 times faster in the best case. We further observed that the best regression algorithm for creating summaries of data using the RDSF framework is Ordinary Least Squares (OLS).
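As background for the idea of learning from a small weighted summary instead of the full data, the sketch below builds the simplest such summary: a uniform random sample in which every kept point carries the weight n/m, so that the weighted logistic loss estimates the loss on all n points. This is neither ACvS nor RDSF, whose details the abstract does not give; the function names and the label convention y in {-1, +1} are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], int]             # (features, label in {-1, +1})
Weighted = Tuple[Sequence[float], int, float]   # (features, label, weight)


def logistic_loss(w: Sequence[float], x: Sequence[float], y: int) -> float:
    """Logistic loss log(1 + exp(-y * <w, x>)) for one labelled point."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-margin))


def uniform_summary(data: List[Point], m: int, seed: int = 0) -> List[Weighted]:
    """Sample m points without replacement; each gets weight n / m."""
    rng = random.Random(seed)
    weight = len(data) / m
    return [(x, y, weight) for x, y in rng.sample(data, m)]


def weighted_loss(w: Sequence[float], summary: List[Weighted]) -> float:
    """Weighted logistic loss over the summary; estimates the full-data loss."""
    return sum(wt * logistic_loss(w, x, y) for x, y, wt in summary)
```

A classifier trained to minimise `weighted_loss` touches only m points per pass, which is where the speed-up of summary-based learning comes from; coreset constructions aim to pick points and weights more carefully than uniform sampling does.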

An optimized multi-classifiers ensemble learning for identification of ginsengs based on electronic nose

Sensors and Actuators A-physical, Oct 1, 2017

An optimized two-layer Adaboost.M2 ensemble model is constructed for the identification of Chinese herbal medicine based on an electronic nose; the framework integrates several classical classifiers in probabilistic form and utilizes the diversity among them; efficient algebraic fusion rules are employed for combining decisions from the classifiers; the method contributes a flexible tool for making valid probabilistic and precise predictions in electronic nose applications, and a feasible solution for online classification.

A Multi-Scale Feature Selection Framework for WiFi Access Points Line-of-sight Identification

Malware in Motion

Malicious software (malware) is designed to circumvent the security policy of the host device. Smartphones represent an attractive target to malware authors as they are often a rich source of sensitive information. Sensors (such as cameras or microphones), which allow observation of the victims in real time, are attractive targets for attackers. To counteract this threat, privileges on mobile devices have been tightened with respect to sensors: app developers are required to declare which sensors they need access to, and users need to give consent. By conducting a survey of publicly accessible malware analysis platforms, we demonstrate that there are still implementations of sensors which are trivial to detect without exposing the malicious intent of a program. We also show that, despite changes to the permission model, it is still possible to fingerprint an analysis environment, even when the analysis is carried out using a physical device, with the novel use of Android's Activity Recognition API.

Co-location Epidemic Tracking on London Public Transports Using Low Power Mobile Magnetometer

arXiv (Cornell University), Apr 1, 2017

Public transport provides an ideal means for the transmission of contagious diseases. This paper introduces a novel idea to detect the co-location of people in such an environment using just the ubiquitous geomagnetic field sensor on the smartphone. Essentially, given that all passengers must share the same journey between at least two consecutive stations, we have a long window in which to match the user trajectory. Our idea was assessed over a painstaking survey of over 150 kilometres of travelling distance, covering different parts of London, using overground trains, underground tubes and buses.

Location Tracking Using Smartphone Accelerometer and Magnetometer Traces

Proceedings of the 14th International Conference on Availability, Reliability and Security

We demonstrate a breach in smartphone location privacy through the accelerometer and magnetometer's footprints. The merits or otherwise of explicitly permissioned location sensors are not the point of this paper. Instead, our proposition is that other non-location-sensitive sensors can track users accurately when the users are in motion, as in travelling on public transport, such as trains, buses, and taxis. Through field trials, we provide evidence that high accuracy location tracking can be achieved even via non-location-sensitive sensors for which no access authorisation is required from users on a smartphone.

Preface for the Proceedings of Machine Learning Research Volume 152

10th Symposium on Conformal and Probabilistic Prediction with Applications, Sep 10, 2021

A Glucose Sensor Based on Glucose Oxidase Immobilized by Electrospinning Nanofibrous Polymer Membranes Modified with Carbon Nanotubes

Sensors & Transducers, 2013

A glucose biosensor based on glucose oxidase immobilized by electrospinning nanofibrous membranes has been developed. Nanofibrous membranes were electrospun from a solution of poly(acrylonitrile-co-acrylic acid) containing a carbon nanotube suspension and directly deposited on Pt electrodes for immobilizing glucose oxidase. The morphologies and structure of the nanofibrous membranes with or without carbon nanotubes were characterized by scanning electron microscopy. The fabrication parameters of the nanofibers, such as the thickness of the nanofibrous membranes and the mass ratio of carbon nanotubes, were optimized. The biosensor showed a response relationship over a concentration range of 0.1–10 mM, and the response time was 60 s. The sensitivity of the biosensors modified with carbon nanotubes was twice that of the unmodified ones. The pH effect, interference and lifetime of the biosensors were discussed.

Special Issue on Conformal and Probabilistic Prediction with Applications: Preface

Pattern Recognition, 2022

Twenty seconds of visual behaviour on social media gives insight into personality

Scientific Reports, 2022

Eye tracking allows the researcher to capture individual differences in the expression of visual exploration behaviour, which in certain contexts has been found to reflect aspects of the user’s preferences and personality. In a novel approach, we recorded the eye movements of 180 participants whilst they browsed their Facebook News Feed and employed a machine learning approach to predict each of the self-reported Big Five personality traits from this viewing behaviour. We identify that specific visual behaviours are informative of an individual’s personality traits, and can be used to psychologically profile social networking site users significantly better than chance after collecting only 20 seconds of viewing behaviour. We discuss potential applications for user engagement during human–computer interactions, and highlight potential privacy concerns.
