Irene Y.H. Gu | Chalmers University of Technology

Papers by Irene Y.H. Gu

This technical report describes research on automatically recognizing Chinese traffic signs from an implicit public resource, i.e., street views. First, we give a comprehensive survey of Chinese traffic signs and introduce our approaches for collecting street view images that can be used for experimental purposes. Then, we introduce our coarse-to-fine recognition framework consisting of sign detection, sign salient region segmentation, feature extraction (including simple text recognition from signs), and subsequent sign classification. We also propose to incrementally build a sign dataset in a semi-automatic way, aiming to reduce manual effort. Experiments on collected datasets for both sign detection and classification have validated that the proposed framework is feasible and capable of recognizing multiple categories of Chinese traffic signs in a single input image.

Recently, discriminative tracking methods have treated tracking as a binary classification problem. However, such binary classification may not fully handle outliers, which may cause drifting. In this paper, we argue that tracking may be regarded as a one-class problem, which avoids gathering limited negative samples for background description. Inspired by the fact that the positive feature space generated by a One-Class SVM is bounded by a closed sphere, we propose a novel tracking method utilizing One-Class SVMs that adopt HOG and 2bit-BP as features, called One-Class SVM Tracker (OCST). An efficient initialization and online updating scheme is also proposed. Extensive experimental results show that OCST outperforms some state-of-the-art discriminative tracking methods in providing accurate tracking and alleviating serious drifting.

Energies, 2022

This paper proposes a machine-learning-based framework for voltage quality analytics, where the space phasor model (SPM) of the three-phase voltages before, during, and after the event is applied as input data. The framework proceeds in three main steps: (a) event extraction, (b) event characterization, and (c) additional information extraction. During the first step, it utilizes a Gaussian-based anomaly detection (GAD) technique to extract the event data from the recording. Principal component analysis (PCA) is adopted during the second step, where it is shown that the principal components correspond to the semi-minor and semi-major axes of the ellipse formed by the SPM. During the third step, these characteristics are interpreted to extract additional information about the underlying cause of the event. The performance of the framework was verified through experiments conducted on datasets containing synthetic and measured power quality events. The results show that the co...
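As a rough illustration of the PCA step mentioned in the abstract above, the sketch below builds a space phasor from three phase voltages and reads ellipse semi-axes off the principal components. It is not the authors' code: the function names, the 50 Hz single-phase sag, and the relation semi-axis = sqrt(2 · eigenvalue) (which holds only when samples cover whole fundamental cycles uniformly) are assumptions made for this example.

```python
import numpy as np

def space_phasor(va, vb, vc):
    """Combine three phase voltages into one complex space phasor."""
    a = np.exp(2j * np.pi / 3)  # 120-degree rotation operator
    return (2.0 / 3.0) * (va + a * vb + a**2 * vc)

def ellipse_semi_axes(phasor):
    """Estimate (semi_minor, semi_major) via PCA of the phasor trajectory.

    Assumption: samples are spread uniformly over whole fundamental cycles,
    so the variance along each principal axis equals (semi-axis)^2 / 2.
    """
    points = np.vstack([phasor.real, phasor.imag])
    cov = np.cov(points, bias=True)            # 2x2 covariance matrix
    eigenvalues = np.linalg.eigvalsh(cov)      # ascending order
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    semi_minor, semi_major = np.sqrt(2.0 * eigenvalues)
    return semi_minor, semi_major

# Hypothetical event: phase a sags to 0.5 pu, phases b and c stay at 1 pu.
t = np.linspace(0.0, 0.02, 200, endpoint=False)  # one 50 Hz cycle
w = 2 * np.pi * 50
va = 0.5 * np.cos(w * t)
vb = np.cos(w * t - 2 * np.pi / 3)
vc = np.cos(w * t + 2 * np.pi / 3)

semi_minor, semi_major = ellipse_semi_axes(space_phasor(va, vb, vc))
print(semi_minor, semi_major)
```

For a balanced three-phase voltage the two semi-axes coincide (the trajectory is a circle); the gap between them grows with the unbalance introduced by the event.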

This paper describes a new and highly efficient measurement method (algorithm) that determines how flicker propagates throughout the network and also traces the dominant flicker source. The fundamental principle of the method is to use the

In the context of diffusion tensor imaging (DTI), the utility of making repeated measurements in each diffusion sensitizing direction has been the subject of numerous studies. One can estimate the true signal value using either the raw complex-valued data or the real-valued magnitude signal. While conventional methods focus on the former strategy, this paper proposes a new framework for acquiring/processing repeated measurements based on the latter strategy. The aim is to enhance the DTI processing pipeline by adding a diffusion signal estimator (DSE). This permits us to exploit the knowledge of the noise distribution to estimate the true signal value in each direction. An extensive study of the proposed framework, including theoretical analysis, experiments with synthetic data, performance evaluation and comparisons is presented. Our results show that the precision of estimated diffusion parameters is dependent on the number of available samples and the manner in which the DSE acco...

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2016

Central nervous system dysfunction in infants may be manifested through inconsistent, rigid and abnormal limb movements. Detection of limb movement anomalies associated with such neurological dysfunctions in infants is the first step towards early treatment for improving infant development. This paper addresses the issue of detecting and quantifying limb movement anomalies in infants through non-invasive 3D image analysis methods using videos from multiple camera views. We propose a novel scheme for tracking 3D time trajectories of markers on infants' limbs by video analysis techniques. The proposed scheme employs videos captured from three camera views. This enables us to detect a set of enhanced 3D markers through cross-view matching and to effectively handle marker self-occlusions by other body parts. We track a set of 3D trajectories of limb movements by a set of particle filters in parallel, enabling more robust 3D tracking of markers, and use the 3D model errors for quantifying abrupt limb movements. The proposed work makes a significant advance over the previous work in [1] by employing tracking in 3D space, and hence overcomes several main barriers that hinder real applications of single camera-based techniques. To the best of our knowledge, applying such a multi-view video analysis approach for assessing neurological dysfunctions of infants through 3D time trajectories of markers on limbs is novel, and could lead to computer-aided tools for diagnosis of dysfunctions where early treatment may improve infant development. Experiments were conducted on multi-view neonate videos recorded in a clinical setting, and the results have provided further support for the proposed method.

2018 25th IEEE International Conference on Image Processing (ICIP), 2018

2015 IEEE International Conference on Image Processing (ICIP), 2015

This paper addresses issues in fall detection from videos. The focus is on the analysis of human shapes, which deform drastically in camera views while a person falls onto the ground. A novel approach is proposed that performs fall detection from an arbitrary view angle, via shape analysis on a unified Riemannian manifold for different camera views. The main novelties of this paper include: (a) representing dynamic shapes as points moving on a unit n-sphere, one of the simplest Riemannian manifolds; (b) characterizing the deformation of shapes by computing velocity statistics of their corresponding manifold points, based on geodesic distances on the manifold. Experiments have been conducted on two publicly available video datasets for fall detection. Tests, evaluations and comparisons with 6 existing methods show the effectiveness of our proposed method.
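The two novelties listed in the abstract above (points on a unit n-sphere, velocity statistics from geodesic distances) can be sketched roughly as follows. This is an illustration only: the abstract does not say how shapes are mapped onto the sphere, so the L2 normalisation of a per-frame feature vector, the function names, and the choice of mean and variance as the statistics are all assumptions.

```python
import numpy as np

def to_sphere(features):
    """Project each per-frame feature vector onto the unit n-sphere.

    Assumption: plain L2 normalisation stands in for the shape mapping.
    """
    features = np.asarray(features, dtype=float)
    return features / np.linalg.norm(features, axis=1, keepdims=True)

def geodesic_velocities(points, dt=1.0):
    """Great-circle distance between consecutive sphere points, per unit time."""
    cosines = np.sum(points[:-1] * points[1:], axis=1)
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0)) / dt

def velocity_statistics(features, dt=1.0):
    """Return (mean, variance) of geodesic velocities along a sequence."""
    velocities = geodesic_velocities(to_sphere(features), dt)
    return velocities.mean(), velocities.var()

# Hypothetical 3-frame sequence of 2-D shape descriptors.
frames = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]]
mean_v, var_v = velocity_statistics(frames)
print(mean_v, var_v)
```

On the unit sphere the geodesic distance is simply the angle between two points, which is why a single `arccos` of the inner product suffices here.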

2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018

Automatic extraction of features from MRI brain scans and diagnosis of Alzheimer's Disease (AD) remains a challenging task. In this paper, we propose an efficient and simple three-dimensional convolutional network (3D ConvNet) architecture that is able to achieve high performance for detection of AD on a relatively large dataset. The proposed 3D ConvNet consists of five convolutional layers for feature extraction, followed by three fully-connected layers for AD/NC classification. The main contributions of the paper include: (a) proposing a novel and effective 3D ConvNet architecture; (b) studying the impact of hyper-parameter selection on the performance of AD classification; (c) studying the impact of pre-processing; (d) studying the impact of data partitioning; (e) studying the impact of dataset size. Experiments conducted on an ADNI dataset containing 340 subjects and 1198 MRI brain scans have resulted in good performance (test accuracy of 98.74%, 100% AD detection rate and 2.4% false alarm). Comparisons with 7 existing state-of-the-art methods have provided strong support for the robustness of the proposed method.

Neurocomputing, 2016

This paper addresses issues in fall detection in videos. We propose a novel method to detect human falls from arbitrary view angles, through analyzing dynamic shape and motion of image regions of human bodies on Riemannian manifolds. The proposed method exploits time-dependent dynamic features on smooth manifolds, based on the observation that human falls often involve drastic shape changes and abrupt motions compared with other activities. The main novelties of this paper include: (a) representing videos of human activities by dynamic shape points and motion points moving on two separate unit n-spheres, i.e., two simple Riemannian manifolds; (b) characterizing the dynamic shape and motion of each video activity by computing the velocity statistics on the two manifolds, based on geodesic distances; (c) combining the statistical features of dynamic shape and motion that are learned from their corresponding manifolds via mutual information. Experiments were conducted on three video datasets, containing 400 videos of 5 activities, 100 videos of 4 activities, and 768 videos of 3 activities, respectively, where videos were captured from cameras at different view angles. Our test results have shown a high detection rate (average 99.38%) and a low false alarm rate (average 1.84%). Comparisons with eight state-of-the-art methods have provided further support for the proposed method.

Computer Vision and Image Understanding, 2017

Highlights
• A dedicated ADL classification system for assisted living and healthcare is proposed.
• Several essential activities are handled, including ADL like eating and drinking, and anomalies like falls.
• Part-based features on different Riemannian manifolds are used in a layered coarse-to-fine classification structure.
• High classification accuracies and low false alarm rates are achieved by geodesic-induced kernel machines.

2015 17th International Conference on E-health Networking, Application & Services (HealthCom), 2015

This paper addresses issues in fall detection from videos. Since it is a broadly accepted intuition that a falling person usually undergoes large physical movement and displacement in a short time interval, the study focuses on measuring the intensity and temporal variation of pose change and body motion. The main novelties of this paper include: (a) characterizing pose/motion dynamics based on centroid velocity, head-to-centroid distance, histogram of oriented gradients and optical flow; (b) extracting compact features based on the mean and variance of pose/motion dynamics; (c) detecting humans by combining depth information and background mixture models. Experiments have been conducted on an RGB-D video dataset for fall detection. Tests and evaluations show the effectiveness of the proposed method.

2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), 2015

The icosahedral gradient encoding scheme (GES) is widely used in the diffusion MRI community due to its uniformly distributed orientations and rotationally invariant condition number. The major drawback of this scheme is that it is not available for an arbitrary number of measurements. In this paper, (i) we propose an algorithm to find the icosahedral scheme for any number of measurements. Performance of the obtained GES is evaluated and compared with that of Jones and traditional icosahedral schemes in terms of condition number, standard deviation of the estimated fractional anisotropy and distribution of diffusion sensitizing directions; and (ii) we introduce the minimum eigenvalue of the information matrix as a new optimality metric to replace the condition number. Unlike the condition number, it is proportional to the number of measurements and thus in agreement with the intuition that more measurements lead to more robust tensor estimation. Furthermore, it may independently be maximized to design GESs for different diffusion imaging techniques.
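To make the contrast between the two metrics in the abstract above concrete, the sketch below computes both for a small set of gradient directions and for the same set measured twice. It is an illustration under stated assumptions, not the authors' algorithm: the six-column design matrix is the common linearised DTI form, the information matrix is taken to be AᵀA (noise variance and b-value omitted), and the function names and the six icosahedron-vertex directions are chosen only for this example.

```python
import numpy as np

def design_matrix(directions):
    """Rows [gx^2, gy^2, gz^2, 2gxgy, 2gxgz, 2gygz] for unit gradient directions."""
    g = np.asarray(directions, dtype=float)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    gx, gy, gz = g.T
    return np.column_stack(
        [gx**2, gy**2, gz**2, 2 * gx * gy, 2 * gx * gz, 2 * gy * gz]
    )

def ges_metrics(directions):
    """Return (condition number of A, minimum eigenvalue of A^T A)."""
    a = design_matrix(directions)
    information = a.T @ a  # assumed form of the information matrix
    min_eigenvalue = np.linalg.eigvalsh(information)[0]
    return np.linalg.cond(a), min_eigenvalue

# Six directions through icosahedron vertices (one per antipodal pair).
phi = (1 + np.sqrt(5)) / 2
six_dirs = np.array([
    [0, 1, phi], [0, -1, phi],
    [1, phi, 0], [-1, phi, 0],
    [phi, 0, 1], [phi, 0, -1],
])

cond_6, lam_6 = ges_metrics(six_dirs)
cond_12, lam_12 = ges_metrics(np.vstack([six_dirs, six_dirs]))  # scheme repeated
print(cond_6, cond_12)  # unchanged by repetition
print(lam_6, lam_12)    # doubles with the number of measurements
```

Repeating the scheme scales AᵀA by two, so the condition number cannot register the extra measurements while the minimum eigenvalue grows with them, which matches the motivation given in the abstract.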

Pattern Recognition, 2000

This paper presents a hybrid algorithm for coarse-to-fine matching of affine-invariant object features and B-spline object curves, and simultaneous estimation of transformation parameters. For coarse matching, two dissimilar measures are exploited by using the significant corners of object boundaries to remove candidate objects with large dissimilarity to a target object. For fine matching, a robust point interpolation approach and a simple gradient-based algorithm are applied to B-spline object curves under MMSE criterion. The combination of coarse and fine matching steps reduces the computational cost without degrading the matching accuracy. The proposed algorithm is evaluated using affine transformed objects.

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

The establishment of a robust target appearance model over time is an overriding concern in visual tracking. In this paper, we propose an inverse nonnegative matrix factorization (NMF) method for robust appearance modeling. Rather than using a linear combination of nonnegative basis vectors for each target image patch as in conventional NMF, the proposed method takes the reverse approach to conventional NMF trackers. It utilizes both the foreground and background information, and imposes a local coordinate constraint, where the basis matrix is a sparse matrix formed from the linear combination of candidates with corresponding nonnegative coefficient vectors. Inverse NMF is used as a feature encoder, where the resulting coefficient vectors are fed into an SVM classifier for separating the target from the background. The proposed method is tested on several videos and compared with seven state-of-the-art methods. Our results have provided further support for the effectiveness and robustness of the proposed method.

Energies, 2021

This paper proposes a deep-learning-based method for frequency-dependent grid impedance estimation. Through measurement of voltages and currents at a specific system bus, the estimate of the grid impedance was obtained by first extracting the sequences of the time-dependent features for the measured data using a long short-term memory autoencoder (LSTM-AE) followed by a random forest (RF) regression method to find the nonlinear map function between extracted features and the corresponding grid impedance for a wide range of frequencies. The method was trained via simulation by using time-series measurements (i.e., voltage and current) for different system parameters and verified through several case studies. The obtained results show that: (1) extracting the time-dependent features of the voltage/current data improves the performance of the RF regression method; (2) the RF regression method is robust and allows grid impedance estimation within 1.5 grid cycles; (3) the proposed method...

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017

This paper addresses issues in human fall detection from videos. Unlike conventional machine learning approaches that use handcrafted features, we extract features from Convolutional Neural Networks (CNNs) for human fall detection. Similar to many existing works using two-stream inputs, we use a spatial CNN stream with raw image difference and a temporal CNN stream with optical flow as the inputs of the CNN. Different from conventional two-stream action recognition work, we exploit sparse representation with residual-based pooling on the CNN-extracted features to obtain more discriminative feature codes. To characterize the sequential information in video activity, we use the code vector from a long-range dynamic feature representation, obtained by concatenating segment-level codes, as the input to an SVM classifier. Experiments have been conducted on two public video databases for fall detection. Comparisons with six existing methods show the effectiveness of the proposed method.

2019 IEEE International Conference on Image Processing (ICIP), 2019

This paper addresses the issues of Alzheimer’s disease (AD) characterization and detection from Magnetic Resonance Images (MRIs). Many existing AD detection methods use single-scale feature learning from brain scans. In this paper, we propose a multiscale deep learning architecture for learning AD features. The main contributions of the paper include: (a) a novel 3D multiscale CNN architecture for the dedicated task of AD detection; (b) a feature fusion and enhancement strategy for multiscale features; (c) an empirical study on the impact of several settings, including two dataset partitioning approaches, and the use of multiscale and feature enhancement. Experiments were conducted on an open ADNI dataset (1198 brain scans from 337 subjects). Test results have shown the effectiveness of the proposed method, with test accuracy of 93.53%, 87.24% (best, average) on the subject-separated dataset, and 99.44%, 98.80% (best, average) on the random brain scan-partitioned dataset. Comparison with eight existing methods has provided further support for the proposed method.

Journal of Ambient Intelligence and Smart Environments, 2019

Ambient intelligence (AmI) is intrinsically and thoroughly connected with artificial intelligence (AI). Some even say that it is, in essence, AI in the environment. AI, on the other hand, owes its success to the phenomenal development of the information and communication technologies (ICTs), based on principles such as Moore's law. In this paper we give an overview of the progress in AI and AmI interconnected with ICT through information-society laws, superintelligence, and several related disciplines, such as multi-agent systems and the Semantic Web, ambient assisted living and e-healthcare, AmI for assisting medical diagnosis, ambient intelligence for e-learning and ambient intelligence for smart cities. Besides a short history and a description of the current state, the frontiers and the future of AmI and AI are also considered in the paper.
