Alistair Sutherland | Dublin City University

Papers by Alistair Sutherland

A New Manifold Representation for Visual Speech Recognition

Lecture Notes in Computer Science, Aug 17, 2007

A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition

Springer eBooks, 2009

This paper presents the development of a novel visual speech recognition (VSR) system based on a representation that extends the standard viseme concept, referred to in this paper as the Visual Speech Unit (VSU), and on Hidden Markov Models (HMMs). Visemes have been regarded as the smallest visual speech elements in the visual domain and have been widely applied to model visual speech, but they are problematic when applied to continuous visual speech recognition. To circumvent the problems associated with standard visemes, we propose a new visual speech representation that includes not only the data associated with the articulation of the visemes but also the transitory information between consecutive visemes. To fully evaluate the appropriateness of the proposed representation, an extensive set of experiments has been conducted to compare the performance of the visual speech units with that offered by the standard MPEG-4 visemes. The experimental results indicate that the developed VSR application achieved up to 90% correct recognition when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only in the range 62-72%.
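As a rough illustration of the HMM classification scheme described above, the sketch below trains one Gaussian HMM per VSU class on pre-extracted feature sequences and labels a new sequence by maximum log-likelihood. The hmmlearn library, the state count and the feature layout are assumptions for illustration, not the authors' actual implementation.

```python
# Hedged sketch: one Gaussian HMM per visual-speech-unit class, assuming
# each training sample is a (T, D) array of lip/appearance features.
# hmmlearn and the hyper-parameters are assumptions, not the paper's setup.
import numpy as np
from hmmlearn import hmm

def train_class_hmms(sequences_by_class, n_states=5):
    """sequences_by_class: dict label -> list of (T, D) feature arrays."""
    models = {}
    for label, seqs in sequences_by_class.items():
        X = np.vstack(seqs)                      # concatenate all frames
        lengths = [len(s) for s in seqs]         # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=50, random_state=0)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify(models, seq):
    """Pick the class whose HMM gives the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(seq))
```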

Disparity Estimation by Simultaneous Edge Drawing

Computer Vision – ACCV 2016 Workshops, 2017

This work presents a new low-level real-time algorithm for simultaneous edge drawing and disparity calculation in stereo image pairs. It works by extending the principles of the ED algorithm, a fast and robust edge detector able to produce one-pixel-wide chains of pixels for the edges in the image. In this paper the ED algorithm is extended to run simultaneously on both images of a stereo pair. The disparity information is obtained by matching only a few anchor points and then propagating those disparities along the image edges. This reduces computational cost compared with other edge-based algorithms, as only a few pixels need to be matched, and avoids the problems present in other edge-point-based approaches. The experiments show that this new approach obtains accuracies similar to other state-of-the-art approaches but with a reduced number of computations.
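The sketch below illustrates only the propagation idea: given a one-pixel-wide edge chain and a few anchor points on it with known disparities, disparity is spread along the chain by interpolation. The chain/anchor data structures are hypothetical, and the ED edge-drawing step itself is not shown.

```python
# Hedged sketch of the propagation step only: anchors on an edge chain
# carry known disparities; the rest of the chain is filled by linear
# interpolation (end values held constant beyond the outermost anchors).
import numpy as np

def propagate_along_chain(chain, anchors):
    """chain: list of (x, y) pixels; anchors: dict {index_in_chain: disparity}."""
    idx = np.array(sorted(anchors))
    disp = np.array([anchors[i] for i in idx], dtype=float)
    positions = np.arange(len(chain))
    # np.interp holds the end values constant outside the anchor range.
    return np.interp(positions, idx, disp)

chain = [(10, y) for y in range(100, 120)]   # a short vertical edge chain
anchors = {0: 12.0, 19: 14.0}                # two matched anchor points
disparities = propagate_along_chain(chain, anchors)
```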

A comparison between end-to-end approaches and feature extraction based approaches for Sign Language recognition

2017 International Conference on Image and Vision Computing New Zealand (IVCNZ), 2017

In this work we use a new image dataset for Irish Sign Language (ISL) and compare different approaches for recognition. We perform experiments and report comparative accuracy and timing. We run tests over blurred images and compare the results with those for non-blurred images. For classification, we use an end-to-end approach, namely Convolutional Neural Networks (CNNs), and feature-extraction-based approaches, namely Principal Component Analysis (PCA) followed by different classifiers such as a multilayer perceptron (MLP). We obtain a recognition accuracy over 99% for both approaches. In addition, we report different ways to split the training and testing dataset, one iterative and the other randomly selected.
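A minimal sketch of the feature-extraction pipeline (PCA followed by an MLP) on flattened hand images is shown below. The component count, layer sizes and the random placeholder data are assumptions, not the settings or data reported in the paper.

```python
# Hedged sketch of the PCA + MLP pipeline on flattened grayscale hand images.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# X: (n_samples, n_pixels) flattened images, y: hand-shape labels (placeholders)
X_train = np.random.rand(200, 64 * 64)
y_train = np.random.randint(0, 23, size=200)

clf = make_pipeline(PCA(n_components=50),
                    MLPClassifier(hidden_layer_sizes=(128,), max_iter=300))
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```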

Practising Public Speaking: User Responses to using a Mirror versus a Multimodal Positive Computing System

Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2019

A multimodal Positive Computing system with real-time feedback for public speaking has been developed. The system uses the Microsoft Kinect to detect voice, body pose, facial expressions and gestures, and gives users feedback on their performance in real time while they are rehearsing a speech. In this study, we wished to compare this system with a traditional method for practising speaking, namely using a mirror. Ten participants practised a speech for sixty seconds using the system and using the mirror. They completed surveys on their experience after each practice session, and data about their performance was recorded while they were speaking. Participants found the system less stressful to use than the mirror and reported that they were more motivated to use the system in future. We also found that the system made speakers more aware of their body pose, gaze direction and voice.

High Fidelity Synthetic Face Generation for Rosacea Skin Condition from Limited Data

Similar to the majority of deep learning applications, diagnosing skin diseases using computer vision and deep learning often requires a large volume of data. However, obtaining sufficient data for particular types of facial skin conditions can be difficult due to privacy concerns. As a result, conditions like Rosacea are often understudied in computer-aided diagnosis. The limited availability of data for facial skin conditions has led to the investigation of alternative methods for computer-aided diagnosis. In recent years, Generative Adversarial Networks (GANs), mainly variants of StyleGAN, have demonstrated promising results in generating synthetic facial images. In this study, for the first time, a small dataset of Rosacea with 300 full-face images is utilized to further investigate the possibility of generating synthetic data. The preliminary experiments show how fine-tuning the model and varying experimental settings significantly affect the fidelity of the Rosacea features. ...

Towards Synthetic Generation of Clinical Rosacea Images with GAN Models

2022 33rd Irish Signals and Systems Conference (ISSC)

Computer-aided skin disease diagnosis has recently attracted much attention in the scientific and medical research community due to advances in computer vision and machine learning algorithms. These methodologies essentially rely on large datasets collected from hospitals and medical professionals. Data scarcity is a critical problem in the medical domain, especially for facial skin conditions, due to privacy concerns. For instance, some facial skin conditions, e.g. Rosacea, require observation of the entire face, which reveals the patient's identity. Rosacea is a lamentably neglected skin condition in the computer-aided diagnosis research community, due to the limited availability of Rosacea datasets. Hence, there is a need to explore alternative ways of dealing with the limited available data for Rosacea. A common approach to expanding small datasets is to utilise augmentation techniques. One of the most powerful augmentation methods in machine learning is the Generative Adversarial Network (GAN). Recently, GANs, principally variants of StyleGAN, have successfully generated synthetic facial images. In this paper, a small dataset of a particular skin disease, Rosacea, with 300 images is used to examine the potential of a variant of StyleGAN known as StyleGAN2-ADA. The preliminary experiments and evaluations show promising signs towards addressing the data scarcity in computer-aided Rosacea diagnosis.

A Dataset for Irish sign language recognition

We introduce a new image dataset for Irish Sign Language (ISL) recognition. We filmed human subjects performing ISL hand-shapes and movements, resulting in 468 videos, and then extracted frames from the videos. This resulted in a total of 58,114 images covering the 23 common hand-shapes of ISL. This dataset is part of our ongoing work on ISL recognition using pattern recognition methods. In addition to the dataset, we report experiments using Principal Component Analysis (PCA) in which we reached 95% recognition accuracy.
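The frame-extraction step described above can be sketched as below with OpenCV. The video path, sampling rate and output naming are hypothetical, not the dataset's own conventions.

```python
# Hedged sketch of extracting frames from a recorded video with OpenCV.
import cv2, os

def extract_frames(video_path, out_dir, every_nth=1):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    i = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                               # end of video
            break
        if i % every_nth == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:05d}.png"), frame)
            saved += 1
        i += 1
    cap.release()
    return saved
```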

A Novel Approach for Occluded Ear Recognition Based on Shape Context

2018 International Conference on Advanced Science and Engineering (ICOASE), 2018

The number of digitized applications is growing rapidly and continuously. As a result of this growth, professional, reliable and secure techniques for identifying people in both the real and virtual worlds are mandatory. In this paper, we present a fully automatic ear-based biometric system which needs no human intervention and can be used in a real-time manner. The proposed system aims to recognize people from their ear shape, extracted from a profile facial image which usually suffers from partial occlusion caused by hair and/or earrings. First, a cascaded-classifier-based ear detection approach that uses Haar-like features is used to detect ears in profile images. The process is then followed by a novel ear recognition technique based on the Shape Context descriptor. Testing the proposed approach on some standard datasets shows promising results: for non-occluded images 100% recognition was achieved, while for images where the ear was occluded by both hair and earrings, the accuracy was 57%.
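A minimal sketch of the detection stage only is shown below, assuming a trained Haar cascade for ears is available as an XML file; the file path and image name are hypothetical, and the Shape Context recognition stage is not shown.

```python
# Hedged sketch of Haar-cascade ear detection on a profile image.
import cv2

cascade = cv2.CascadeClassifier("cascades/ear_cascade.xml")  # hypothetical cascade file
profile = cv2.imread("profile_face.jpg", cv2.IMREAD_GRAYSCALE)
ears = cascade.detectMultiScale(profile, scaleFactor=1.1, minNeighbors=5,
                                minSize=(24, 24))
for (x, y, w, h) in ears:
    ear_roi = profile[y:y + h, x:x + w]   # cropped region for the recognition stage
```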

A Robust 3D Shape Descriptor based on the Electrical Charge Distribution

Proceedings of the International Conference on Computer Vision Theory and Applications, 2013

Defining a robust shape descriptor is an enormous challenge in the 3D model retrieval domain, and a great deal of research has been conducted to propose new shape descriptors which meet the retrieval criteria. This paper proposes a new shape descriptor based on the distribution of electrical charge, which has valuable characteristics such as insensitivity to translation, scale and rotation, and robustness to noise as well as to simplification operations. After extracting the canonical form representation of the models, they are treated as surfaces placed in free space over which a charge Q is distributed. After calculating the amount of charge on each face of the model, a set of concentric spheres encloses the model, and the total amount of charge distributed between adjacent spheres on the model's surface generates the Charge Distribution Descriptor (CDD). A two-phase description using the number of Charged-Dense Patches for each model is utilized to boost the discrimination power of the system. The strength of our approach is verified using experiments on the McGill dataset. The results demonstrate the higher ability of our system compared to other well-known approaches.
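The sketch below illustrates only the binning step of such a descriptor: per-face charge is accumulated between concentric spheres centred on the mesh. A uniform charge per unit area is assumed here instead of a solved electrostatic distribution, so this is a simplification, not the paper's full CDD.

```python
# Simplified sketch: bin per-face "charge" on a triangle mesh into shells
# between concentric spheres; uniform charge per unit area is assumed.
import numpy as np

def charge_histogram(vertices, faces, n_shells=32):
    """vertices: (V, 3) float array; faces: (F, 3) vertex indices."""
    tri = vertices[faces]                                  # (F, 3, 3)
    centroids = tri.mean(axis=1)
    areas = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    charge = areas / areas.sum()                           # total charge Q = 1
    centre = vertices.mean(axis=0)
    radii = np.linalg.norm(centroids - centre, axis=1)
    bins = np.linspace(0.0, radii.max() + 1e-9, n_shells + 1)
    hist, _ = np.histogram(radii, bins=bins, weights=charge)
    return hist                                            # shell-wise charge descriptor
```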

Manifold Interpolation for an Efficient Hand Shape Recognition in the Irish Sign Language

Advances in Visual Computing, 2016

This paper presents interpolation using two-stage PCA for hand shape recognition. In the first stage, PCA is performed on the entire training dataset of real human hand images. In the second stage, PCA is performed on separate sub-sets of the projected points in the first-stage eigenspace. The training set contains only a few pose angles. The output is a set of new interpolated manifolds representing the missing data. The goal of this approach is to create a more robust dataset, able to recognise a hand image from an unknown rotation. We report accuracy values for recognising unknown hand shapes.
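A minimal sketch of the interpolation idea follows: images of one hand shape at a few known poses are projected through two PCA stages, and a parametric spline through the sparse second-stage points synthesises in-between manifold points. The dimensions, sample counts and random placeholder data are assumptions.

```python
# Hedged sketch: two-stage PCA followed by spline interpolation of the
# sparse manifold points in the second-stage eigenspace.
import numpy as np
from sklearn.decomposition import PCA
from scipy.interpolate import splprep, splev

# flattened images of one hand shape at a few known pose angles (placeholders)
X = np.random.rand(12, 32 * 32)

stage1 = PCA(n_components=8).fit(X)
P1 = stage1.transform(X)                       # first-stage eigenspace
stage2 = PCA(n_components=2).fit(P1)
P2 = stage2.transform(P1)                      # sparse manifold points

# fit a parametric spline through the sparse points and sample it densely
tck, _ = splprep([P2[:, 0], P2[:, 1]], s=0)
u_new = np.linspace(0.0, 1.0, 100)
manifold = np.column_stack(splev(u_new, tck))  # interpolated manifold points
```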

Automatic Skin Segmentation for Gesture Recognition Combining Region and Support Vector Machine Active Learning

IEEE International Conference on Automatic Face and Gesture Recognition, 2006

Skin segmentation is the cornerstone of many applications such as gesture recognition, face detection, and objectionable image filtering. In this paper, we address the skin segmentation problem for gesture recognition. Initially, given a gesture video sequence, a generic skin model is applied to the first few frames to automatically collect training data. Then, an SVM classifier ...
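The bootstrap idea can be sketched as below: pixels in the first frames are labelled with a generic skin rule (a commonly used YCrCb threshold rule, not necessarily the paper's model), and an SVM is then trained on the pixel colours to segment later frames. The thresholds and sampling size are assumptions.

```python
# Hedged sketch: bootstrap skin/non-skin labels with a generic YCrCb rule,
# then train a sequence-specific SVM on the pixel colours.
import cv2
import numpy as np
from sklearn.svm import SVC

def generic_skin_mask(bgr):
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]
    return (cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)

def train_skin_svm(first_frames, max_pixels=5000):
    feats, labels = [], []
    for frame in first_frames:
        mask = generic_skin_mask(frame)
        feats.append(frame.reshape(-1, 3).astype(np.float32))
        labels.append(mask.reshape(-1).astype(int))
    X, y = np.vstack(feats), np.concatenate(labels)
    sel = np.random.choice(len(X), size=min(max_pixels, len(X)), replace=False)
    return SVC(kernel="rbf", gamma="scale").fit(X[sel], y[sel])
```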

A Unified System for Segmentation and Tracking of Face and Hands in Sign Language Recognition

18th International Conference on Pattern Recognition (ICPR'06), 2006

This paper presents a unified system for segmentation and tracking of the face and hands in sign language recognition using a single camera. Unlike much related work that uses coloured gloves, we detect skin by combining three useful features: colour, motion and position. Together, these features represent the skin-colour pixels that are more likely to be foreground pixels and lie within a predicted position range. We extend previous research in occlusion detection to handle occlusion between any of the skin objects using a Kalman-filter-based algorithm. The tracking improves the segmentation by reducing the search space, and the segmentation enhances the overall tracking process. The algorithm is tested on several video sequences from a standard database and provides a very low error rate.
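A minimal sketch of a constant-velocity Kalman filter for one skin object (face or a hand) is shown below; during occlusion the predict step alone can carry the object forward. The noise values and the example centroids are placeholders, not the paper's settings.

```python
# Hedged sketch: constant-velocity Kalman filter, state (x, y, vx, vy),
# measurement (x, y); None marks an occluded (missed) measurement.
import cv2
import numpy as np

kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
kf.errorCovPost = np.eye(4, dtype=np.float32)

for centroid in [(120, 80), (124, 82), None, (133, 86)]:   # None = occluded frame
    predicted = kf.predict()                 # predicted (x, y, vx, vy)
    if centroid is not None:
        kf.correct(np.array(centroid, dtype=np.float32).reshape(2, 1))
```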

Skin Disease Analysis With Limited Data in Particular Rosacea: A Review and Recommended Framework

IEEE Access

Recently, the rapid advancements in Deep Learning and Computer Vision technologies have introduced a new and exciting era in the field of skin disease analysis. However, there are certain challenges on the roadmap towards developing such technologies for real-life applications that must be investigated. This study considers one of the key challenges in data acquisition and computation, viz. data scarcity. Data scarcity is a central problem in acquiring medical images and applying machine learning techniques to train Convolutional Neural Networks for disease diagnosis. The main objective of this study is to explore possible methods of dealing with the data scarcity problem and improving diagnosis with small datasets. The challenges in data acquisition for a few lamentably neglected skin conditions such as rosacea are an excellent instance for exploring the possibilities of improving computer-aided skin disease diagnosis. With data scarcity in mind, the techniques explored and discussed include Generative Adversarial Networks, Meta-Learning, Few-Shot classification, and 3D face modelling. Furthermore, existing studies are discussed in terms of the skin conditions considered, data volume and implementation choices. Some future research directions are recommended.

Two-stage PCA with interpolated data for hand shape recognition in sign language

2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), 2016

Hand shape recognition is a challenging task because hands are deformable objects. Several techniques for hand shape recognition using Computer Vision have been proposed; the key problem is how to make hand gestures understood by computers and mobile devices. In this paper we present a study of Principal Component Analysis (PCA) used to reduce the dimensionality and extract features of images of the human hand. The dataset used in this study is the alphabet of Irish Sign Language. We propose to apply PCA in more than one stage, creating a second-stage PCA with even lower dimensions. In this second stage, we interpolate data using splines to fill in missing translations. Blurring with a Gaussian filter is applied to the images in order to reduce the non-linearity of the manifolds within the eigenspaces. Comparisons of the influence of the number of eigenvectors and the number of interpolated points are shown. Finally, we apply k-Nearest-Neighbour (k-NN) to classify the correct shape and report the accuracy.
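The classification end of such a pipeline can be sketched as below: Gaussian-blur the hand images to smooth the manifolds, reduce with PCA, and classify with k-NN. The blur sigma, component count, k and the random placeholder data are assumptions.

```python
# Hedged sketch: Gaussian blur, PCA dimensionality reduction, k-NN classification.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def blur_and_flatten(images, sigma=2.0):
    """images: (n, h, w) array of grayscale hand images."""
    return np.stack([gaussian_filter(im, sigma).ravel() for im in images])

train_imgs = np.random.rand(100, 64, 64)          # placeholder data
train_y = np.random.randint(0, 26, size=100)      # placeholder alphabet labels
test_imgs = np.random.rand(10, 64, 64)

pca = PCA(n_components=20).fit(blur_and_flatten(train_imgs))
knn = KNeighborsClassifier(n_neighbors=3).fit(
    pca.transform(blur_and_flatten(train_imgs)), train_y)
pred = knn.predict(pca.transform(blur_and_flatten(test_imgs)))
```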

Educational video classification by using a transcript to image transform and supervised learning

2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), 2017

In this work, we present a method for automatic topic classification of educational videos using a speech transcript transform. Our method works as follows: first, speech recognition is used to generate video transcripts; then, the transcripts are converted into images using a statistical co-occurrence transformation that we designed; finally, a classifier is used to produce video category labels from a transcript image input. For our classifiers, we report results using a convolutional neural network (CNN) and a principal component analysis (PCA) model. To evaluate our method, we used the Khan Academy on a Stick dataset, which contains 2,545 videos, each labelled with one or two of 13 categories. Experiments show that our method is effective and strongly competitive against other supervised-learning-based methods.
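One way such a transcript-to-image transform could look is sketched below: a word co-occurrence matrix over a sliding window, scaled to an 8-bit image. The vocabulary size and window length are assumptions, and the paper's exact statistical transform may differ.

```python
# Hedged sketch: turn a transcript into a co-occurrence "image" that can
# be fed to an image classifier (CNN or PCA-based).
import numpy as np
from collections import Counter

def transcript_to_image(transcript, vocab_size=64, window=5):
    words = transcript.lower().split()
    vocab = [w for w, _ in Counter(words).most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((vocab_size, vocab_size))
    for i, w in enumerate(words):
        if w not in index:
            continue
        for v in words[i + 1:i + window]:          # co-occurrences in the window
            if v in index:
                M[index[w], index[v]] += 1
                M[index[v], index[w]] += 1
    if M.max() > 0:
        M = M / M.max()
    return (M * 255).astype(np.uint8)              # grayscale transcript image
```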

Multimodal system for public speaking with real time feedback: a positive computing perspective

Proceedings of the 18th ACM International Conference on Multimodal Interaction - ICMI 2016, 2016

A multimodal system for public speaking with real-time feedback has been developed using the Microsoft Kinect. The system has been developed within the paradigm of positive computing, which focuses on designing for user wellbeing. The system detects body pose, facial expressions and voice, and visual feedback on speaking performance is displayed to users in real time. Users can view statistics on their utilisation of the speaking modalities. The system also has a mentor avatar, which appears alongside the user avatar to facilitate user training, and an autocue mode that allows a user to practise with set text from a chosen speech.

A combined two-stage local-spatial interest point matching algorithm

2013 8th Iranian Conference on Machine Vision and Image Processing (MVIP), 2013

A local-spatial interest point matching algorithm for articulated human upper-body tracking is proposed in this paper. The first stage finds confidently matched pairs of interest points from the reference and target interest point lists through a local-feature-descriptor-based matching method. Applying two checking steps, cross-checking and displacement-checking, reduces the number of mismatched pairs and yields confidently matched pairs. Using these confidently matched pairs, the second stage recovers more matched interest point pairs from the remaining unmatched points through graph matching with a cyclic string matching algorithm. The proposed approach benefits from the speed of local matching algorithms as well as the accuracy and robustness of spatial matching methods. In addition, it compensates for the reference-list leakage problem. Experimental results show that the combined two-stage interest point matching method efficiently improves the matching process for articulated human upper-body tracking.
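The first stage can be sketched as below: descriptor matching with a cross check followed by a simple displacement check. ORB descriptors and the displacement threshold are used purely for illustration; the paper's own descriptors and thresholds may differ.

```python
# Hedged sketch: cross-checked descriptor matching plus a displacement check.
import cv2
import numpy as np

def confident_matches(img_ref, img_tgt, max_disp=40):
    orb = cv2.ORB_create(nfeatures=500)
    kp1, des1 = orb.detectAndCompute(img_ref, None)
    kp2, des2 = orb.detectAndCompute(img_tgt, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # cross-checking step
    matches = bf.match(des1, des2)
    kept = []
    for m in matches:
        p1 = np.array(kp1[m.queryIdx].pt)
        p2 = np.array(kp2[m.trainIdx].pt)
        if np.linalg.norm(p1 - p2) <= max_disp:              # displacement-checking step
            kept.append((p1, p2))
    return kept
```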

An improved interest point matching algorithm for human body tracking

International Image Processing, Applications and Systems Conference, 2014

Accurate Recognition of Large Number of Hand Gestures

A hierarchical gesture recognition algorithm is introduced to recognise a large number of gestures. The three stages of the proposed algorithm are based on a new hand-tracking technique that recognises the actual beginning of a gesture using a Kalman filtering process, hidden Markov models and graph matching. Processing time is important when working with large databases; therefore, special care is taken to deal with the large number of gestures, which are partially similar.
