Jeba Berlin - Academia.edu

Papers by Jeba Berlin

Detecting A Child’s Stimming Behaviours for Autism Spectrum Disorder Diagnosis using Rgbpose-Slowfast Network

2022 IEEE International Conference on Image Processing (ICIP), Oct 16, 2022

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by (a) persistent deficits in social communication and interaction, and (b) the presence of restrictive, repetitive patterns of behaviours, interests or activities. These stereotyped repetitive behaviours are also referred to as stimming behaviours. We propose a deep learning-based approach to automatically predict a child's stimming behaviours from videos recorded in unconstrained conditions. The child's region in the video is tracked and the skeletal joints are derived using a pose estimator. The heatmap representation of the skeletal joints and the raw video signals are used as inputs to the two pathways of the RGBPose-SlowFast deep network to model stimming behaviours. The proposed model is evaluated on the publicly available Self-Stimulatory Behaviour Dataset (SSBD) of stimming behaviours. The generalization ability of the model is validated on the Autism dataset containing children's motor actions. Our experiments demonstrate state-of-the-art results on both datasets.
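To make the two-pathway idea concrete, the sketch below is a minimal PyTorch stand-in, not the authors' implementation of RGBPose-SlowFast: one branch encodes the RGB clip, the other encodes per-joint heatmap volumes, and the two embeddings are fused before classification. All layer sizes, clip lengths and the late-fusion scheme are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a two-pathway video model in the
# spirit of RGBPose-SlowFast -- one pathway consumes RGB frames, the other
# consumes skeletal-joint heatmap volumes, and their features are fused
# before classification. Channel sizes, temporal lengths and the fusion
# scheme are illustrative placeholders.
import torch
import torch.nn as nn

class TwoPathwayStimmingNet(nn.Module):
    def __init__(self, num_classes=4, num_joints=17):
        super().__init__()
        # RGB pathway: low frame rate, wider channels (the "slow"-style branch).
        self.rgb_path = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3)),
            nn.BatchNorm3d(32), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=(3, 3, 3), padding=1),
            nn.BatchNorm3d(64), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Pose pathway: per-joint heatmap volume, higher frame rate, fewer channels.
        self.pose_path = nn.Sequential(
            nn.Conv3d(num_joints, 16, kernel_size=(3, 3, 3), padding=1),
            nn.BatchNorm3d(16), nn.ReLU(inplace=True),
            nn.Conv3d(16, 32, kernel_size=(3, 3, 3), padding=1),
            nn.BatchNorm3d(32), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        # Late fusion of the two pathway embeddings.
        self.classifier = nn.Linear(64 + 32, num_classes)

    def forward(self, rgb_clip, pose_heatmaps):
        # rgb_clip:      (B, 3, T_slow, H, W)
        # pose_heatmaps: (B, J, T_fast, H', W') -- one heatmap channel per joint
        f_rgb = self.rgb_path(rgb_clip).flatten(1)
        f_pose = self.pose_path(pose_heatmaps).flatten(1)
        return self.classifier(torch.cat([f_rgb, f_pose], dim=1))

if __name__ == "__main__":
    model = TwoPathwayStimmingNet()
    rgb = torch.randn(2, 3, 8, 64, 64)      # short RGB clip
    pose = torch.randn(2, 17, 32, 56, 56)   # joint heatmap volume
    print(model(rgb, pose).shape)            # -> torch.Size([2, 4])
```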

R-STDP Based Spiking Neural Network for Human Action Recognition

Applied Artificial Intelligence

Video surveillance systems are omnipresent, and automatic monitoring of human activities is gaining importance in highly secured environments. The proposed work explores the use of the bio-inspired third-generation neural network, the spiking neural network (SNN), to recognize the action sequences present in a video. The SNN used in this work carries neural information in the timing of spikes rather than in the shape of the spikes. The learning technique used herein is reward-modulated spike-timing-dependent plasticity (R-STDP). It is based on reinforcement learning, which strengthens or weakens the synaptic weights depending on the reward or punishment signal received from the decision layer. The absence of gradient descent techniques and external classifiers makes the system computationally efficient and simple. Finally, the performance of the network is evaluated on two benchmark datasets, viz., the Weizmann and KTH datasets.
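The reward-modulated update rule can be illustrated with a small NumPy toy (not the paper's network): a pairwise STDP term computed from pre- and post-synaptic spike traces is applied to the weights only after scaling by the reward or punishment signal from the decision layer. All constants and the network size are assumptions.

```python
# Toy illustration (not the paper's implementation) of reward-modulated STDP:
# an STDP term is computed from exponentially decaying spike traces and is
# applied to the weights only after scaling by a reward (or punishment)
# signal coming back from the decision layer. All constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)

n_pre, n_post = 20, 2          # input neurons, decision neurons
w = rng.uniform(0.0, 0.5, size=(n_pre, n_post))

tau = 20.0                     # trace time constant (ms)
a_plus, a_minus = 0.01, 0.012  # STDP amplitudes
lr = 0.5                       # learning rate on the reward-gated term

pre_trace = np.zeros(n_pre)
post_trace = np.zeros(n_post)

def step(pre_spikes, post_spikes, reward, dt=1.0):
    """One simulation step of reward-modulated STDP."""
    global w, pre_trace, post_trace
    # Exponentially decaying spike traces.
    pre_trace += -pre_trace * dt / tau + pre_spikes
    post_trace += -post_trace * dt / tau + post_spikes
    # Pairwise STDP term: potentiation when pre precedes post,
    # depression when post precedes pre.
    stdp = (a_plus * np.outer(pre_trace, post_spikes)
            - a_minus * np.outer(pre_spikes, post_trace))
    # Reward modulation: positive reward reinforces the change,
    # negative reward (punishment) reverses it.
    w += lr * reward * stdp
    np.clip(w, 0.0, 1.0, out=w)

# Example: reward = +1 if the "correct" decision neuron fired more, else -1.
for _ in range(100):
    pre = (rng.random(n_pre) < 0.1).astype(float)
    post = (rng.random(n_post) < 0.05).astype(float)
    reward = 1.0 if post[0] > post[1] else -1.0
    step(pre, post, reward)
```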

Light weight convolutional models with spiking neural network based human action recognition

Journal of Intelligent & Fuzzy Systems, 2020

Though deep learning networks have proven their ability to perform video analytics in complex environments, increasing attention is being paid to the development of compact networks that facilitate edge processing, which has yielded high-performance compressed deep learning networks such as MobileNet, PWCNet and BindsNet. In the work proposed herein, a dual-network configuration is used for human action recognition, wherein MobileNet captures the spatial appearance of the action sequences and PWCNet is used to extract the motion vectors. A novel Spiking Neural Network (SNN) based configuration is used as the classifier, and the SNN implementation is based on BindsNet. The proposed configuration is experimentally validated on the challenging HMDB51 and UCF101 datasets. The experimental results demonstrate that the proposed work is superior to state-of-the-art techniques, and comparable in a few cases.
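A hedged sketch of such a dual-stream configuration is given below: it uses torchvision's MobileNetV2 backbone for the appearance stream, a small placeholder convolutional encoder over two-channel flow maps standing in for PWCNet features, and a simple rate-coded leaky integrate-and-fire readout standing in for the BindsNet-based SNN classifier. The real system differs in its details.

```python
# Minimal sketch (assumptions marked): a dual-stream configuration in which a
# MobileNet backbone encodes per-frame appearance and a flow stream encodes
# motion; the two feature vectors are concatenated and passed to a simple
# rate-coded spiking classifier. The flow encoder and the spiking layer are
# lightweight stand-ins for PWC-Net and the BindsNet-based SNN.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class LIFClassifier(nn.Module):
    """Leaky integrate-and-fire readout: features are injected as input
    current over several steps and the class with the most spikes wins."""
    def __init__(self, in_dim, num_classes, steps=20, threshold=1.0, decay=0.9):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)
        self.steps, self.threshold, self.decay = steps, threshold, decay

    def forward(self, x):
        current = self.fc(x)
        v = torch.zeros_like(current)
        spikes = torch.zeros_like(current)
        for _ in range(self.steps):
            v = self.decay * v + current
            fired = (v >= self.threshold).float()
            spikes += fired
            v = v * (1.0 - fired)   # reset membrane after a spike
        return spikes                # spike counts act as class scores

class DualStreamActionNet(nn.Module):
    def __init__(self, num_classes=51):
        super().__init__()
        self.appearance = mobilenet_v2(weights=None).features   # spatial stream
        # Placeholder motion encoder over 2-channel flow maps (stand-in for PWC-Net).
        self.motion = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.snn = LIFClassifier(1280 + 64, num_classes)

    def forward(self, frame, flow):
        f_app = self.pool(self.appearance(frame)).flatten(1)   # (B, 1280)
        f_mot = self.motion(flow).flatten(1)                    # (B, 64)
        return self.snn(torch.cat([f_app, f_mot], dim=1))

if __name__ == "__main__":
    net = DualStreamActionNet(num_classes=51)
    frame = torch.randn(2, 3, 224, 224)   # RGB frame
    flow = torch.randn(2, 2, 224, 224)    # x/y optical-flow field
    print(net(frame, flow).shape)          # -> torch.Size([2, 51])
```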

Vision based human fall detection with Siamese convolutional neural networks

Journal of Ambient Intelligence and Humanized Computing, 2021

Fall detection is drawing serious attention across the globe, as unattended falls of senior citizens can cause long-lasting injuries. This necessitates the deployment of automatic fall detection systems to facilitate smart healthcare environments for elderly people in various settings, viz., living independently in their homes, hospitalized, or living in care homes. The proposed work employs a Siamese network with one-shot classification for human fall detection. Unlike a neural network that directly classifies video sequences, this network learns to differentiate video sequences by computing a similarity score. The network contains two identical CNNs, receiving a pair of video sequences as input. The features of these networks are merged at the final layer through a similarity function. Two different architectures, viz., one with 2D convolutional filters and the other with depthwise convolutional filters, each operating on two sets of features (RGB and optical flow), are developed. Experimental results demonstrate the effectiveness and feasibility of the proposed work compared to state-of-the-art methods.
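The one-shot similarity idea can be sketched as follows (not the authors' architecture; encoder depth, embedding size and the similarity head are assumptions): two weight-shared 3D-CNN encoders embed a pair of clips, and a learned head maps the element-wise feature difference to a similarity score in [0, 1].

```python
# Minimal sketch (not the authors' architecture): a Siamese pair of identical
# 3D-CNN encoders over short video clips whose embeddings are compared through
# a learned similarity head. At inference, a query clip is compared against a
# single labelled "fall" reference clip (one-shot classification).
import torch
import torch.nn as nn

class ClipEncoder(nn.Module):
    def __init__(self, in_channels=3, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(64, embed_dim)

    def forward(self, clip):                 # clip: (B, C, T, H, W)
        return self.fc(self.net(clip).flatten(1))

class SiameseFallDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = ClipEncoder()         # shared weights for both branches
        self.similarity = nn.Sequential(nn.Linear(128, 1), nn.Sigmoid())

    def forward(self, clip_a, clip_b):
        f_a, f_b = self.encoder(clip_a), self.encoder(clip_b)
        # Similarity score in [0, 1] from the element-wise feature difference.
        return self.similarity(torch.abs(f_a - f_b)).squeeze(1)

if __name__ == "__main__":
    model = SiameseFallDetector()
    query = torch.randn(1, 3, 16, 64, 64)          # unlabelled query clip
    fall_reference = torch.randn(1, 3, 16, 64, 64)  # single labelled reference
    print(model(query, fall_reference))              # high score => "same" (fall-like)
```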

Spiking neural network based on joint entropy of optical flow features for human action recognition

The Visual Computer, 2020

In the recent past, human action recognition has attracted increased attention in automated video surveillance systems. An efficient human action classification technique for unconstrained environments is proposed in this paper. A novel descriptor relying on the joint entropy of the differences in magnitude and orientation of the optical flow features is developed to model human actions. Initially, the flow feature is computed using the Pyramid–Warping–Cost volume Network (PWCNet), considering every two consecutive frames. Then, the feature descriptor is formed from the joint entropy of the differences in flow magnitude and orientation collected over a regular grid in each frame of the action sequence. Finally, to incorporate long-term temporal dependency, a spiking neural network is embedded to aggregate the information across frames. Different optimization techniques and different types of hidden nodes are utilized in the spiking neural network to analyze the performance of the proposed work. Extensive experiments on benchmark datasets for human action recognition show the efficacy of the proposed method.
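A rough NumPy sketch of the joint-entropy descriptor is shown below (grid and bin counts are assumptions, not the paper's settings): for each cell of a regular grid, the frame-to-frame differences in flow magnitude and orientation are quantised into a 2-D histogram whose joint entropy becomes one element of the frame descriptor.

```python
# Illustrative sketch of a joint-entropy descriptor (binning and grid sizes
# are assumptions): per grid cell, the differences in optical-flow magnitude
# and orientation between consecutive flow fields are quantised into a 2-D
# histogram and its joint entropy forms one descriptor element.
import numpy as np

def joint_entropy(d_mag, d_ori, mag_bins=8, ori_bins=8):
    """Joint entropy (in bits) of magnitude/orientation difference samples."""
    hist, _, _ = np.histogram2d(d_mag.ravel(), d_ori.ravel(),
                                bins=[mag_bins, ori_bins])
    p = hist / max(hist.sum(), 1.0)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def frame_descriptor(flow_prev, flow_curr, grid=(4, 4)):
    """flow_*: (H, W, 2) optical-flow fields from two consecutive frame pairs."""
    mag_p = np.linalg.norm(flow_prev, axis=2)
    mag_c = np.linalg.norm(flow_curr, axis=2)
    ori_p = np.arctan2(flow_prev[..., 1], flow_prev[..., 0])
    ori_c = np.arctan2(flow_curr[..., 1], flow_curr[..., 0])
    d_mag, d_ori = mag_c - mag_p, ori_c - ori_p

    h, w = d_mag.shape
    gh, gw = grid
    desc = []
    for i in range(gh):
        for j in range(gw):
            cell = (slice(i * h // gh, (i + 1) * h // gh),
                    slice(j * w // gw, (j + 1) * w // gw))
            desc.append(joint_entropy(d_mag[cell], d_ori[cell]))
    return np.array(desc)          # one joint-entropy value per grid cell

# Example with random flow fields standing in for PWCNet output.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(120, 160, 2)), rng.normal(size=(120, 160, 2))
print(frame_descriptor(f1, f2).shape)   # -> (16,)
```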

Human interaction recognition through deep learning network

2016 IEEE International Carnahan Conference on Security Technology (ICCST), 2016

This paper provides an efficient framework for recognizing human interactions using a deep learning architecture. Harris corner points and a histogram form the feature vector of the spatiotemporal volume, and feature extraction is restricted to the region of interaction. A stacked autoencoder configuration is embedded in the deep learning framework used for classification. The method is evaluated on the benchmark UT-Interaction dataset, and average recognition rates as high as 95% and 88% are obtained on set 1 and set 2, respectively.
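The hand-crafted feature step can be illustrated with OpenCV as follows (a sketch with assumed parameter values, not the paper's exact pipeline): for each frame of the interaction region, Harris corner responses and an intensity histogram are computed and concatenated across the spatiotemporal volume.

```python
# Minimal sketch (parameter values assumed) of the hand-crafted feature step:
# per frame, Harris corner responses and an intensity histogram are computed
# inside the region of interaction and concatenated across the spatiotemporal
# volume to form the interaction feature vector.
import cv2
import numpy as np

def frame_features(gray_roi, hist_bins=32, corner_count=16):
    """gray_roi: uint8 grayscale crop of the interaction region."""
    # Harris corner response map; keep the strongest responses as features.
    response = cv2.cornerHarris(np.float32(gray_roi), blockSize=2, ksize=3, k=0.04)
    strongest = np.sort(response.ravel())[-corner_count:]
    # Normalised intensity histogram of the same region.
    hist = cv2.calcHist([gray_roi], [0], None, [hist_bins], [0, 256]).ravel()
    hist = hist / max(hist.sum(), 1.0)
    return np.concatenate([strongest, hist])

def volume_features(gray_frames):
    """gray_frames: list of uint8 ROI crops from consecutive frames."""
    return np.concatenate([frame_features(f) for f in gray_frames])

# Example with synthetic frames standing in for a UT-Interaction clip segment.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(8)]
print(volume_features(frames).shape)    # -> (8 * (16 + 32),) = (384,)
```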

Particle swarm optimization with deep learning for human action recognition

Multimedia Tools and Applications, 2020

A novel method for human action recognition using a deep learning network with features optimized by particle swarm optimization is proposed. Binary histograms, Harris corner points and wavelet coefficients are the features extracted from the spatiotemporal volume of the video sequence. To reduce the computational complexity of the system, the feature space is reduced by a particle swarm optimization technique with a multi-objective fitness function. Finally, the performance of the system is evaluated using a deep learning neural network (DLNN). Two autoencoders are trained independently, and the knowledge embedded in the autoencoders is transferred to the proposed DLNN for human action recognition. The proposed framework achieves average recognition rates of 91% on UT-Interaction set 1, 88% on UT-Interaction set 2, 91% on the SBU interaction dataset and 94% on the Weizmann dataset.
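A hedged sketch of binary PSO feature selection with a multi-objective fitness is given below; a logistic-regression classifier on synthetic features stands in for the autoencoder-based DLNN, and all swarm parameters are illustrative.

```python
# Illustrative sketch (not the paper's exact algorithm) of binary PSO feature
# selection: each particle is a 0/1 feature mask, and the fitness trades off
# validation accuracy against the number of retained features. A logistic
# regression stands in for the deep autoencoder network.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the extracted action features.
X = rng.normal(size=(300, 60))
y = (X[:, :5].sum(axis=1) > 0).astype(int)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

def fitness(mask, alpha=0.9):
    """Higher is better: accuracy term minus a penalty on feature count."""
    sel = mask.astype(bool)
    if sel.sum() == 0:
        return 0.0
    clf = LogisticRegression(max_iter=500).fit(X_tr[:, sel], y_tr)
    acc = clf.score(X_va[:, sel], y_va)
    return alpha * acc + (1 - alpha) * (1 - sel.sum() / sel.size)

n_particles, n_features, iters = 12, X.shape[1], 15
pos = (rng.random((n_particles, n_features)) < 0.5).astype(float)   # binary positions
vel = rng.normal(scale=0.1, size=(n_particles, n_features))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(iters):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    # Binary PSO: sample the new position from a sigmoid of the velocity.
    pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", int(gbest.sum()), "of", n_features)
```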
