Mohammed Kyari Mustafa | Nottingham Trent University

Papers by Mohammed Kyari Mustafa

Speech recognition is a complex process that involves a number of steps. This complexity becomes a bigger concern when speech recognition is to be implemented entirely on a mobile device. The aim of this paper is to develop a mobile speech recognition system that does the entire processing on the mobile device. To achieve this, a two-stage speech recognition system is introduced. The system was developed by experimenting with techniques available in the literature and improving on them to produce one suitable for mobile device use. The first stage presents a novel Voice Activity Detection (VAD) technique that adopts Linear Predictive Coding (LPC) coefficients and can be easily applied to on-device isolated word recognition on a mobile device. It achieves a recognition performance of 90%, an improvement over a previous algorithm, and a recognition rate of 97.7% for female users in some of the experiments. The second stage adopts an...

Despite many years of research, speech recognition remains an active area of research in Artificial Intelligence. Currently, the most common commercial application of this technology on mobile devices uses a wireless client-server approach to meet the computational and memory demands of the speech recognition process. Unfortunately, such an approach is unlikely to remain viable when fully applied over the approximately 7.22 billion mobile phones currently in circulation. In this thesis we present an on-device speech recognition system. Such a system has the potential to completely eliminate the wireless client-server bottleneck. For the Voice Activity Detection part of this work, this thesis presents two novel algorithms used to detect speech activity within an audio signal. The first algorithm is based on the Log Linear Predictive Cepstral Coefficients Residual signal. These LLPCCRS feature vectors were then classified into voice signal and non-voice signal segments using a mod...

Research and Development in Intelligent Systems XXXII, 2015

This paper presents a novel Voice Activity Detection (VAD) technique that can be easily applied to on-device isolated word recognition on a mobile device. The main speech features used are the Linear Predictive Coding (LPC) speech features, which were correlated using the standard deviation of the signal. The output was further clustered using a modified K-means algorithm. The results show a significant improvement over a previous algorithm based on the LPC residual signal: an 86.6% recognition rate for the previous algorithm, compared with 90% for the new technique on the same data. The new technique achieved up to 97.7% recognition for female users in some of the experiments. Its fast processing time makes it viable for mobile devices.
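As a rough illustration of the LPC analysis mentioned in this abstract, the sketch below computes LPC coefficients for a single frame using the autocorrelation method and the Levinson-Durbin recursion, then derives the prediction-error (residual) signal. It is not the paper's algorithm: the abstract does not specify how the features were correlated or how K-means was modified, so the function names, the frame handling, and the choice of Levinson-Durbin are all assumptions made for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for LPC coefficients from autocorrelation values.

    Returns (a, err) with a[0] == 1.0, where the prediction error is
    e[n] = x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order]
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this step
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_residual(frame, a):
    """Prediction-error signal; samples before the frame start count as zero."""
    order = len(a) - 1
    return [sum(a[k] * frame[n - k] for k in range(order + 1) if n - k >= 0)
            for n in range(len(frame))]


# Hypothetical usage on a synthetic decaying signal x[n] = 0.9 ** n
frame = [0.9 ** n for n in range(200)]
coeffs, error_energy = levinson_durbin(autocorrelation(frame, 2), 2)
residual = lpc_residual(frame, coeffs)
```

Per-frame quantities such as the residual energy or the standard deviation of the coefficients could then serve as features for a clustering step, although which features the paper actually clusters is not stated in the abstract.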

Research and Development in Intelligent Systems XXXI, 2014

This paper presents a review of different Voice Activity Detection (VAD) techniques that can be easily applied to on-device isolated digit recognition on a mobile device. Techniques investigated include Short Time Energy, Linear Predictive Coding residual (prediction error), Discrete Fourier Transform (DFT) based linear cross-correlation, and K-means clustering based VAD. The optimum VAD technique was found to be K-means clustering of the prediction error, which gives a recognition rate of 86.6%. This technique will be used further with an LPC-based speech recognition algorithm for digit recognition on the mobile device.
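To make two of the reviewed ideas concrete, the following minimal sketch combines short-time energy with a plain two-cluster K-means to label frames as voice or non-voice. It is an illustrative assumption, not the paper's implementation: the frame length of 160 samples, the non-overlapping framing, the min/max centroid initialisation, and all function names are hypothetical, and the paper reports its best result from clustering the LPC prediction error, not the raw energy used here.

```python
import math


def frame_energies(samples, frame_len=160):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def two_means(values, iterations=20):
    """Plain 1-D K-means with K=2.

    Returns one label per value: 1 for the cluster with the higher
    centroid, 0 for the other. Centroids start at the min and max.
    """
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        low_vals = [v for v, lab in zip(values, labels) if lab == 0]
        high_vals = [v for v, lab in zip(values, labels) if lab == 1]
        if low_vals:
            low = sum(low_vals) / len(low_vals)
        if high_vals:
            high = sum(high_vals) / len(high_vals)
    return labels


def detect_voice(samples, frame_len=160):
    """Label each frame 1 (voice-like, high energy) or 0 (non-voice)."""
    return two_means(frame_energies(samples, frame_len))


# Hypothetical usage: silence, then a 200 Hz tone sampled at 8 kHz, then silence
silence = [0.0] * 480
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(480)]
labels = detect_voice(silence + tone + silence[:320])
```

Energy alone tends to struggle with low-energy speech sounds and with noisy recordings, which is one common motivation for clustering a spectral feature such as the prediction error.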

Neural Computing and Applications, 2017

The adoption of high-accuracy speech recognition algorithms without an effective evaluation of their impact on the target computational resource is impractical for mobile and embedded systems. In this paper, techniques are adopted to minimise the computational resource required for an effective mobile-based speech recognition system. A Dynamic Multi-Layer Perceptron speech recognition technique, capable of running in real time on a state-of-the-art mobile device, is introduced. Although a conventional hidden Markov model applied to the same dataset slightly outperformed our approach, its processing time is much higher. The Dynamic Multi-Layer Perceptron presented here has an accuracy level of 96.94% and runs significantly faster than similar techniques.
