Speech Corpus Development for a Speaker Independent Spontaneous Urdu Speech Recognition System

Urdu Speech Corpus and Preliminary Results on Speech Recognition

Language resources for Urdu are not well developed. In this work, we summarize the development of an Urdu speech corpus of isolated words. The corpus comprises 250 isolated Urdu words recorded by ten individuals. The speakers include both native and non-native, male and female individuals. The corpus can be used for both speech and speaker recognition tasks. We also report our results on an automatic speech recognition task for this corpus. The framework extracts Mel Frequency Cepstral Coefficients along with the velocity and acceleration coefficients, which are then fed to different classifiers to perform the recognition task. The classifiers used are Support Vector Machines, Random Forest and Linear Discriminant Analysis. Experimental results show that Support Vector Machines perform best, with a test set accuracy of 73%. The results reported in this work may provide a useful baseline for future research on automatic speech recognition of Urdu.
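The velocity and acceleration coefficients mentioned in this abstract are commonly computed as regression-based deltas over neighbouring frames. The sketch below is a minimal illustration only, assuming a window of two frames on each side and edge frames repeated as padding. The function names `delta` and `add_dynamic_features` are hypothetical, and the abstract does not state which window size or toolkit the authors used.

```python
def delta(frames, window=2):
    """Regression-based delta of per-frame coefficient lists.

    frames: list of frames, each a list of floats (e.g. static MFCCs).
    window: number of neighbouring frames on each side (an assumption).
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                # Repeat the first/last frame when the window runs off the edge.
                nxt = frames[min(t + n, num_frames - 1)][k]
                prev = frames[max(t - n, 0)][k]
                acc += n * (nxt - prev)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, window=2):
    """Append velocity (delta) and acceleration (delta of delta) to each frame."""
    velocity = delta(frames, window)
    acceleration = delta(velocity, window)
    return [s + v + a for s, v, a in zip(frames, velocity, acceleration)]
```

Each output frame is three times as long as the input frame (static, velocity, acceleration), and these vectors are what would then be passed to a classifier such as an SVM.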

Design and development of phonetically rich Urdu speech corpus

Oriental COCOSDA International Conference on Speech Database and Assessments, 2009

Phonetically rich speech corpora play a pivotal role in speech research. Such resources are crucial in the development of Automatic Speech Recognition systems and Text to Speech systems. This paper presents details of designing and developing an optimal context-based, phonetically rich speech corpus for Urdu that will serve as a baseline model for training a Large Vocabulary Continuous Speech Recognition system for the Urdu language.

Speaker Independent Urdu Speech Recognition Using HMM

Automatic Speech Recognition (ASR) is one of the advanced fields of Natural Language Processing (NLP). The recent past has witnessed valuable research activity in ASR for English, European and East Asian languages. Unfortunately, South Asian languages in general, and Urdu in particular, have received very little attention. In this paper we present an approach to developing an ASR system for the Urdu language. The proposed system is based on an open source speech recognition framework called Sphinx4, which uses a statistical approach (Hidden Markov Models) for developing ASR systems. We present a speaker-independent ASR system for a small vocabulary, i.e. fifty-two of the most commonly spoken isolated Urdu words, and suggest that this work will form the basis for developing medium and large vocabulary Urdu speech recognition systems.

A Speech Recognition System for Urdu Language

Communications in Computer and Information Science, 2009

This paper presents a speech processing and recognition system for individually spoken Urdu words. Speech feature extraction was based on a dataset of 150 different samples collected from 15 different speakers. The data was pre-processed by normalization and transformed into the frequency domain using the discrete Fourier transform. Feed-forward neural network models for speech recognition were developed in MATLAB. The models exhibited reasonably high training and testing accuracies. Details of the MATLAB implementation are included in the paper for use by other researchers in this field. Our ongoing work involves the use of linear predictive coding and cepstrum analysis for alternative neural models. Potential applications of the proposed system include telecommunications, multi-media, and voice-activated tele-customer services.
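The pre-processing described in this abstract (normalization followed by a discrete Fourier transform) can be illustrated with a short sketch. This is not the authors' MATLAB implementation: it assumes peak normalization (the abstract does not say which normalization was used), computes a direct O(n²) DFT for clarity rather than an FFT, and the function names are hypothetical.

```python
import cmath


def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1 (assumed scheme)."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return [0.0 for _ in samples]
    return [s / peak for s in samples]


def dft_magnitudes(samples):
    """Magnitude spectrum for bins 0..n/2 of a real-valued signal."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitudes.append(abs(total))
    return magnitudes


# Usage: one cycle of a sampled sine wave concentrates its energy in bin 1.
spectrum = dft_magnitudes(peak_normalize([0, 3, 0, -3]))
```

In a system like the one described, magnitude spectra of this kind would form the input vectors for the feed-forward network.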

A Medium Vocabulary Urdu Isolated Words Balanced Corpus for Automatic Speech Recognition

The role of a standard database in conducting and evaluating speech recognition research is two-fold. Firstly, it provides a standard platform for research by providing a balance amongst various aspects of speech recognition such as gender, dialect, and age. Secondly, it provides a common platform for comparing the performance of various speech recognition approaches. This paper presents the development of a Medium Vocabulary Speech Corpus for the Urdu language.

AutoSSR: an efficient approach for automatic spontaneous speech recognition model for the Punjabi Language

Soft Computing, 2020

In this article, the authors present the design and development of an automatic spontaneous speech recognition system for the Punjabi language. To scale up toward a natural speech recognizer, a very large vocabulary Punjabi text corpus was drawn from Punjabi interview speech, presentations, etc. The text corpus was then cleaned using the proposed corpus optimization algorithm. The proposed automatic spontaneous speech model was trained with 13,218 Punjabi words and more than 200 min of recorded speech. The research work also confirmed that the 2,073,456 unique in-word Punjabi tri-phoneme combinations present in the dictionary comprise 131 phonemes. The performance of the proposed model improved to 87.10% sentence-level accuracy for 2381 trained Punjabi sentences and 94.19% word-level accuracy for 13,218 Punjabi words. Correspondingly, the word error rate was reduced to 5.8% for 13,218 Punjabi words. The performance of the proposed system was also tested using other parameters, such as overall likelihood per frame and convergence ratio over various iterations for different Gaussian mixtures.
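The reported word-level accuracy (94.19%) and word error rate (5.8%) are consistent with the common convention that word accuracy equals 100% minus WER (100 − 94.19 = 5.81). As a hedged illustration, the sketch below computes WER under the usual edit-distance definition (substitutions, deletions and insertions divided by the number of reference words). The abstract does not state how the authors computed the figure, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)
```

Because insertions are counted, WER computed this way can exceed 1.0, in which case "accuracy = 1 − WER" becomes negative; that is a property of the convention rather than of any particular system.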

An ASR System for Spontaneous Urdu Speech

Submitted to O- …, 2010

Center for Research in Urdu Language Processing (CRULP; www.crulp.org) at NUCES is currently working on a project entitled Telephone-based Speech Interfaces for Access to Information by Non-literate Users in collaboration with Carnegie Mellon University. The goal of this ...

Automatic Urdu Speech Recognition using Hidden Markov Model

In this paper, we present an approach to developing an automatic speech recognition (ASR) system for isolated Urdu words. Our experimentation is based on a medium vocabulary speech corpus of Urdu, consisting of 250 words. We develop our approach using the open source Sphinx toolkit. Using this platform, we extract Mel Frequency Cepstral Coefficient (MFCC) features and build a Hidden Markov Model to perform the recognition task. We report percentage accuracy for two different experiments, based on 100 and 250 words respectively. Experimental results suggest that this approach achieves better recognition accuracy than the previous results reported on this corpus.
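HMM-based isolated word recognition typically scores an utterance against one model per word and picks the most likely word. The sketch below illustrates that idea with the forward algorithm on a toy HMM with discrete observation symbols. It is a simplification under stated assumptions: toolkits such as Sphinx model continuous MFCC vectors with Gaussian mixtures, this is not the Sphinx API, and all names and model parameters here are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) via the forward algorithm for a discrete HMM.

    obs:   list of observation symbol indices
    start: start[s] = P(first state is s)
    trans: trans[p][s] = P(next state is s | current state is p)
    emit:  emit[s][o] = P(symbol o | state s)
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    states = range(len(start))
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in states) * emit[s][symbol]
            for s in states
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the word whose HMM assigns obs the highest likelihood.

    models: dict mapping word -> (start, trans, emit)
    """
    return max(models, key=lambda word: forward_likelihood(obs, *models[word]))
```

For realistic utterance lengths the probabilities underflow, so practical systems work with log-probabilities or scaled forward variables; that step is omitted here to keep the sketch short.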

The development of isolated words corpus of Pashto for the automatic speech recognition research

2012 International Conference of Robotics and Artificial Intelligence, 2012

The availability of a standard speech database is of paramount importance in automatic speech recognition (ASR) research, because it provides a baseline for comparing the performance of ASR approaches. This paper presents the development of a Medium-Vocabulary Speech Corpus for the Pashto language. The vocabulary encompasses 161 isolated Pashto words, consisting of the most frequently used words, the names of the days of the week, and digits from 0 to 25. The words were uttered by 30 speakers of different ages and genders, including both native and non-native speakers of Pashto. The corpus was recorded in a noise-free office environment. The corpus is then used to develop an automatic speech recognition system for the Pashto language.

Design of an Urdu speech recognizer based upon acoustic phonetic modeling approach

2004

Automatic Speech Recognition (ASR) is one of the most developing fields of modern science. It has many important applications in our social life as well as in many areas of scientific disciplines like computer science and media. The main aim of this paper is to analyze and implement different techniques of speech recognition, like Pattern-Matching or Acoustic Phonetic Modeling, for the Urdu language. A prototype has been developed which recognises continuous Urdu speech with 55 to 60% accuracy.