Urdu speech corpus for travel domain
Related papers
Design and Development of Speech Database for Travel Purpose in Marathi
IOSR Journal of Computer Engineering, 2014
The paper presents brief information on developing a speech database in the Marathi language for the travel domain in Aurangabad District. A speech database is a primary requirement for developing an Automatic Speech Recognition (ASR) system. The accuracy of speech recognition depends on the quality of the recorded speech data and on the algorithms used to build the ASR system. The paper describes the procedure for collecting data from various speakers in Aurangabad District in order to develop a Marathi ASR system for the travel domain.
Urdu Speech Corpus and Preliminary Results on Speech Recognition
Language resources for Urdu are not well developed. In this work, we summarize the development of an Urdu speech corpus for isolated words. The corpus comprises 250 isolated Urdu words recorded by ten individuals. The speakers include native and non-native, male and female individuals. The corpus can be used for both speech and speaker recognition tasks. We also report results on an automatic speech recognition task for this corpus. The framework extracts Mel Frequency Cepstral Coefficients (MFCCs) along with their velocity and acceleration coefficients, which are then fed to different classifiers to perform recognition. The classifiers used are Support Vector Machines, Random Forest, and Linear Discriminant Analysis. Experimental results show that Support Vector Machines perform best, with a test set accuracy of 73%. These results may provide a useful baseline for future research on automatic speech recognition of Urdu.
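The velocity and acceleration coefficients mentioned in this abstract are commonly computed as first- and second-order regression deltas over the static MFCC frames. The following is a minimal sketch of that idea, not the authors' implementation: the function names `delta` and `stack_features`, the window `width` of 2, and the edge-padding choice are all assumptions made for illustration, and the input is assumed to be a list of already-extracted MFCC frames.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta of a frame sequence (hypothetical helper).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static: List[Frame]) -> List[Frame]:
    """Concatenate static, velocity, and acceleration coefficients per frame."""
    velocity = delta(static)
    acceleration = delta(velocity)  # delta of the delta
    return [s + v + a for s, v, a in zip(static, velocity, acceleration)]
```

Each stacked frame is three times as long as the static frame, and the resulting vectors are the kind of input a classifier such as an SVM would consume.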
Design and development of phonetically rich Urdu speech corpus
Oriental COCOSDA International Conference on Speech Database and Assessments, 2009
Phonetically rich speech corpora play a pivotal role in speech research. The significance of such resources becomes crucial in the development of Automatic Speech Recognition systems and Text to Speech systems. This paper presents details of designing and developing an optimal context based phonetically rich speech corpus for Urdu that will serve as a baseline model for training a Large Vocabulary Continuous Speech Recognition system for Urdu language.
A Speech Recognition System for Urdu Language
Communications in Computer and Information Science, 2009
This paper presents a speech processing and recognition system for individually spoken Urdu words. Speech feature extraction was based on a dataset of 150 different samples collected from 15 different speakers. The data was pre-processed by normalization and by transformation into the frequency domain using the discrete Fourier transform. The feed-forward neural models for speech recognition were developed in MATLAB. The models exhibited reasonably high training and testing accuracies. Details of the MATLAB implementation are included in the paper for use by other researchers in this field. Our ongoing work involves the use of linear predictive coding and cepstrum analysis for alternative neural models. Potential applications of the proposed system include telecommunications, multimedia, and voice-activated tele-customer services.
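The pre-processing described in this abstract (normalization followed by a discrete Fourier transform) can be sketched as below. This is an illustrative Python sketch rather than the paper's MATLAB code. The abstract does not say which normalization was used, so peak normalization is an assumption, and the helper names `peak_normalize` and `dft_magnitudes` are hypothetical. The direct O(n²) DFT is written out for clarity; a real system would use an FFT routine.

```python
import cmath
import math
from typing import List


def peak_normalize(samples: List[float]) -> List[float]:
    """Scale samples so the largest absolute value is 1 (assumed scheme)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return [0.0 for _ in samples]  # all-silent or empty input
    return [s / peak for s in samples]


def dft_magnitudes(samples: List[float]) -> List[float]:
    """Magnitudes of DFT bins 0..n//2 of a real signal, computed directly."""
    n = len(samples)
    if n == 0:
        return []
    mags: List[float] = []
    for k in range(n // 2 + 1):
        total = 0j
        for t, x in enumerate(samples):
            total += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(total))
    return mags


# Usage: a pure tone completing 2 cycles in 16 samples peaks at bin 2.
tone = [0.3 * math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
spectrum = dft_magnitudes(peak_normalize(tone))
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

The magnitude vector (or a fixed-length slice of it) is the sort of feature that could then be passed to a feed-forward network.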
District names speech corpus for Pakistani Languages
2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015
This paper presents a speech corpus developed for an Urdu automatic speech recognition (ASR) system. The corpus comprises single-word utterances from a fixed vocabulary consisting of the district names of Pakistan. The data was recorded over a telephone channel from all over Pakistan to cover six major accents: Punjabi, Urdu, Saraiki, Pashto, Sindhi, and Balochi. The data was collected in challenging acoustic environments; the major issues were silence, background noise, and alternate pronunciations, all of which can affect system performance. To address these issues, comprehensive data verification and cleaning guidelines are presented. The proposed process serves as a data preprocessing step for ASR development and has been successfully integrated into an Urdu dialog system that provides weather information for Pakistan.
Speech Corpus Development for a Speaker Independent Spontaneous Urdu Speech Recognition System
2010
This paper reports the design and development of an 82 speaker Urdu speech corpus for speaker independent spontaneous speech recognition using the CMU Sphinx Open Source Toolkit for Speech Recognition. The corpus consists of 45 hours of spontaneous and read speech data from 82 speakers (42 male and 40 female), recorded over a microphone and a telephone line. The speech was collected from speakers ranging from 20 to 55 years of age. Recording sessions were conducted in office and home environments.
A Medium Vocabulary Urdu Isolated Words Balanced Corpus for Automatic Speech Recognition
The role of a standard database in conducting and evaluating speech recognition research is two-fold. Firstly, it provides a standard platform for research by balancing various aspects of speech recognition such as gender, dialect, and age. Secondly, it provides a common platform for comparing the performance of different speech recognition approaches. This paper presents the development of a Medium Vocabulary Speech Corpus for the Urdu language.
Indian Language Speech Database: A Review
International Journal of Computer Applications, 2012
Speech is the most prominent and natural form of communication between humans. Human beings have long been motivated to create computers that can understand and talk like humans. Researchers developing a recognition system require previously stored data, i.e., a database, for that system. Various speech databases are available for European languages, but very few for Indian languages. In this paper, we discuss the speech databases developed in different Indian languages for speech recognition and text-to-speech systems.
Chhattisgarhi speech corpus for research and development in automatic speech recognition
Automatic speech recognition (ASR) is a computerized interface that allows humans to communicate with machines through natural conversation. ASR has a wide range of applications, such as language development in young children, telecommunications, and assistive devices for the hearing impaired. The performance of an ASR system is greatly influenced by the database used for its implementation. In this paper, we discuss building a speech corpus for Chhattisgarhi, a rare but important Indian dialect. The corpus consists of 100 unique isolated words and four speech scripts totaling 67 sentences, recorded from 478 native speakers. The words were selected from the English to Chhattisgarhi dictionary published by Chhattisgarh Rajbhasha Aayog, and the scripts from Chhattisgarhi literature and newspaper articles. The dataset was collected by travelling over 60% of the geographical area of Chhattisgarh state. The result is the first speech corpus prepared for Chhattisgarhi, intended to advance speech research. Successful speech recognition experiments on both isolated and continuous speech samples have been demonstrated on the prepared database.