Design and development of phonetically rich Urdu speech corpus

Abstract

Phonetically rich speech corpora play a pivotal role in speech research, and they are especially important for developing Automatic Speech Recognition and Text-to-Speech systems. This paper presents the design and development of an optimal, context-based, phonetically rich speech corpus for Urdu, intended to serve as a baseline for training a Large Vocabulary Continuous Speech Recognition system for the Urdu language.
