Automatic speech recognition: an approach for designing inclusive games
Related papers
Human-Computer Interaction – INTERACT 2011, 2011
We conducted a quantitative experiment to determine the performance characteristics of non-speech vocalization for discrete input generation in comparison to existing speech and keyboard input methods. The results from the study validated our hypothesis that non-speech voice input can offer significantly faster discrete input compared to a speech-based input method by as much as 50%. Based on this and other promising results from the study, we built a prototype system called the Voice Game Controller that augments traditional speech-based input methods with non-speech voice input methods to make computer games originally designed for the keyboard and mouse playable using voice only. Our preliminary evaluation of the prototype indicates that the Voice Game Controller greatly expands the scope of computer games that can be played hands-free using just voice, to include games that were difficult or impractical to play using previous speech-based methods.
Asian Journal of Computer Science and Technology, 2020
Computer games are part of our modern way of life. However, certain groups of people are excluded from this form of leisure and social interaction because they are unable to use the interface of video games. The cause may be deficits in motor control, vision or hearing. With automatic speech recognition (ASR) systems, voice-driven commands can be used to control the game, which may in turn open up the opportunity for people with motor impairments to be included in game communities. The aim of this paper is to find a general way of using voice commands in games that relies on a speech recognition system in the backend and can be universally applied for designing inclusive games.
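The idea of an ASR backend driving game commands can be illustrated with a minimal sketch. It assumes the recognizer is any function that turns audio into a text transcript; the command vocabulary, the action names and the `VoiceController` class are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Optional

# Hypothetical command vocabulary; the paper does not specify one.
COMMANDS: Dict[str, str] = {
    "left": "MOVE_LEFT",
    "right": "MOVE_RIGHT",
    "jump": "JUMP",
    "pause": "PAUSE",
}


def normalize(transcript: str) -> str:
    """Lowercase the transcript and drop punctuation so 'Jump!' matches 'jump'."""
    kept = (ch for ch in transcript.lower() if ch.isalnum() or ch.isspace())
    return "".join(kept).strip()


def transcript_to_action(
    transcript: str, commands: Dict[str, str] = COMMANDS
) -> Optional[str]:
    """Return the action for the first known command word, or None if none matches."""
    for word in normalize(transcript).split():
        if word in commands:
            return commands[word]
    return None


class VoiceController:
    """Glue between an ASR backend and a game (both passed in as callables)."""

    def __init__(
        self,
        recognize: Callable[[bytes], str],    # assumed ASR backend: audio -> text
        send_action: Callable[[str], None],   # assumed game hook: receives an action name
    ) -> None:
        self.recognize = recognize
        self.send_action = send_action

    def handle_audio(self, audio: bytes) -> Optional[str]:
        """Recognize one audio chunk and forward the matching action, if any."""
        action = transcript_to_action(self.recognize(audio))
        if action is not None:
            self.send_action(action)
        return action
```

Keeping the recognizer and the game hook as plain callables means the same mapping layer could sit in front of different ASR engines and different games, which is one possible reading of the "universally applied" goal stated above.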
Non-speech input and speech recognition for real-time control of computer games
Proceedings of the 8th international ACM SIGACCESS conference on Computers and accessibility - Assets '06, 2006
This paper reports a comparison of user performance (time and accuracy) when controlling the popular arcade game Tetris using speech recognition or non-speech (humming) input techniques. The preliminary qualitative study with seven participants shows that users were able to control the game using both methods but required more training and feedback for the humming control. The revised interface, which implemented these requirements, was received positively by users. The quantitative test with 12 other participants shows that humming excelled in both time and accuracy, especially over longer distances and advanced difficulty levels.
Speaker Dependent Speech Recognition in Computer Game Control
International Journal of Computer Applications, 2017
Playing has been an integral part of people's lives since the beginning of time, and educational games have become an important part of the education process in childhood, for school students and even for university students. Adding voice commands to educational games is considered a big challenge, especially with regard to recognition accuracy and rapid response. To address this, an educational game was designed to teach Computer Science students the fundamental concepts of "logic". For the game to accept speech input, it must include a speech recognition system. To build that system, this study applied three feature extraction algorithms (MFCC, PLP and RASTA-PLP) together with three VQ codebook generation algorithms (LBG, LBG-PSO and LBG-PSOGA), and tested them on 864 sound files from different people (4 male, 5 female) aged between 16 and 30 years. The results show that MFCC combined with the LBG-PSOGA algorithm achieved higher recognition accuracy, up to 98.5%, than the other algorithms and techniques.
2006
An investigation is presented which proposes the development of artificially intelligent agents as a way to improve interactive music and sound design in the Computer Game genre. An analysis is given of current and near-future sound design technology and methodology within the games industry including discussion of Microsoft's DirectMusic Producer, 3D audio, interactive sequencing and algorithmic composition. It is found that, although advances have been made to improve audio for games, the challenge of creating music and sound design that can be performed interactively within a computer game places severe limits on composers and sound designers. Comparisons are made with the Film genre which is proposed as a possible role model for interactive sound design. Techniques such as asynchronicity, structure, transitions and association are discussed and proposed as key techniques used in film sound but which are lacking in game audio. Artificial Intelligence (AI) programming is proposed as a possible solution for the problems associated with interactive sound design. A number of AI techniques are discussed and the Soar architecture is proposed as a good platform for the development of game audio AI. An introduction to the design, functionality and programming of Soar is given and features such as production rules, state based mechanics and semantic net style knowledge representation are identified as being highly beneficial to audio agent design. Previous and current research using Soar for game AI development is reviewed and a discussion of Soar Quakebot and Interactive Drama Architecture presented with regards to how their designs could be used to develop an audio agent featuring human behaviour modelling, anticipation and state based plot structure.
2011 Brazilian Symposium on Games and Digital Entertainment, 2011
Mainstream games usually lack support for accessibility to deaf and hard of hearing people. The popular FPS game Half-Life 2 is an exception, in that it provides well constructed closed captions to players. In this paper, we performed a semiotic inspection on Half-Life 2, seeking to identify which strategies were used to convey information through audio. We also evaluated how the loss of information in each of them may impact players' experience. Our findings reveal that six different strategies are used and how they may compromise player experience.
Automatic Speech Recognition for Polish in a Computer Game Interface
The paper describes the process of designing a task-oriented continuous speech recognition system for Polish, based on CMU Sphinx 4, to be used in the voice interface of a computer game called Rally Navigator. The concept of the game is presented, and the stages of creating the acoustic model and the language model are described in detail, taking into account the specificity of the Polish language. Results of initial experiments show that as little as 15 minutes of audio material is enough to produce a highly effective single-speaker command-and-control ASR system for the computer game, with a sentence recognition accuracy of 97.6%. Results of the system adaptation for a new speaker are presented. It is also shown that the statistical trigram-based language model with negative trigrams yields the best recognition results.