A Framework of Adaptive Multimodal Input for Location-Based Augmented Reality Application

A usability study of multimodal input in an augmented reality environment

Virtual Reality, 2013

In this paper, we describe a user study evaluating the usability of an augmented reality (AR) multimodal interface (MMI). We have developed an AR MMI that combines free-hand gesture and speech input in a natural way using a multimodal fusion architecture. We describe the system architecture and present a study exploring the usability of the AR MMI compared with speech-only and 3D-hand-gesture-only interaction conditions. The interface was used in an AR application for selecting 3D virtual objects and changing their shape and color. For each interface condition, we measured task completion time, the number of user and system errors, and user satisfaction. We found that the MMI was more usable than the gesture-only interface condition, and users felt that the MMI was more satisfying to use than the speech-only interface condition; however, it was neither more effective nor more efficient than the speech-only interface. We discuss the implications of this research for designing AR MMIs and outline directions for future work. The findings could also be used to help develop MMIs for a wider range of AR applications, for example, AR navigation tasks, mobile AR interfaces, or AR games.

Multimodal interaction in an augmented reality scenario

Proceedings of the 6th international conference on Multimodal interfaces - ICMI '04, 2004

We describe an augmented reality system designed for online acquisition of visual knowledge and retrieval of memorized objects. The system relies on a head mounted camera and display, which allow the user to view the environment together with overlaid augmentations by the system. In this setup, communication by hand gestures and speech is mandatory as common input devices like mouse and keyboard are not available. Using gesture and speech, basically three types of tasks must be handled: (i) Communication with the system about the environment, in particular, directing attention towards objects and commanding the memorization of sample views; (ii) control of system operation, e.g. switching between display modes; and (iii) re-adaptation of the interface itself in case communication becomes unreliable due to changes in external factors, such as illumination conditions. We present an architecture to manage these tasks and describe and evaluate several of its key elements, including modules for pointing gesture recognition, menu control based on gesture and speech, and control strategies to cope with situations when vision becomes unreliable and has to be re-adapted by speech.
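Task (iii), re-adapting the interface when vision becomes unreliable, can be illustrated with a simple fallback policy: monitor gesture-recognition confidence and hand control over to speech when it drops. The sketch below is only an illustrative approximation under assumed names and thresholds (ModalityArbiter, window, threshold), not the control strategy evaluated in the paper.

```python
from collections import deque

class ModalityArbiter:
    """Fall back from pointing gestures to speech-only control when vision
    becomes unreliable (illustrative sketch; names and values are assumptions)."""

    def __init__(self, window=30, threshold=0.4):
        self.recent_confidence = deque(maxlen=window)  # last N gesture confidences
        self.threshold = threshold
        self.mode = "gesture+speech"

    def update(self, gesture_confidence: float) -> str:
        self.recent_confidence.append(gesture_confidence)
        avg = sum(self.recent_confidence) / len(self.recent_confidence)
        if avg < self.threshold and self.mode != "speech-only":
            # Vision is unreliable: switch to speech and ask the user to
            # re-adapt the vision module (e.g. recalibrate by speech command).
            self.mode = "speech-only"
            self.request_readaptation()
        elif avg >= self.threshold:
            self.mode = "gesture+speech"
        return self.mode

    def request_readaptation(self):
        print("Vision unreliable - please recalibrate via speech command.")
```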

Multimodal interaction with a wearable augmented reality system

2006

An augmented reality system enhances a mobile user's situational awareness and provides new visualization functionality. The custom-built multimodal interface provides access to information encountered in urban environments. In this article, we detail our experiences with various input devices and modalities and discuss their advantages and drawbacks in the context of interaction tasks in mobile computing.

A Review of Multimodal Interaction Technique in Augmented Reality Environment

International Journal on Advanced Science, Engineering and Information Technology

Augmented Reality (AR) supports several types of interaction techniques, such as 3D interaction, natural interaction, tangible interaction, spatially aware interaction, and multimodal interaction. Usually, interaction in AR is unimodal, allowing the user to interact with AR content through only one modality such as gesture, speech, or click. The combination of more than one modality is called multimodal. Multimodal interaction can make human-computer interaction more efficient and enhance the user experience, because many issues arise when users rely on a single modality in AR environments, such as the fat-finger problem. Recent research has shown that multimodal interfaces (MMIs) have been explored in AR environments and applied in various domains. This paper presents an empirical study of some of the key aspects and issues in multimodal interaction for augmented reality, touching on interaction techniques and system frameworks. We review two questions: which interaction techniques have been used to perform multimodal interaction in AR environments, and which integrated components are applied in multimodal AR frameworks. These two questions were analysed to identify trends in the multimodal field, which is the main contribution of this paper. We found that gesture, speech, and touch are frequently used to manipulate virtual objects. Most of the MMI AR frameworks reviewed discuss only the concept of the framework components or the information-centred design between the components. Finally, we conclude by providing ideas for future work in this field.

Multimodal Fusion: Gesture and Speech Input in Augmented Reality Environment

Advances in Intelligent Systems and Computing, 2015

Augmented Reality (AR) allows users to interact with virtual and physical objects simultaneously, since it seamlessly combines the real world with the virtual world. However, most AR interfaces apply conventional Virtual Reality (VR) interaction techniques without modification. In this paper we explore multimodal fusion for AR with speech and hand-gesture input. Multimodal fusion enables users to interact with computers through various input modalities such as speech, gesture, and eye gaze. As a first stage in proposing the multimodal interaction, the input modalities are selected before being integrated into an interface. The paper recaps related work on multimodal approaches, which have recently become one of the research trends in AR, and surveys existing multimodal work for VR and AR. In AR, multimodality is considered a solution for improving interaction between virtual and physical entities; it is an ideal interaction technique for AR applications since AR supports interaction in the real and virtual worlds in real time. This paper describes recent AR developments that employ gesture and speech inputs. It looks into multimodal fusion and its developments, followed by the conclusion. The paper provides a guideline on how to integrate gesture and speech inputs in an AR environment.

Adaptive user interfaces in augmented reality

2003

In this position paper, we consider the issue of developing a flexible software architecture which will allow multiple interaction techniques to operate simultaneously and in parallel with one another. We describe a set of interaction techniques, identify their architectural requirements, and consider methods for prioritizing different techniques and arbitrating between them when they have conflicting objectives. The discussion is made within the context of a mobile AR system for urban situation awareness.
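As a rough illustration of the prioritization and arbitration problem described above, the sketch below resolves conflicting proposals from parallel interaction techniques by letting the highest-priority proposal win for each target. This is a minimal illustration under assumed names (Proposal, arbitrate, the priority values), not the architecture described in the paper.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    technique: str   # e.g. "gaze", "gesture", "speech"
    target: str      # object or UI element the technique wants to act on
    action: str      # e.g. "select", "highlight"
    priority: int    # higher wins; could also be a dynamic confidence score

def arbitrate(proposals: list[Proposal]) -> dict[str, Proposal]:
    """Resolve conflicting proposals: one winning action per target."""
    winners: dict[str, Proposal] = {}
    for p in proposals:
        current = winners.get(p.target)
        if current is None or p.priority > current.priority:
            winners[p.target] = p
    return winners

# Example: speech outranks gaze when both address the same object in one frame.
frame = [
    Proposal("gaze", "building_12", "highlight", priority=1),
    Proposal("speech", "building_12", "select", priority=3),
]
print(arbitrate(frame))  # speech's "select" wins for building_12
```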

Mobile augmented reality interaction: an approach to the phenomenon

Informação & Sociedade 25 (3), 2015

Augmented Reality (AR) and Geolocation are an attractive new phenomenon, with great potential in the field of communication and knowledge construction. This paper is the result of an interinstitutional effort which aims at contributing to a better understanding of the use of Augmented Reality for mobile learning. Our research focuses, in the first place, on the types of mobile devices used by the younger generations and the way they are being used, with an emphasis on Geolocation and AR. We then turn to describing the potential of AR, a technology that combines a display of digital information with a view of the real world as captured by a mobile device, to create engaging learning experiences in field activities. But, prior to launching our students into this new learning scenario, which stretches far beyond the traditional classroom, it is necessary for us to locate AR resources by geoposition or tagging. Later on, those data become accessible when the user is within a specified distance of the tagged object. Alongside the benefits of its informal-formal activity structure, the possibilities for interaction are countless, including the ability to track the position and movements of all learners at any given time. On the downside, however interesting mobile Augmented Reality may be, when it comes to designing an educational activity we need to cater for all types of mobile devices, not just the ones with a specific operating system and GPS (Global Positioning System) capabilities.
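The range-based access to geotagged AR resources mentioned above can be sketched as a simple geofence check: compute the great-circle distance between the user and each tagged resource and expose only those within a configurable radius. The following sketch uses hypothetical names and values (visible_resources, the 50 m default range) and is not the system described in the paper.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in metres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def visible_resources(user_lat, user_lon, resources, max_range_m=50):
    """Return the geotagged AR resources within range of the user's position."""
    return [r for r in resources
            if haversine_m(user_lat, user_lon, r["lat"], r["lon"]) <= max_range_m]

# Hypothetical resources tagged by the instructor for a field activity.
tagged = [
    {"id": "statue_info", "lat": 40.4168, "lon": -3.7038},
    {"id": "museum_quiz", "lat": 40.4200, "lon": -3.6900},
]
print(visible_resources(40.4169, -3.7040, tagged))  # only 'statue_info' is in range
```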

Augmented reality based user interface for mobile applications and services

2005

Traditional design of user interfaces for mobile phones is limited to a small set of interactions that provide only the necessary means to place phone calls or to write short messages. Such narrow activities supported by current terminals keep users from moving towards the mobile and ubiquitous computing environments of the future. Unfortunately, the next generation of user interfaces for mobile terminals seems to apply the same design patterns as commonly used for desktop computers. Whereas the desktop environment has enough resources to implement such designs, the capabilities of mobile terminals are constrained by mobility, such as size and weight. Additionally, to make mobile terminals available for everyone, users should be able to operate them with minimal or no preparation, while users of desktop computers will require a certain degree of training. This research looks into how to improve the user interface of future mobile devices by using a more human-centred design....

"Move the couch where?": developing an augmented reality multimodal interface

2006 IEEE/ACM International Symposium on Mixed and Augmented Reality, 2006

This paper describes an augmented reality (AR) multimodal interface that uses speech and paddle gestures for interaction. The application allows users to intuitively arrange virtual furniture in a virtual room using a combination of speech and gestures from a real paddle. Unlike other multimodal AR applications, the multimodal fusion is based on the combination of time-based and semantic techniques to disambiguate a user's speech and gesture input. We describe our AR multimodal interface architecture and discuss how the multimodal inputs are semantically integrated into a single interpretation by considering the input time stamps, the object properties, and the user context.
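The combined time-based and semantic fusion can be sketched as pairing a speech command with the temporally closest paddle gesture inside a short window, then checking that the referenced object's properties allow the requested action. The example below is an illustrative approximation with hypothetical names and values (SpeechInput, GestureInput, fuse, the 2-second window), not the fusion architecture of the paper.

```python
from dataclasses import dataclass

@dataclass
class SpeechInput:
    command: str        # e.g. "move the couch here"
    timestamp: float    # seconds

@dataclass
class GestureInput:
    target_object: str  # object the paddle is pointing at or touching
    position: tuple     # paddle position in room coordinates
    timestamp: float

def fuse(speech: SpeechInput, gestures: list[GestureInput],
         object_properties: dict[str, dict], window_s: float = 2.0):
    """Pair a speech command with the temporally closest gesture,
    then verify the pairing makes semantic sense."""
    candidates = [g for g in gestures
                  if abs(g.timestamp - speech.timestamp) <= window_s]
    if not candidates:
        return None  # no gesture close enough in time
    gesture = min(candidates, key=lambda g: abs(g.timestamp - speech.timestamp))
    props = object_properties.get(gesture.target_object, {})
    if "move" in speech.command and not props.get("movable", False):
        return None  # semantically invalid: the object cannot be moved
    return {"action": "move", "object": gesture.target_object,
            "destination": gesture.position}

# Example: "move the couch here" spoken 0.4 s after touching the couch with the paddle.
speech = SpeechInput("move the couch here", timestamp=10.4)
gestures = [GestureInput("couch", (1.2, 0.0, 3.5), timestamp=10.0)]
print(fuse(speech, gestures, {"couch": {"movable": True}}))
```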