The Added Value of Multimodality in the NESPOLE! Speech-to-Speech Translation System: an Experimental Study

Multimodal interfaces, which combine two or more input modes (speech, pen, touch…), are expected to be more efficient, natural and usable than single-input interfaces. However, the advantage of multimodal input has only been ascertained in highly controlled experimental conditions; in particular, we lack data about what happens with "real" multilingual systems for human-human communication. In this work we discuss the results of an experiment aiming to evaluate the added value of multimodality in a "true" speech-to-speech translation system, the NESPOLE! system, which supports multilingual and multimodal communication in the tourism domain, allowing users to interact over the Internet by sharing maps, web pages and pen-based gestures. We compared two experimental conditions differing in whether multimodal resources were available: a speech-only condition (SO) and a multimodal condition (MM). Most of the data show tendencies for MM to be better than SO.