Navigating the Spoken Wikipedia

Editing Wikipedia content by screen reader: Easier interaction with the Accessible Rich Internet Applications suite

Disability & Rehabilitation: Assistive …, 2009

Purpose. This study aims to improve Wikipedia usability for the blind and promote the application of standards relating to Web accessibility and usability. Method. First, the accessibility and usability of the Wikipedia home, search result and edit pages are analysed using the JAWS screen reader; next, suggestions for improving interaction are proposed and a new Wikipedia editing interface is built. Most of the improvements were obtained using the Accessible Rich Internet Applications (WAI-ARIA) suite, developed by the World Wide Web Consortium (W3C) within the framework of the Web Accessibility Initiative (WAI). Finally, a scenario of use compares the interaction of blind people with the original and the modified interfaces. Results. Our study highlights that although all content is accessible via screen reader, usability issues exist due to users' difficulties when interacting with the interface. The scenario of use shows how building an editing interface with the W3C WAI-ARIA suite eliminates many obstacles that can prevent blind users from actively contributing to Wikipedia. Conclusion. The modified Wikipedia editing page is simpler to use via a screen reader than the original one because ARIA ensures a page overview, rapid navigation, and total control of what is happening in the interface.
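
The abstract does not show the markup itself; as a rough illustration of the mechanisms it credits (landmark roles for a page overview and rapid navigation, live regions for feedback on what is happening), a minimal TypeScript/DOM sketch might look like the following. All element names, labels, and messages are assumptions, not the paper's interface.

```typescript
// Minimal sketch: an ARIA landmark gives the screen reader a jump target for
// rapid navigation, and a live region announces interface changes without
// moving focus. Ids, labels, and messages here are illustrative assumptions.

function buildEditRegion(): HTMLElement {
  const region = document.createElement("form");
  region.setAttribute("role", "form");             // landmark: shows up in the page overview
  region.setAttribute("aria-label", "Edit article");

  const textarea = document.createElement("textarea");
  textarea.setAttribute("aria-label", "Article source text");
  region.appendChild(textarea);

  // Live region: JAWS and other screen readers announce updates placed here
  // as they happen, so the user keeps track of what the interface is doing.
  const status = document.createElement("div");
  status.setAttribute("role", "status");           // role "status" implies aria-live="polite"
  status.id = "edit-status";
  region.appendChild(status);
  return region;
}

function announce(message: string): void {
  const status = document.getElementById("edit-status");
  if (status) status.textContent = message;        // e.g. "Preview generated"
}
```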

VoicePedia: Towards Speech-based Access to Unstructured Information

Interspeech, 2007

Currently there are no dialog systems that enable purely voice-based access to the unstructured information on websites such as Wikipedia. Such systems could be revolutionary for non-literate users in the developing world. To investigate interface issues in such a system, we developed VoicePedia, a telephone-based dialog system for searching and browsing Wikipedia. In this paper, we present the system, as well as a user study comparing the use of VoicePedia to SmartPedia, a Smartphone GUI-based alternative. Keyword entry through the voice interface was significantly faster, while search result navigation and page browsing were significantly slower. Although users preferred the GUI-based interface, task success rates between both systems were comparable, a promising result for regions where smartphones and data plans are not viable.
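
The abstract focuses on the user study rather than the dialog design; as a rough sketch of the search-then-browse loop such a system implies, consider the toy fragment below, in which every prompt, stub function, and canned response is an assumption rather than VoicePedia code.

```typescript
// Toy sketch of a search-then-browse voice dialog loop. The stubs stand in
// for telephone ASR, TTS, and Wikipedia search; nothing here is taken from
// the actual VoicePedia implementation.

type Result = { title: string; summary: string };

async function recognize(prompt: string): Promise<string> {
  console.log(`[listening] ${prompt}`);
  return prompt.startsWith("Say the topic") ? "berlin wall" : "open"; // canned caller answers
}
async function speak(text: string): Promise<void> {
  console.log(`[speaking] ${text}`);
}
async function searchWikipedia(query: string): Promise<Result[]> {
  return [{ title: query, summary: `Stub summary for ${query}.` }];
}

async function dialogLoop(): Promise<void> {
  const query = await recognize("Say the topic you want to look up.");
  const results = await searchWikipedia(query);

  // Serial result navigation: one result at a time, accept it or move on.
  // This is the step the study found significantly slower by voice.
  for (const result of results) {
    await speak(`Result: ${result.title}. Say "open" or "next".`);
    const command = await recognize("open or next?");
    if (command.includes("open")) {
      await speak(result.summary);                 // page browsing step
      return;
    }
  }
  await speak("No more results matched your query.");
}

dialogLoop();
```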

What you see is (almost) what you hear: Design principles for user interfaces for accessing speech archives

1998

Despite the recent growth and potential utility of speech archives, we currently lack tools for effective archival access. Previous research on search of textual archives has assumed that the system goal should be to retrieve sets of relevant documents, leaving users to visually scan through those documents to identify relevant information. However, in previous work we show that in accessing real speech archives, it is insufficient to retrieve only "document" sets [9,10]. Users experience huge problems of local navigation in attempting to extract relevant information from within speech "documents". These studies also show that users address these problems by taking handwritten notes. These notes both detail the content of the speech and serve as indices that help access relevant regions of the archive. From these studies we derive a new principle for the design of speech access systems: What You See Is (Almost) What You Hear. We present a new user interface to a broadcast news archive, designed on that principle.

When You Can't Read It, Listen to It! An Audio-Visual Interface for Book Reading

Universal Access in Human-Computer Interaction. …, 2009

This paper presents a prototype of a mobile Digital Talking Book player, which, by combining visual and non-visual means of interaction, strives to achieve universal accessibility. Details on the non-visual aspects of the interaction, both input and output, are provided. To assess the validity of the proposed solutions, an experiment evaluates the non-visual operation of the prototype. Results show users can complete the same tasks with visual and non-visual interaction. However, some limitations are identified, and the observations prompt a discussion on how the use of multimodal interfaces can improve their accessibility and usability.

The ears have it: a task by information structure taxonomy for voice access to Web pages

2003

We present a taxonomy of task by information structure for voice interfaces to web pages in an analogous effort to Shneiderman's taxonomy for information visualization. Our goal is to develop guidelines for the development of voice navigation of web spaces not as a replacement for visual web browsing, but instead to support focused information seeking tasks such as known-item search and directed browsing. We describe high-level user tasks (Situate, Navigate, Query, and Details) and information structures (regions, menu/lists, text areas, repeated/structured information) that comprise the axes of our taxonomy and show how voice interfaces can support these tasks and structures.
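
For reference, the two axes named in the abstract can be written out as plain types; the example cell at the end is an illustrative guess at how one task/structure pair might be supported, not a guideline taken from the paper.

```typescript
// The task and information-structure axes named in the abstract, as types.
type Task = "Situate" | "Navigate" | "Query" | "Details";
type InfoStructure = "region" | "menu/list" | "text area" | "repeated/structured";

// One hypothetical cell of the task-by-structure taxonomy: how a voice
// interface might support directed browsing over a menu/list.
const exampleCell: { task: Task; structure: InfoStructure; support: string } = {
  task: "Navigate",
  structure: "menu/list",
  support: "read item labels on request and accept a spoken item name as a jump command",
};
```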

Using Language Technology to Improve Interaction and Provide Skim Reading Abilities to Audio Information Services

… of eChallenges e-2008 …, 2008

In this paper we present language technology enhancements to audio-based information services (i.e. services where information is presented using spoken language). The enhancements presented in the paper address two issues for audio-based services: 1) interaction with the service is rigid, and 2) the ability to listen to summaries is limited. Our developments allow for more natural and efficient control of the service and provide means that facilitate skim reading. Using speech dialogue instead of traditional buttons enables more advanced navigation in the audio material. Vector space techniques are used to collect the most relevant sentences in a text and allow for skim reading of varying depth.
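
The abstract names the technique but not its exact formulation; the sketch below assumes one common vector-space variant, scoring each sentence by the cosine similarity of its term-frequency vector to the vector of the whole text and reading the top-scoring sentences, with the skim depth controlling how many are kept. It is an illustration, not the system's actual implementation.

```typescript
// Sketch of vector-space sentence selection for skim reading of varying depth.
// Assumption: relevance = cosine similarity between a sentence's term vector
// and the term vector of the whole text; depth = number of sentences kept.

function termVector(text: string): Map<string, number> {
  const vector = new Map<string, number>();
  for (const token of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    vector.set(token, (vector.get(token) ?? 0) + 1);
  }
  return vector;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  for (const [term, weight] of a) dot += weight * (b.get(term) ?? 0);
  const norm = (v: Map<string, number>) =>
    Math.sqrt([...v.values()].reduce((sum, w) => sum + w * w, 0));
  return dot === 0 ? 0 : dot / (norm(a) * norm(b));
}

function skim(text: string, depth: number): string[] {
  const sentences = text.split(/(?<=[.!?])\s+/).filter(s => s.length > 0);
  const whole = termVector(text);
  return sentences
    .map((sentence, index) => ({ sentence, index, score: cosine(termVector(sentence), whole) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, depth)                 // skim depth chosen by the listener
    .sort((x, y) => x.index - y.index) // keep original reading order
    .map(entry => entry.sentence);
}
```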

SCAN: designing and evaluating user interfaces to support retrieval from speech archives

1999

Previous examinations of search in textual archives have assumed that users first retrieve a ranked set of documents relevant to their query, and then visually scan through these documents, to identify the information they seek. While document scanning is possible in text, it is much more laborious in speech archives, due to the inherently serial nature of speech. Yet, in developing tools for speech access, little attention has so far been paid to users' problems in scanning and extracting information from within "speech documents".

A Voice-Driven Web Browser for Blind People

Lecture Notes in Computer Science, 2002

A small self-voicing Web browser designed for blind users is presented. The Web browser was built from the GTK Web browser Dillo, a free software project released under the GNU General Public License. Additional functionality has been introduced to this original browser in the form of different modules. The browser operates in two different modes, browsing mode and dialogue mode. In browsing mode, the user navigates through the structure of Web pages using the mouse and/or keyboard. In dialogue mode, the dialogue module offers different actions and the user chooses between them using either the keyboard or spoken commands, which are recognized by the speech-recognition module. The content of the page is presented to the user by the screen-reader module, which uses the text-to-speech module for its output.
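
The two-mode, module-based structure described above can be sketched roughly as follows; the interfaces and names are placeholders rather than the browser's actual module APIs.

```typescript
// Rough sketch of the two-mode structure described in the abstract. The
// module interfaces and names are placeholders, not the Dillo-based code.

interface SpeechRecognizer { listen(): Promise<string>; }
interface ScreenReader { readPage(html: string): void; }  // uses text-to-speech underneath

type Mode = "browsing" | "dialogue";

class VoiceBrowser {
  private mode: Mode = "browsing";

  constructor(private recognizer: SpeechRecognizer, private reader: ScreenReader) {}

  // Browsing mode: keyboard/mouse navigation, page content spoken by the screen-reader module.
  renderPage(html: string): void {
    if (this.mode === "browsing") this.reader.readPage(html);
  }

  // Dialogue mode: the dialogue module offers actions and the user answers by
  // keyboard or by a spoken command recognized by the speech-recognition module.
  async runDialogue(actions: string[]): Promise<string> {
    this.mode = "dialogue";
    const command = await this.recognizer.listen();
    const chosen = actions.find(a => command.includes(a)) ?? actions[0];
    this.mode = "browsing";
    return chosen;
  }
}
```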