A model for performance evaluation of interactive systems

An automated method for studying interactive systems

1999

Information Retrieval experiments rarely examine more than a small number of user or system characteristics because of the limited availability of human subjects. In this article we present an interaction model and, based on it, an experimental method. This automated method helps identify which user and system variables are relevant, which are independent of the others, and which ranges of those variables matter most.
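
The abstract does not give the form of the interaction model, so the following is a purely illustrative sketch of the kind of automated screening such a method enables: sweep each user/system variable of a simulated interaction model over its range while holding the others fixed, and rank variables by how much they move the performance measure. The model function, variable names, and ranges below are all hypothetical.

```python
# Hypothetical interaction model: predicted task time as a function of
# user and system variables (a stand-in for the paper's actual model).
def task_time(typing_rate, system_delay, error_rate):
    return 60.0 / typing_rate + 5.0 * system_delay + 40.0 * error_rate

RANGES = {
    "typing_rate": [1.0, 3.0, 5.0],   # characters per second (user)
    "system_delay": [0.1, 0.5, 2.0],  # seconds per response (system)
    "error_rate": [0.0, 0.05, 0.2],   # errors per action (user)
}
BASELINE = {"typing_rate": 3.0, "system_delay": 0.5, "error_rate": 0.05}

# One-factor-at-a-time screening: vary each variable over its range and
# record the spread in predicted task time; a larger spread marks a
# more relevant variable.
effects = {}
for name, values in RANGES.items():
    times = []
    for value in values:
        settings = dict(BASELINE, **{name: value})
        times.append(task_time(**settings))
    effects[name] = max(times) - min(times)

for name, spread in sorted(effects.items(), key=lambda kv: -kv[1]):
    print(f"{name}: spread {spread:.1f} s")
```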

New measures for the evaluation of interactive information retrieval systems: Normalized task completion time and normalized user effectiveness

Proceedings of the American Society for Information Science and Technology, 2010

User satisfaction, though difficult to measure, is the main goal of Information Retrieval (IR) systems. In recent years, as Interactive Information Retrieval (IIR) systems have become increasingly popular, user effectiveness has also become critical in evaluating IIR systems. However, existing measures in IR evaluation are not particularly suitable for gauging user satisfaction or user effectiveness. In this paper, we propose two new measures for evaluating IIR systems: Normalized Task Completion Time (NT) and Normalized User Effectiveness (NUE). The two measures overcome limitations of existing measures and are efficient to calculate, in that they do not require a large pool of search tasks. A user study was conducted to investigate the relationships between the two measures and the user satisfaction and effectiveness achieved with a given IR system. The learning effects described by NT, NUE, and raw task completion time were also studied and compared. The results show that NT is strongly correlated with user satisfaction, that NUE is a better indicator of system effectiveness than task completion time, and that both new measures are superior to task completion time in describing the learning effect of the given IR system.
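
The abstract does not spell out how the two measures are normalized. The sketch below is therefore a loudly hypothetical illustration: it normalizes a user's completion time against a per-task reference time and folds a task-outcome score into an effectiveness ratio. Both formulas are assumptions made for illustration only, not the paper's definitions.

```python
def normalized_time(completion_time: float, reference_time: float) -> float:
    """Hypothetical NT: observed completion time relative to a per-task
    reference time (e.g., the mean time across all users for that task).
    The paper's actual normalization may differ.
    """
    return completion_time / reference_time

def normalized_effectiveness(outcome_score: float, nt: float) -> float:
    """Hypothetical NUE: task outcome per unit of normalized time,
    assumed here to be a simple ratio for illustration.
    """
    return outcome_score / nt

# Example: a user finishes in 240 s where the task's reference time is
# 300 s, with an outcome score of 0.8.
nt = normalized_time(240, 300)
print(nt, normalized_effectiveness(0.8, nt))
```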

The development of a method for the evaluation of interactive information retrieval systems

Journal of Documentation, 1997

The paper describes the ideas and assumptions underlying the development of a new method for the evaluation and testing of interactive information retrieval (IR) systems, and reports on the initial tests of the proposed method. The method is designed to collect different types of empirical data, i.e. cognitive data as well as traditional systems performance data. The method is based on the novel concept of a 'simulated work task situation' or scenario and the involvement of real end users. The method is also based on a mixture of simulated and real information needs, and involves a group of test persons as well as assessments made by individual panel members. The relevance assessments are made with reference to the concepts of topical as well as situational relevance. The method takes into account the dynamic nature of information needs which are assumed to develop over time for the same user, a variability which is presumed to be strongly connected to the processes of relevance assessment.

A Framework to Evaluate Interface Suitability for a Given Scenario of Textual Information Retrieval

J. Univers. Comput. Sci., 2011

Visualization of search results is an essential step in the textual Information Retrieval (IR) process. Indeed, Information Retrieval Interfaces (IRIs) serve as the link between users and IR systems, a simple example being the ranked list offered by common search engines. Given the importance of search-result visualization, many interfaces (textual, 2D, or 3D IRIs) have been proposed in the last decade. Two kinds of evaluation methods have been developed: (1) evaluation methods for these interfaces, aimed at validating ergonomic and cognitive aspects; and (2) evaluation methods applied to information retrieval systems (IRSs), aimed at measuring their effectiveness. However, as far as we know, these two kinds of evaluation methods are disjoint. Indeed, considering a given IRI associated with a given IRS, what happens if we associate this IRI with another IRS that does not have the same effectiveness? In this context, we propose an IRI evaluation ...

Experimental components for the evaluation of interactive information retrieval systems

Journal of Documentation, 2000

This paper presents a set of basic components that constitute an experimental setting for the evaluation of interactive information retrieval (IIR) systems, the aim being to facilitate evaluation of IIR systems in a way that is as close as possible to realistic IR processes. The experimental setting consists of three components: (1) the involvement of potential users as test persons;

Human Factors and User Assistance in Interactive Computing Systems: An Introduction

IEEE Transactions on Systems, Man, and Cybernetics, 1982

The need to improve and simplify interactive computing systems has led to the study of the human factors of these systems. Out of these studies and a general interest in ease of use has come a variety of guidelines and techniques for improving human-machine interfaces. Some of the most important techniques allow a user to obtain assistance automatically while using a computer system. An introduction to the problems, methods, and results in human factors and user assistance for interactive computer systems is provided in this paper and in this issue.

Modelling Selection Tasks and Assessing Performance in Web Interaction

The paper proposes a model for selection tasks, which are widespread in modern interaction on the WWW, together with a means of evaluating performance as human-processor throughput. Selection tasks are treated as a combination of choice and movement stages, traditionally modelled with the Hick-Hyman and Fitts' laws respectively. However, as the former seems to fall short in most real interactions, we propose a model based on visual search time (VST) instead. Search area size (S0), sought element size (S), and the number of alternatives (N) were chosen as the primary factors for VST, although vocabulary size and the number of search keys are also considered. In experiments with 28 subjects of different age groups, VST was found to scale with the logarithm of the ratio between S0 and S, with N not being significant. An index of selection difficulty (IDS) is proposed based on Fitts' ID, along with the subsequent notion of selection throughput (TPS), whose mean value in the experiments amounted to 12.6 bit/s. The models may assist in creating more usable web interfaces, justifying the sizes and hierarchy of interface elements and blocks, and allowing the evaluation of alternative interface designs via the selection-task throughput measure.
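
As a worked illustration of the quantities named in this abstract, the sketch below computes a visual-search difficulty term as log2(S0/S), per the finding that VST scales with the logarithm of the search-area-to-target-size ratio, plus a Fitts-style movement term log2(D/W + 1), and derives a throughput in bit/s. How the paper actually combines the two terms into IDS is not stated in the abstract; the simple sum used here is an assumption for illustration.

```python
import math

def fitts_id(distance: float, width: float) -> float:
    """Fitts' index of difficulty (Shannon form), in bits."""
    return math.log2(distance / width + 1.0)

def visual_search_id(area: float, target: float) -> float:
    """Visual-search difficulty term: log2 of the search-area to
    target-size ratio, following the paper's VST finding.
    """
    return math.log2(area / target)

def selection_throughput(distance, width, area, target, time_s):
    """Throughput TPS = IDS / selection time, in bit/s. Combining the
    movement and search terms by simple addition is assumed here; the
    paper's exact IDS formulation may differ.
    """
    ids = fitts_id(distance, width) + visual_search_id(area, target)
    return ids / time_s

# Example: a 600 px movement to an 80x30 px link inside a 1200x800 px
# page, selected in 1.2 s -- roughly 11 bit/s under these assumptions.
print(selection_throughput(600, 30, 1200 * 800, 80 * 30, 1.2))
```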

Evaluating Interactive Information Retrieval Systems

2012

This article proposes an extensive methodology for evaluating interactive information retrieval. The proposal draws on fundamental principles from the evaluation literature to define the objectives of the system or tool being evaluated, and to derive the measures and criteria for success in achieving those objectives. It is proposed that, when evaluating a search tool, one should analyse the extent to which it benefits users by increasing their search capability and, consequently, contributing to the quality of the result list. Beyond the quality of the result list, it is important to evaluate the extent to which the search process, and the tools that support it, achieve their objectives.