User-Centric Evaluation Framework for Multimedia Recommender Systems

MyMedia Project Deliverable 1.5 End-user recommendation evaluation metrics

2009

This deliverable provides an evaluation framework for the field trials planned in WP5. Chapter 1 discusses aspects that need to be considered for a successful online evaluation. Formative and summative evaluations have different end-goals. While formative evaluations are used to evaluate, redesign and enhance a product, summative evaluations are used to compare products against a common set of evaluation criteria. A clear understanding of the evaluation criteria is needed because otherwise it will be difficult to select the appropriate data collection method and evaluation metrics. Current literature on recommender systems lacks an understanding of what needs to be measured to assess a recommender system. Technology acceptance models, user experience models and prior research on recommender-specific structural models are discussed to get a better understanding of the determinants underlying the user experience while interacting with a recommender system. Next, quantitative metrics to measure algorithm performance as well as factors related to the user experience are reviewed. Chapter 2 presents the proposed evaluation strategies for the MyMedia field trials. Several qualitative methods are discussed, among them focus groups, individual interviews and user diaries. Qualitative experience measures provide better insight into the factors underlying the user experience and are ideally complemented with quantitative measures from the usability field. Recruitment, data collection and data analysis issues are also considered. Qualitative studies are time-intensive, so usually only a small number of subjects participate. Depending on the aim of the study, quantitative studies can range from small samples to large samples of the population. Chapter 3 introduces the MyMedia evaluation framework, consisting of a research model and a first set of items that can measure the postulated theoretical constructs. We take the proven concept of the technology acceptance model as a starting point, but acknowledge the incompleteness of the model regarding hedonic experiences such as appeal, fun, and user emotions. The proposed evaluation framework consists of the following components: objective system aspects (what the system does), subjective system aspects (how the user perceives the system), experience (what it means to the user: pragmatic quality, hedonic quality and trust), behavioural effect (the effect the system has on the user's behaviour), situational characteristics (things that matter about the user's situation) and personal characteristics (things that matter about the user). The framework is generic and can easily be adapted and extended. The proposed MyMedia evaluation framework is closely linked to the evaluation aims defined by each field trial partner. Finally, a brief explanation is given of what kind of explicit and implicit user feedback should be collected in the field trials and how.
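
To make the component structure concrete, here is a minimal Python sketch, assuming invented construct names, questionnaire item ids and Likert responses (it is not the deliverable's actual instrument), of how constructs from such a framework could be represented and scored:

```python
# Minimal sketch: representing framework constructs and aggregating
# hypothetical questionnaire items into a per-construct score.
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class Construct:
    name: str        # e.g. "perceived recommendation quality"
    component: str   # e.g. "subjective system aspect", "experience"
    items: dict = field(default_factory=dict)  # item id -> Likert response (1..7)

    def score(self) -> float:
        """Average the item responses into a single construct score."""
        return mean(self.items.values())

# Invented responses from a single hypothetical participant.
perceived_quality = Construct(
    name="perceived recommendation quality",
    component="subjective system aspect",
    items={"q1": 6, "q2": 5, "q3": 6},
)
choice_satisfaction = Construct(
    name="choice satisfaction",
    component="experience (outcome)",
    items={"s1": 7, "s2": 6},
)

print(perceived_quality.score(), choice_satisfaction.score())
```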

MyMedia: A software framework for recommender systems

International Broadcast Convention 2010 (IBC 2010), 2010

In the increasingly competitive world of digital content distribution an effective recommender system, able to provide appealing and personalised recommendations, can be a key advantage over competitors. The diversity of content created through the convergence of broadcast media with other forms of media delivery introduces challenges to viewers. This poster describes the work of the MyMedia collaborative research project, which is exploring the use of recommender technologies for audiovisual content. Uniquely, the MyMedia project has developed an open software framework which provides all the components required to build a state-of-the-art recommender system that can be tailored to a wide range of applications. The public release consists of an overall recommender framework, several advanced recommender engines and some components for the automatic enrichment of content metadata. This paper will report the results from four comprehensive field trials, held in three different countries, in which the MyMedia framework is being evaluated:
• A BT field trial based on its IPTV Video-on-Demand service. The trial will investigate viewers' response to recommendations, and the effect on purchasing behaviour.
• A BBC field trial in the context of its developing online TV and radio catch-up services. It will study the difficulties raised by broadcast services where there is a constant stream of new content for which there is initially no user feedback.
• A Microsoft field trial concerning MSN Video content. The field trial will incorporate social networking technology and explore whether recommendation services can influence personal relationships and media consumption habits.
• A Microgénesis field trial involving OFCommerce, an e-commerce application for audio/visual content and music. The field trial will explore whether the recommender system improves the user experience and increases trust.
The trials are being evaluated using both quantitative and qualitative studies based on objective measures, user questionnaires and interviews. This paper will also discuss the benefits and difficulties involved in introducing recommendation systems to operational environments.

Explaining the user experience of recommender systems

User Modeling and User-Adapted Interaction, 2012

Research on recommender systems typically focuses on the accuracy of prediction algorithms. Because accuracy only partially constitutes the user experience of a recommender system, this paper proposes a framework that takes a user-centric approach to recommender system evaluation. The framework links objective system aspects to objective user behavior through a series of perceptual and evaluative constructs (called subjective system aspects and experience, respectively). Furthermore, it incorporates the influence of personal and situational characteristics on the user experience. This paper reviews how current literature maps to the framework and identifies several gaps in existing work. Consequently, the framework is validated in a series of empirical studies analyzed using Structural Equation Modeling. The results of these studies show that subjective system aspects and experience variables are invaluable in explaining why and how the user experience of recommender systems comes about. In all studies we observe that perceptions of recommendation quality and/or variety are important mediators in predicting the effects of objective system aspects on the three components of user experience: process (e.g. perceived effort, difficulty), system (e.g. perceived system effectiveness) and outcome (e.g. choice satisfaction). Furthermore, we find that these subjective aspects have strong and sometimes interesting behavioral correlates (e.g. reduced browsing indicates higher system effectiveness). They also show several tradeoffs between system aspects and personal and situational characteristics (e.g. the amount of preference feedback users provide is a tradeoff between perceived system usefulness and privacy concerns). These results, as well as the validated framework itself, provide a platform for future research on the user-centric evaluation of recommender systems.
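
As an illustration only, the following Python sketch simulates the kind of mediation structure the framework postulates (an objective system aspect influencing experience via a subjective system aspect) and estimates the paths with ordinary least squares; the data, effect sizes and variable names are invented and do not reproduce the paper's analyses:

```python
# Simulated mediation: objective system aspect (OSA, e.g. algorithm condition)
# -> subjective system aspect (SSA, e.g. perceived quality)
# -> experience (EXP, e.g. choice satisfaction).
import numpy as np

rng = np.random.default_rng(0)
n = 500
osa = rng.integers(0, 2, n).astype(float)   # 0 = baseline, 1 = alternative algorithm
ssa = 0.8 * osa + rng.normal(0, 1, n)       # perceived recommendation quality
exp_ = 0.7 * ssa + rng.normal(0, 1, n)      # choice satisfaction

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

a = slope(osa, ssa)         # path OSA -> SSA
b = slope(ssa, exp_)        # path SSA -> EXP
c_total = slope(osa, exp_)  # total effect OSA -> EXP
print(f"a={a:.2f}, b={b:.2f}, indirect={a * b:.2f}, total={c_total:.2f}")
```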

MyMedia: producing an extensible framework for recommendation

2009

Users and implementers of multimedia today face a common problem: how to deal with the “crisis of choice” that exists when very many different forms of multimedia are presented to users. In such circumstances search is not a complete solution; recommendation can improve the user experience. However, there are few recommender system solutions that are sufficiently versatile. This paper outlines the MyMedia software toolkit, which is the outcome of an international collaboration to develop an extensible software framework for multimedia recommendation, incorporating cutting-edge recommender algorithms, metadata enrichment, and software design. It will be tested in field trials under realistic conditions and has been made available to the research community as open source software.
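
As a purely hypothetical sketch, not the actual MyMedia API, the following Python code illustrates what a pluggable recommender-engine interface in an extensible framework might look like; all class and method names are invented:

```python
# Hypothetical plug-in interface: the framework depends only on the abstract
# engine, so concrete algorithms can be swapped without touching the rest.
from abc import ABC, abstractmethod

class RecommenderEngine(ABC):
    @abstractmethod
    def train(self, interactions: list[tuple[str, str, float]]) -> None:
        """Fit the engine on (user, item, rating) interactions."""

    @abstractmethod
    def recommend(self, user: str, n: int) -> list[str]:
        """Return the top-n item ids for a user."""

class MostPopularEngine(RecommenderEngine):
    """Trivial baseline: recommend the most frequently rated items."""

    def train(self, interactions):
        counts: dict[str, int] = {}
        for _, item, _ in interactions:
            counts[item] = counts.get(item, 0) + 1
        self.ranked = sorted(counts, key=counts.get, reverse=True)

    def recommend(self, user, n):
        return self.ranked[:n]

engine: RecommenderEngine = MostPopularEngine()
engine.train([("u1", "a", 5.0), ("u2", "a", 4.0), ("u2", "b", 3.0)])
print(engine.recommend("u1", n=2))  # ['a', 'b']
```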

In situ evaluation of recommender systems: Framework and instrumentation

International Journal of Human-Computer Studies, 2010

This paper deals with the evaluation of the recommendation functionality inside a connected consumer electronics product at the prototype stage. This evaluation is supported by a framework to access and analyze data about product usage and user experience. The strengths of this framework lie in the collection of both objective data (i.e., "What is the user doing with the product?") and subjective data (i.e., "How is the user experiencing the product?"), which are linked together and analyzed in a combined way. The analysis of objective data provides insights into how the system is actually used in the field. Combined with the subjective data, personal opinions and evaluative judgments on the product quality can then be related to actual user behavior. In order to collect these data in as natural a context as possible, remote data collection allows for extensive user testing within habitual environments. We have applied our framework to the case of an interactive TV recommender system application to illustrate that the user experience of recommender systems can be evaluated in real-life usage scenarios.
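
A minimal sketch of the core idea, with invented field names and values: usage logs (objective data) and questionnaire answers (subjective data) are collected per user and then joined for a combined analysis:

```python
# Hypothetical remotely collected data, keyed by user id.
from collections import defaultdict

usage_log = [  # "What is the user doing with the product?"
    {"user": "u1", "event": "recommendation_clicked"},
    {"user": "u1", "event": "item_watched"},
    {"user": "u2", "event": "recommendation_clicked"},
]
questionnaire = {  # "How is the user experiencing the product?" (1..7 ratings)
    "u1": {"satisfaction": 6},
    "u2": {"satisfaction": 3},
}

# Derive a behavioural measure from the objective data.
clicks = defaultdict(int)
for event in usage_log:
    if event["event"] == "recommendation_clicked":
        clicks[event["user"]] += 1

# Combined analysis: relate actual behaviour to the evaluative judgement.
for user, answers in questionnaire.items():
    print(user, "clicks:", clicks[user], "satisfaction:", answers["satisfaction"])
```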

A Characterisation and Framework for User-Centric Factors in Evaluation Methods for Recommender Systems

International Journal of ICT Research in Africa and the Middle East, 2017

Researchers have worked on finding e-commerce recommender systems evaluation methods that contribute to an optimal solution. However, existing evaluation methods lack the assessment of user-centric factors such as buying decisions, user experience and user interactions, resulting in less than optimal recommender systems. This paper investigates the problem of the adequacy of recommender systems evaluation methods in relation to user-centric factors. Published work has revealed limitations of existing evaluation methods in terms of evaluating user satisfaction. This paper characterizes user-centric evaluation factors and then proposes a user-centric evaluation conceptual framework to identify and expose a gap within the literature. The researchers used an integrative review approach to formulate both the characterization and the conceptual framework for investigation. The results reveal a need to come up with a holistic evaluation framework that combines system-centric and user-centric evaluation.

A pragmatic procedure to support the user-centric evaluation of recommender systems

2011

As recommender systems are increasingly deployed in the real world, they are not merely tested offline for precision and coverage, but also "online" with test users to ensure a good user experience. The user evaluation of recommenders is however complex and resource-consuming. We introduce a pragmatic procedure to evaluate recommender systems for experience products with test users, within industry constraints on time and budget.
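
For contrast with the online evaluation the paper advocates, here is a small Python sketch of the two offline metrics mentioned above, precision (as precision@k) and catalogue coverage, computed on invented interaction data:

```python
def precision_at_k(recommended: list, relevant: set, k: int) -> float:
    """Fraction of the top-k recommended items the user found relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def catalog_coverage(recommendation_lists: list, catalog: set) -> float:
    """Share of the catalogue appearing in at least one recommendation list."""
    recommended = {item for rec in recommendation_lists for item in rec}
    return len(recommended & catalog) / len(catalog)

recs_u1 = ["a", "b", "c", "d"]
print(precision_at_k(recs_u1, relevant={"a", "c"}, k=4))                        # 0.5
print(catalog_coverage([recs_u1, ["a", "e"]], {"a", "b", "c", "d", "e", "f"}))  # ~0.83
```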

Evaluating a Recommendation Application for Online Video Content: An Interdisciplinary Study

Proceedings of the 8th international interactive conference on Interactive TV&Video - EuroITV '10, 2010

In this paper, we discuss the set-up and results from an interdisciplinary study aimed at evaluating a recommendation application for online video content, called PersonalTV. By involving (possible) users (i.e. a panel of test users), we tried to gather insights that might help to optimize and refine the application. In this respect, implicit and explicit user feedback complemented each other. This paper explores the relation between the PersonalTV suggestions (recommended content) and the consumption percentage (objective data) (RQ 1), and between the recommended content and the reported satisfaction (subjective data) (RQ 2), of the test users. We also investigated whether the objective and subjective measures converge (RQ 3) and collected feedback that suggests measures for further improvement and optimization of the application.
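
As a minimal sketch of how convergence between the objective and subjective measures (RQ 3) could be checked, assuming invented numbers and a simple Pearson correlation (Python 3.10+ for statistics.correlation):

```python
from statistics import correlation

consumption_pct = [80, 65, 20, 90, 40]  # objective: % of each recommended item consumed
satisfaction = [6, 5, 2, 7, 4]          # subjective: reported satisfaction (1..7)

r = correlation(consumption_pct, satisfaction)
print(f"Pearson r = {r:.2f}")  # a high r would suggest the measures converge
```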