Jon Chamberlain | University of Essex

Papers by Jon Chamberlain

JMIR medical informatics, Jan 2, 2017

Healthcare information professionals play a key role in closing the knowledge gap between medical research and clinical practice. Their work involves meticulous searching of literature databases using complex search strategies that can consist of hundreds of keywords, operators, and ontology terms. This process is prone to error and can lead to inefficiency and bias if performed incorrectly. The aim of this study was to investigate the search behavior of healthcare information professionals, uncovering their needs, goals, and requirements for information retrieval systems. A survey was distributed to healthcare information professionals via professional association email discussion lists. It investigated the search tasks they undertake, their techniques for search strategy formulation, their approaches to evaluating search results, and their preferred functionality for searching library-style databases. The popular literature search system PubMed was then evaluated to determine the ...

Business Information Review, 2016

Lecture Notes in Computer Science, 2016

Second AAAI Conference on Human Computation and Crowdsourcing, May 9, 2014

Increasingly, social networks are being used for citizen science, where members of the public contribute knowledge to scientific endeavours. Tasks can be presented and solved using human computation, termed group-sourcing, with users benefiting from community tuition and experts gaining knowledge from the crowd. This paper gives details of a prototype that utilises group-sourcing to solve image classification tasks, to support social learning and to facilitate knowledge discovery in the domain of marine biology.

Handbook of Human Computation, 2013

Proceedings of the First International Workshop on Gamification for Information Retrieval - GamifIR '14, 2014

Evaluating contributions from users of systems with large datasets is a challenge across many domains, from task assessment in crowdsourcing to document relevance in information retrieval. This paper introduces a model for rewarding and evaluating users using retrospective validation, with only a small gold standard required to initiate the system. A simulation of the model shows that users are rewarded appropriately for high-quality responses; however, analysis of data from an implementation of the model in a text annotation game indicates it may not be sophisticated enough to predict user performance.

One of the more novel approaches to collaboratively creating language resources in recent years is to use online games to collect and validate data. The most significant challenges collaborative systems face are how to train users with the necessary expertise and how to encourage participation on a scale required to produce high-quality data comparable with data produced by “traditional” experts. In this chapter we provide a brief overview of collaborative creation and the different approaches that have been used to ...

ACM Transactions on Interactive Intelligent Systems (TiiS), 2013

We are witnessing a paradigm shift in Human Language Technology (HLT) that may well have an impact on the field comparable to the statistical revolution: acquiring large-scale resources by exploiting collective intelligence. An illustration of this new approach is Phrase Detectives, an interactive online game with a purpose for creating anaphorically annotated resources that makes use of a highly distributed population of contributors with different levels of expertise.

Collaborative Resource Development and Delivery Workshop Programme

The Phrase Detectives Game-With-A-Purpose for anaphoric annotation has been live since December 2008, collecting over 2.5 million judgments on the anaphoric expressions in texts in two languages (English and Italian) from around 9,000 players. In this paper we summarize our recent work on creating a corpus using these annotations.

Crowdsourcing and citizen science have established themselves in the mainstream of research methodology in recent years, employing a variety of methods to solve problems using human computation. An approach described here, termed "groupsourcing", uses social networks to present problems and collect solutions. This paper details a method for archiving social network messages and investigates messages containing an image classification task in the domain of marine biology. In comparison to other methods, groupsourcing offers a high-accuracy, data-driven and low-cost approach.

When attempting to analyse and improve a system interface, it is often the performance of system users that measures the success of different design iterations. This paper investigates the importance of sensory and cognitive stages in human data processing, using data collected from Phrase Detectives, a text-based game for collecting language data, and discusses its application for interface design.

One of the most significant challenges facing systems of collective intelligence is how to encourage participation on the scale required to produce high-quality data. This paper details ongoing work with Phrase Detectives, an online game-with-a-purpose deployed on Facebook, and investigates user motivations for participation in social network gaming where the wisdom of crowds produces useful data.

Despite the impressive progress made in recent years in all areas of natural language processing, there are still tasks that do not perform well enough to be used in everyday applications. One example is anaphora resolution. The most promising approach to get significant improvements in this area is to create sufficiently large linguistically annotated resources which can then be used to train, for example, machine learning systems. Annotated corpora of the size needed for modern computational linguistics research cannot, however, be created by small groups of hand-annotators; but ESP and similar games have demonstrated how it might be possible to do this through Web collaboration. This paper reports on the ongoing work on Phrase Detectives, a game developed in the ANAWIKI project designed for collaborative linguistic annotation on the Web. Of particular concern here are the measures that assure high-quality annotations.

… , and Processing of Text …, Jan 1, 2011

Modern NLP systems rely either on unsupervised methods, or on data created as part of governmental initiatives such as MUC, ACE, or GALE. The data created in these efforts tend to be annotated according to task-specific schemes. The Anaphoric Bank is an attempt to create large quantities of data annotated with anaphoric information according to a general purpose and linguistically motivated scheme. We do this by pooling smaller amounts of data annotated according to rich schemes that are by and large compatible, and by taking ...

Proceedings of the …, Jan 1, 2008

Large-scale linguistically annotated resources have become available in recent years. This is partly due to sophisticated automatic and semiautomatic approaches that work well on specific tasks such as part-of-speech tagging. For more complex linguistic phenomena like anaphora resolution there are no tools that result in high-quality annotations without massive user intervention. Annotated corpora of the size needed for modern computational linguistics research cannot, however, be created by small groups of hand annotators. The ANAWIKI project strikes a balance between collecting high-quality annotations from experts and applying a game-like approach to collecting linguistic annotation from the general Web population. More generally, ANAWIKI is a project that explores to what extent expert annotations can be substituted by a critical mass of non-expert judgements.

Proceedings of the …, Jan 1, 2009

He's passed on! This parrot is no more! He has ceased to be! He's expired and gone to meet his maker! He's kicked the bucket, he's shuffled off his mortal coil, run down the curtain and joined the bleedin' choir invisibile! THIS IS AN EX-PARROT!

In order for there to be significant improvements in certain areas of natural language processing (such as anaphora resolution) large linguistically annotated resources need to be created which can be used to train, for example, machine learning systems. Annotated corpora of the size needed for modern computational linguistics research cannot, however, be created by small groups of hand-annotators. Simple Web-based games have demonstrated how it might be possible to do this through Web collaboration. This paper reports on the ongoing work of Phrase Detectives, a game developed in the ANAWIKI project designed for collaborative linguistic annotation on the Web. In this paper we focus on how we recruit and motivate players, incentivise high-quality annotations and assess the quality of the data.

Proceedings of the ACM …, Jan 1, 2009

The goal of the ANAWIKI project is to experiment with Web collaboration and human computation to create large-scale linguistically annotated corpora. We will present ongoing work and initial results of Phrase Detectives, a game designed to collect judgments about anaphoric annotations.

Marine ecosystems are complex networks of interactions between communities of species. By modelling the networks it is possible to predict how vulnerable communities are to changes, such as the loss of a key species. The degree to which a species can adapt its interactions within a community, termed plasticity, greatly increases its chance of survival, and the survival of the entire system, during periods of change. However, our understanding of species interactions in the traditional literature is based on limited observations, and a better estimation of plasticity could be achieved by processing more sources of information.

This research will use state-of-the-art text processing techniques to extract the data required to model these networks from existing sources of information about marine species, including traditional document collections and the Internet, and store it in a relational database as tuples, e.g. eats(Nembrotha kubaryana, Sigillina signifera). Interactions will be evaluated by the confidence of correct classification and the credibility of the source, and each will be fully referenced. Once the data is in a structured form it can be queried to provide actionable knowledge.
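As a rough illustration of the storage idea described above, the sketch below keeps interaction tuples in an in-memory SQLite table and queries them. The table layout, the column names, the example scores, the placeholder references and the choice to rank an interaction by confidence multiplied by credibility are all assumptions made for this example; the project description does not specify a schema or a scoring rule.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding interaction tuples (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE interaction (
            predicate   TEXT NOT NULL,  -- e.g. 'eats'
            subject     TEXT NOT NULL,  -- species performing the interaction
            object      TEXT NOT NULL,  -- species receiving the interaction
            confidence  REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1),
            credibility REAL NOT NULL CHECK (credibility BETWEEN 0 AND 1),
            reference   TEXT NOT NULL   -- where the statement was found
        )
        """
    )
    # Illustrative rows only: the scores and references are invented placeholders.
    rows = [
        ("eats", "Nembrotha kubaryana", "Sigillina signifera", 0.9, 0.8, "placeholder reference A"),
        ("eats", "Nembrotha kubaryana", "Sigillina signifera", 0.6, 0.4, "placeholder reference B"),
    ]
    conn.executemany("INSERT INTO interaction VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def prey_of(conn: sqlite3.Connection, species: str, min_score: float = 0.5):
    """Return (prey, best_score, n_sources) rows for one predator.

    The score is confidence * credibility, an assumed combination rule.
    """
    cur = conn.execute(
        """
        SELECT object,
               MAX(confidence * credibility) AS score,
               COUNT(*) AS n_sources
        FROM interaction
        WHERE predicate = 'eats' AND subject = ?
        GROUP BY object
        HAVING MAX(confidence * credibility) >= ?
        ORDER BY score DESC
        """,
        (species, min_score),
    )
    return cur.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for prey, score, n_sources in prey_of(db, "Nembrotha kubaryana"):
        print(f"{prey}: score {score:.2f} from {n_sources} source(s)")
```

Keeping one row per extracted statement, rather than one row per species pair, means every claim retains its own reference, so a query can report both the strongest evidence and how many sources agree.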

We hypothesise that some species are more plastic than the traditional literature would predict and that their presence in a changing ecosystem increases the resilience of the community as a whole. This can be tested, to some extent, by looking at the range of conditions in which a species can exist and the interactions it has within different communities. This, of course, would only be an indicator and would require further robust research.

This research is not intended to undermine the conservation message but more to improve ecosystem modelling and focus on protecting systems vulnerable to collapse.

A demonstration of how the interactions of marine species can be visualised and explored is here: http://www.jonchamberlain.com/marine_interaction.php.