Playing games as a way to improve automatic image annotation

A game-based approach towards human-augmented image annotation

2011

Methods for automatic and assisted image annotation

Multimedia Tools and Applications, 2011

Personal memories composed of digital pictures are currently very popular. Retrieving these media items requires annotation. In recent years, several approaches have been proposed to overcome the image annotation problem. This paper presents our proposals for addressing it. Automatic and semi-automatic learning methods for semantic concepts are presented. The automatic method estimates semantic concepts from visual content, context metadata and audio information. The semi-automatic method is based on the results of a computer game. The paper describes both proposals and presents their evaluations.
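
As a minimal sketch of the kind of late fusion the automatic method implies, assuming illustrative modality weights and per-concept confidence scores (none of these values come from the paper):

```python
# Illustrative sketch only: late fusion of per-modality concept scores.
# Weights, thresholds and score sources are assumptions, not the paper's.

def fuse_concept_scores(visual, metadata, audio, weights=(0.5, 0.3, 0.2)):
    """Combine per-concept confidences from three modalities.

    Each argument maps concept name -> confidence in [0, 1]; a concept
    missing from a modality contributes 0.
    """
    concepts = set(visual) | set(metadata) | set(audio)
    w_v, w_m, w_a = weights
    return {c: w_v * visual.get(c, 0.0)
               + w_m * metadata.get(c, 0.0)
               + w_a * audio.get(c, 0.0)
            for c in concepts}

scores = fuse_concept_scores(
    visual={"beach": 0.8, "people": 0.6},
    metadata={"beach": 0.9},  # e.g. GPS metadata places the photo on a coast
    audio={"people": 0.7},    # e.g. speech detected on the soundtrack
)
annotations = [c for c, s in scores.items() if s > 0.4]  # accepted concepts
```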

TAG4VD - A game for collaborative video annotation

2013

Creating descriptive labels for videos is an important task with applications in video retrieval, Web accessibility and computer vision. However, automatic creation of such labels is difficult, and having professionals manually describe content is too expensive. Engaging end-users in describing multimedia assets can produce good results and creates a sense of participation, currently one of the key factors in attracting customers to a service. Existing approaches are highly successful in terms of the number of engaged players and collected labels, but they rarely produce comprehensive tag sets, yielding tags that are either too generic or too narrow in meaning. "Games With A Purpose" are one approach used to create comprehensive video descriptions by harnessing the intelligence of human players, who contribute and collaborate towards a common goal and are rewarded when successful. This paper describes a game that implements two mechanisms for collecting data via human-based computation games. Tags introduced by registered players at a given timecode are validated by a collaborative scoring mechanism that eliminates irregular annotations. Additionally, a voting mechanism that enables players to endorse or refuse existing tags provides an extra instrument for guaranteeing annotation quality.
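
The two mechanisms can be sketched compactly. The agreement window, player threshold and data structures below are assumptions made for illustration; the paper does not specify them:

```python
# Hypothetical sketch of TAG4VD-style validation: a tag is accepted when
# enough players enter it independently near the same timecode, and existing
# tags accumulate a net endorse/refuse score.
from collections import defaultdict

WINDOW = 2.0      # seconds within which two entries count as agreeing (assumed)
MIN_PLAYERS = 2   # independent players needed to validate a tag (assumed)

def validate_tags(entries):
    """entries: list of (player_id, tag, timecode); returns validated (tag, time)."""
    by_tag = defaultdict(list)
    for player, tag, t in entries:
        by_tag[tag.lower()].append((player, t))
    validated = []
    for tag, hits in by_tag.items():
        for player, t in hits:
            others = {p for p, t2 in hits if p != player and abs(t - t2) <= WINDOW}
            if len(others) + 1 >= MIN_PLAYERS:
                validated.append((tag, t))  # irregular one-off tags never pass
                break
    return validated

def vote_score(endorsements, refusals):
    """Net vote used to rank or prune existing tags."""
    return endorsements - refusals
```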

It's All Fun and Games until Someone Annotates: Video Games with a Purpose for Linguistic Annotation

Annotated data is a prerequisite for many NLP applications. Acquiring large-scale annotated corpora is a major bottleneck, requiring significant time and resources. Recent work has proposed turning annotation into a game to increase its appeal and lower its cost; however, current games are largely text-based and closely resemble traditional annotation tasks. We propose a new linguistic annotation paradigm that produces annotations from playing graphical video games. The effectiveness of this design is demonstrated using two video games: one that creates a mapping from WordNet senses to images, and a second that performs Word Sense Disambiguation. Both games produce accurate results. The first game yields annotation quality equal to that of experts and a 73% cost reduction over equivalent crowdsourcing; the second provides a 16.3% improvement in accuracy over current state-of-the-art sense disambiguation games with WordNet.

LOST: A flexible framework for semi-automatic image annotation

ArXiv, 2019

State-of-the-art computer vision approaches rely on huge amounts of annotated data. Collecting such data is a time-consuming process, since it is mainly performed by humans. The literature shows that semi-automatic annotation approaches can significantly speed up annotation by automatically generating proposals to support the annotator. In this paper we present a framework that allows for quick and flexible design of semi-automatic annotation pipelines. We show that a well-designed process speeds up the collection of annotations. Our contribution is a new approach to image annotation that combines different annotation tools and machine learning algorithms in one process. We further present potential applications of our approach. The source code of our framework, called LOST (Label Objects and Save Time), is available at: this https URL.
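
A minimal sketch of such a proposal-and-correction pipeline, assuming generic `model.predict`/`model.fit` and `ask_human` interfaces rather than LOST's actual API:

```python
# Hypothetical semi-automatic annotation loop: the model proposes, the human
# corrects, and the model is retrained on the growing labelled set so that
# proposals improve over time. Interfaces are assumptions, not LOST's API.

def annotation_loop(unlabeled, model, ask_human, batch_size=50):
    """Convert a list of unlabeled images into verified annotations."""
    labeled = []
    while unlabeled:
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        for image in batch:
            proposal = model.predict(image)          # automatic proposal
            annotation = ask_human(image, proposal)  # human accepts or fixes it
            labeled.append((image, annotation))
        model.fit(labeled)  # better proposals for the next batch
    return labeled
```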

CASAM: collaborative human-machine annotation of multimedia

Multimedia Tools and Applications, 2013

The CASAM multimedia annotation system implements a model of cooperative annotation between a human annotator and automated components. The aim is that they work asynchronously but together. The system focuses on the areas where automated recognition and reasoning are most effective, while the user works in the areas where their unique skills are required. The system's reasoning is influenced by the annotations provided by the user and, similarly, the user can see the system's work, modify it and, implicitly, direct it. The CASAM system interacts with the user by providing a window onto the current state of annotation and by generating requests for information that are important for the final annotation or for constraining its reasoning. The user can modify the annotation, respond to requests and also add their own annotations. The objective is that the human annotator's time is used more effectively, producing an annotation that is both of higher quality and completed more quickly.
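
The division of labour CASAM describes can be sketched as follows; the confidence threshold and the callable interfaces are assumptions for illustration, and the real system works asynchronously rather than in this simple sequential loop:

```python
# Hypothetical sketch of cooperative annotation: the machine annotates what
# it is confident about and queues requests for the human, whose answers
# fill the gaps (and, in the real system, constrain further reasoning).

def cooperative_annotate(items, machine, human_answer, threshold=0.8):
    """machine(item) -> (label, confidence); human_answer(item) -> label."""
    annotation, requests = {}, []
    for item in items:
        label, confidence = machine(item)
        if confidence >= threshold:
            annotation[item] = label   # machine handles the easy cases
        else:
            requests.append(item)      # generate a request for information
    for item in requests:
        annotation[item] = human_answer(item)  # the human resolves the rest
    return annotation
```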

A crowdsourcing approach to support video annotation

Proceedings of the International Workshop on Video and Image Ground Truth in Computer Vision Applications, 2013

In this paper we present an innovative approach to supporting efficient large-scale video annotation by exploiting crowdsourcing. In particular, we collect large amounts of noisy annotations through an on-line Flash game in which players take photos of objects appearing throughout the game levels. The data gathered from the game, suitably processed, is then used to drive image segmentation approaches, namely Region Growing and GrabCut, which allow us to derive meaningful annotations. A comparison against hand-labeled ground truth data showed that the proposed approach is a valid alternative to existing video annotation approaches and allows reliable and fast collection of large-scale ground truth data for performance evaluation in computer vision.
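
The segmentation step can be illustrated with OpenCV's GrabCut, seeded by the noisy "photo" points collected in the game; the window size and iteration count below are assumptions, not the paper's parameters:

```python
# Hypothetical sketch: seed GrabCut with player-provided points on an object.
import cv2
import numpy as np

def segment_from_game_points(image, points, half=10, iters=5):
    """image: BGR array; points: [(x, y), ...] where players 'photographed' the object."""
    # Start with everything marked as probable background.
    mask = np.full(image.shape[:2], cv2.GC_PR_BGD, dtype=np.uint8)
    # Mark a small window around each collected point as definite foreground.
    for x, y in points:
        mask[max(0, y - half):y + half, max(0, x - half):x + half] = cv2.GC_FGD
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, None, bgd, fgd, iters, cv2.GC_INIT_WITH_MASK)
    # Keep definite and probable foreground as the object segmentation.
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
```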

Automatic Image Annotation

In recent years, multimedia content has grown rapidly on the Internet, especially in social networks such as Flickr, where users often post images from their mobile devices. This content is later retrieved through search, when users look for something with a specific query. Nowadays, searching these networks relies primarily on the titles and keywords attached to resources by the users who posted them. The problem is that, in many cases, the title and associated keywords are not relevant to the resource; only after analysing the image can we conclude what it really contains. The project presented in this article proposes connecting each image to keywords that are relevant to its content. To do this, the first step was to create a collection of images annotated by human annotators; the second step was to expand this collection by searching the Internet using the keywords associated with the initial annotated images. Currently, for a new picture, we can identify similar images in our collection and, based on the keywords associated with them, determine which keywords characterize the new image. The evaluation of this system demonstrated that our approach works well for images that have similar counterparts in our collection.
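
The keyword-transfer step lends itself to a short sketch. The feature representation and neighbourhood size are assumptions; any global descriptor (a colour histogram, a CNN embedding) would do:

```python
# Hypothetical sketch of keyword transfer: find the nearest annotated images
# and vote over their keywords to characterize a new picture.
from collections import Counter
import numpy as np

def suggest_keywords(query_feature, collection, k=5, top=3):
    """collection: list of (feature_vector, keywords); returns top keywords."""
    ranked = sorted(collection,
                    key=lambda item: np.linalg.norm(query_feature - item[0]))
    votes = Counter(kw for _, keywords in ranked[:k] for kw in keywords)
    return [kw for kw, _ in votes.most_common(top)]
```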

Semi-Automatic Image Annotation

2001

A novel approach to semi-automatically and progressively annotating images with keywords is presented. The progressive annotation process is embedded in the course of integrated keyword-based and content-based image retrieval and user feedback. When the user submits a keyword query and then provides relevance feedback, the search keywords are automatically added to the images that receive positive feedback, facilitating keyword-based image retrieval in the future. The coverage and quality of image annotation in such a database system improve progressively as search-and-feedback cycles accumulate. This semi-automatic strategy is more efficient than manual annotation and more accurate than automatic annotation. A performance study shows that high annotation coverage can be achieved with this approach, and a preliminary user study shows that users consider annotations important and are likely to use them in image retrieval. The user study also suggested user-interface enhancements needed to support relevance feedback. We believe that similar approaches could also be applied to annotating and managing other forms of multimedia objects.
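
The core propagation step is simple. As a sketch, assuming illustrative data structures (the paper describes the idea, not this code):

```python
# Hypothetical sketch of progressive annotation via relevance feedback:
# keywords from the query are attached to images the user marks as relevant,
# so future keyword searches can find them.

def record_feedback(annotations, query_keywords, feedback):
    """annotations: dict image_id -> set of keywords.
    feedback: dict image_id -> True (relevant) or False (not relevant)."""
    for image_id, relevant in feedback.items():
        if relevant:
            annotations.setdefault(image_id, set()).update(query_keywords)
    return annotations
```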

Enhancing Image Annotation Games by Using Asian WordNet

The ESP and Peekaboom games have successfully engaged annotators in labelling images and locating objects within them. The advantage of these games is that annotators feel they are playing rather than working hard to label enormous numbers of images. We extended the Peekaboom game by introducing the Asian WordNet (AWN) to expand each guessed label with its synonyms, hypernyms, hyponyms, meronyms and holonyms. This yields a higher match ratio. Moreover, AWN also provides semantic links between 13 Asian languages and English. Our method also shows potential for creating a cross-language image-labelling game.
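
The expansion idea can be sketched with a WordNet interface; the example below uses NLTK's English WordNet as a stand-in, since the Asian WordNet has no standard Python API that can be assumed here:

```python
# Hypothetical sketch of label expansion: a guess matches if it falls in the
# set of the target label's synonyms, hypernyms, hyponyms, meronyms and
# holonyms, which makes guess matching far more forgiving.
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

def expand_label(label):
    """Return the label plus lemmas of its related WordNet synsets."""
    expanded = {label.lower()}
    for synset in wn.synsets(label):
        related = ([synset] + synset.hypernyms() + synset.hyponyms()
                   + synset.part_meronyms() + synset.member_holonyms())
        for s in related:
            expanded.update(lemma.replace('_', ' ') for lemma in s.lemma_names())
    return expanded

print('feline' in expand_label('cat'))  # True, via the hypernym feline.n.01
```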