Stefan Rueger - Academia.edu
Papers by Stefan Rueger
Lecture Notes in Computer Science
Visual resource discovery modes are discussed with a view to applying them in a wide variety of digital multimedia collections. The paradigms include summarising complex multimedia objects such as TV news, information visualisation techniques for document clusters, visual search by example, relevance feedback and methods to create browsable structures within the collection. These exploration modes share three common features: they ...
Lecture Notes in Computer Science, 2005
Automated image annotation has arisen as a recent alternative to querying databases of natural images directly by image content, with the benefit that the content of a desired image can often be specified most conveniently with keywords or natural language. Such a facility can be ...
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010
... CONTENTS xi 5.4.3 INEX XML Multimedia Track 124 5.4.4 TRECVid 124 5.4.5 ImageCLEF 126 5.4 ... artwork that is not otherwise protected (see page 135) provided you acknowledge the source as Stefan Rüger (2010). ... are the wild oxen 4. And with you I did not 5. In our city 6. In ...
Lecture Notes in Computer Science, 2000
This paper presents a new approach to the problem of feature weighting for content-based image retrieval. If a query image admits of multiple interpretations, user feedback on the set of returned images can be an effective tool to improve retrieval performance in subsequent rounds. For this to work, however, the first result set has to include representatives of the semantic facet of interest. We will argue that relevance feedback techniques that fix the distance metric for the first retrieval round are semantically biased and may fail to distil relevant semantic facets, thus limiting the scope of relevance feedback. Our approach is based on the notion of the NN^k of a query image, defined as the set of images that are nearest neighbours of the query under some instantiation of a parametrised distance metric. Different neighbours may be viewed as representing different meanings of the query. By associating each NN^k with the parameters for which it was ranked closest to the query, the selection of relevant NN^k by a user provides us with parameters for the second retrieval round. We evaluate this two-step relevance feedback technique on two collections and compare it to an alternative relevance feedback method and to an oracle for which the optimal parameter values are known.
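The set of nearest neighbours described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the parametrised distance is a convex combination of per-feature distances and that the parameter space is sampled on a regular grid; the names `weight_grid` and `nnk_of_query` and the toy distances are hypothetical and are not taken from the paper.

```python
from itertools import product


def weight_grid(num_features, steps):
    """Yield weight vectors on a regular grid over the simplex (weights sum to 1)."""
    for combo in product(range(steps + 1), repeat=num_features):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def nnk_of_query(feature_distances, steps=4):
    """Collect every image that is the nearest neighbour for some weighting.

    feature_distances[f][i] is the distance between the query and image i
    under feature f. Returns {image index: [weight vectors that rank it first]}.
    """
    num_features = len(feature_distances)
    num_images = len(feature_distances[0])
    neighbours = {}
    for w in weight_grid(num_features, steps):
        # Combined distance under this weighting (assumed convex combination).
        combined = [
            sum(w[f] * feature_distances[f][i] for f in range(num_features))
            for i in range(num_images)
        ]
        best = min(range(num_images), key=combined.__getitem__)
        neighbours.setdefault(best, []).append(w)
    return neighbours


# Toy data: distances from one query to three images under two features.
colour_d = [0.1, 0.5, 0.9]
texture_d = [0.9, 0.4, 0.1]
neighbours = nnk_of_query([colour_d, texture_d], steps=4)
```

In this toy case each of the three images is ranked first for at least one weighting, and the stored weight vectors are what a user's selection would hand to a second retrieval round.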
Lecture Notes in Computer Science, 2002
... Retrieval A Comparative Evaluation of ... First, in contrast to a fully-fledged image that often requires segmentation to separate local objects from their ... very powerful shape representations, their computational costs make them ill-suited for the purpose of interactive retrieval of ...
Proceedings of the 12th European Conference on Visual Media Production - CVMP '15, 2015
Lecture Notes in Computer Science, 2004
In this paper, we explore approaches to multilingual information retrieval for Greek, Latin, and Old Norse texts. We also describe an information retrieval tool that allows users to formulate Greek, Latin, or Old Norse queries in English and display the results in an innovative clustering and visualization facility.
Lecture Notes in Computer Science, 2004
This paper describes interfaces for a suite of three recently developed techniques to facilitate content-based access to large image and video repositories. Two of these techniques involve content-based retrieval while the third technique is centered around a new browsing structure and forms a useful complement to the traditional query-by-example paradigm. Each technique is associated with its own user interface and allows for a different set of user interactions. The user can move between interfaces whilst executing a particular search and thus may combine the particular strengths of the different techniques. We illustrate each of the techniques using topics from the TRECVID 2003 contest.
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004
This paper reports on experimental results obtained from a performance comparison of feature combination strategies in content-based image retrieval. The use of Support Vector Machines is compared to CombMIN, CombMAX, CombSUM and BordaFuse combination strategies, all of which are evaluated on a carefully compiled set of Corel images and the TRECVID 2003 search task collection.
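The named combination strategies can be sketched as follows. This is a hedged illustration of the usual textbook definitions, not the paper's implementation: it assumes each feature yields a similarity score per image (higher is better, already on a comparable scale), and it uses one common Borda variant in which an item at position r of an n-item ranking receives n - r points. The function names and toy scores are hypothetical.

```python
def comb_fuse(score_lists, mode="sum"):
    """Fuse per-feature similarity scores with CombMIN, CombMAX or CombSUM.

    score_lists is a list of {image id: score} dicts, one per feature.
    Only images scored by every feature are fused. Returns ids, best first.
    """
    combine = {"min": min, "max": max, "sum": sum}[mode]
    docs = set(score_lists[0])
    for scores in score_lists[1:]:
        docs &= set(scores)
    fused = {d: combine([scores[d] for scores in score_lists]) for d in docs}
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


def borda_fuse(rankings):
    """Fuse ranked lists by Borda count: position r in an n-item list earns n - r points."""
    points = {}
    for ranking in rankings:
        n = len(ranking)
        for position, doc in enumerate(ranking):
            points[doc] = points.get(doc, 0) + (n - position)
    return sorted(points, key=lambda d: (-points[d], d))


# Toy similarity scores for three images under two features.
colour = {"a": 0.9, "b": 0.2, "c": 0.6}
texture = {"a": 0.3, "b": 0.8, "c": 0.7}
by_sum = comb_fuse([colour, texture], "sum")
by_min = comb_fuse([colour, texture], "min")
by_max = comb_fuse([colour, texture], "max")
by_borda = borda_fuse([["a", "c", "b"], ["c", "b", "a"]])
```

Score-based fusion depends on how the per-feature scores are normalised, whereas the Borda variant only uses rank positions; that difference is one reason the strategies can disagree on the same inputs.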
Proceedings of the ACM International Conference on Image and Video Retrieval - CIVR '10, 2010
We describe our experiments for the search task. Eight runs were submitted, all of them corresponding to the fully automated mode, without human interaction in the loop. The system was based on determining the distance from a query image to a pre-indexed collection of images to build a list of results ordered by similarity. We used four different metric measures and two different data normalisation approaches in our runs. We found that the results for all of the runs roughly match the median results achieved in this year's competition.
Proceedings of the ACM International Conference on Image and Video Retrieval - CIVR '10, 2010
In this paper, we explore different ways of formulating new evaluation measures for multi-label image classification when the vocabulary of the collection adopts the hierarchical structure of an ontology. We apply several semantic relatedness measures based on web-search engines, WordNet, Wikipedia and Flickr to the ontology-based score (OS) proposed in [22]. The final objective is to assess the benefit of integrating semantic distances to the OS measure. Hence, we have evaluated them in a real case scenario: the results (73 runs) ...
Lecture Notes in Computer Science, 2008
Lecture Notes in Computer Science, 2009
Lecture Notes in Computer Science, 2004
We have carried out a detailed evaluation of the use of texture features in a query-by-example approach to image retrieval. We used three radically different texture feature types motivated by i) statistical, ii) psychological and iii) signal processing points of view. The features were evaluated and tested on retrieval tasks from the Corel and TRECVID 2003 image collections. For the latter we also looked at the effects of combining texture features with a colour feature.
Lecture Notes in Computer Science, 2005
We have applied the concept of fractional distance measures, proposed by Aggarwal et al. [1], to content-based image retrieval. Our experiments show that these measures consistently outperform the more usual Manhattan and Euclidean distance metrics when used with a wide range of high-dimensional visual features. We used the parameters learnt from a Corel dataset on a variety of different collections, including the TRECVID 2003 and ImageCLEF 2004 datasets. We found that the specific optimum parameters varied but the general performance increase was consistent across all 3 collections. To squeeze the last bit of performance out of a system it would be necessary to train a distance measure for a specific collection. However, a fractional distance measure with parameter p = 0.5 will consistently outperform both L1 and L2 norms.
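A fractional distance measure of this kind is commonly written in the Minkowski form, d_p(x, y) = (sum_i |x_i - y_i|^p)^(1/p), with 0 < p < 1. The sketch below assumes that form; the helper names and the toy vectors are hypothetical and are only meant to show that the choice of p can change a ranking. For p < 1 the measure no longer satisfies the triangle inequality, so it is a dissimilarity rather than a metric.

```python
def minkowski_distance(x, y, p):
    """Minkowski-form dissimilarity; p = 1 is Manhattan, p = 2 Euclidean, p < 1 fractional."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def rank_by_distance(query, collection, p=0.5):
    """Return indices of collection items sorted by increasing distance to the query."""
    return sorted(
        range(len(collection)),
        key=lambda i: minkowski_distance(query, collection[i], p),
    )


# Toy feature vectors (hypothetical): the ranking changes with p.
query = [0.0, 0.0]
collection = [[3.0, 3.0], [5.0, 0.0], [1.0, 1.0]]
euclidean_order = rank_by_distance(query, collection, p=2)
fractional_order = rank_by_distance(query, collection, p=0.5)
```

With p = 0.5 a difference concentrated in a single dimension is penalised less than the same total difference spread over several dimensions, which is why the second and third ranks swap in this toy example.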
Proceedings of the international conference on Multimedia information retrieval - MIR '10, 2010
The creation of gold standard datasets is a costly business. Optimally, more than one judgment per document is obtained to ensure high-quality annotations. In this context, we explore how much annotations from experts differ from each other, how different sets of annotations influence the ranking of systems and whether these annotations can be obtained with a crowdsourcing approach. This study is applied to annotations of images with multiple concepts. A subset of the images employed in the latest ImageCLEF Photo Annotation competition was manually annotated by expert annotators and by non-experts with Mechanical Turk. The inter-annotator agreement is computed at an image-based and concept-based level using majority vote, accuracy and kappa statistics. Further, the Kendall τ and Kolmogorov-Smirnov correlation tests are used to compare the ranking of systems regarding different ground-truths and different evaluation measures in a benchmark scenario. Results show that while the agreement between experts and non-experts varies depending on the measure used, its influence on the ranked lists of the systems is rather small. To sum up, the majority vote, applied to generate one annotation set out of several opinions, is able to filter noisy judgments of non-experts to some extent. The resulting annotation set is of comparable quality to the annotations of experts.
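Two of the agreement computations mentioned here can be sketched briefly. The abstract does not say which kappa statistic was used, so this example assumes Cohen's kappa for two annotators and binary concept labels; the function names and the toy judgments are hypothetical.

```python
def majority_vote(votes):
    """Merge binary judgments (0/1) for one image-concept pair; ties resolve to 0."""
    return int(2 * sum(votes) > len(votes))


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's own label frequencies.
    categories = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy judgments: three non-expert votes per image, merged by majority vote.
crowd_votes = [[1, 1, 0], [0, 0, 1], [1, 0, 1]]
merged = [majority_vote(v) for v in crowd_votes]

# Toy labels from two annotators over eight images.
expert = [1, 1, 0, 0, 1, 0, 1, 0]
crowd = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(expert, crowd)
accuracy = sum(x == y for x, y in zip(expert, crowd)) / len(expert)
```

Accuracy here is 0.75 while kappa is 0.5, which illustrates why the measured agreement depends on the statistic chosen: kappa discounts the agreement expected by chance.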
Lecture Notes in Computer Science, 2004
This paper describes a novel interaction technique to support content-based image search in large image collections. The idea is to represent each image as a vertex in a directed graph. Given a set of image features, an arc is established between two images if there exists at least one combination of features for which one image is retrieved as the nearest neighbour of the other. Each arc is weighted by the proportion of feature combinations for which the nearest neighbour relationship holds. By thus integrating the retrieval results over all possible feature combinations, the resulting network helps expose the semantic richness of images and thus provides an elegant solution to the problem of feature weighting in content-based image retrieval. We give details of the method used for network generation and describe the ways a user can interact with the structure. We also provide an analysis of the network's topology and provide quantitative evidence for the usefulness of the technique.
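The graph construction described above can be sketched under stated assumptions. The sketch treats a "combination of features" as a weight vector from a regular grid over convex combinations of per-feature distances, and it directs each arc from the query image to its nearest neighbour; both choices, along with the name `nnk_network` and the toy distance matrices, are assumptions made for illustration rather than details confirmed by the abstract.

```python
from itertools import product


def nnk_network(pairwise, steps=4):
    """Build weighted arcs from per-feature pairwise distance matrices.

    pairwise[f][i][j] is the distance between images i and j under feature f.
    Returns {(i, j): proportion of sampled weightings for which j is the
    nearest neighbour of i}.
    """
    num_features = len(pairwise)
    num_images = len(pairwise[0])
    # Sampled weight vectors that sum to 1 (assumed parametrisation).
    grid = [
        tuple(c / steps for c in combo)
        for combo in product(range(steps + 1), repeat=num_features)
        if sum(combo) == steps
    ]
    counts = {}
    for i in range(num_images):
        others = [j for j in range(num_images) if j != i]
        for w in grid:
            nearest = min(
                others,
                key=lambda j: sum(w[f] * pairwise[f][i][j] for f in range(num_features)),
            )
            counts[(i, nearest)] = counts.get((i, nearest), 0) + 1
    return {arc: count / len(grid) for arc, count in counts.items()}


# Toy symmetric distance matrices for three images under two features.
colour = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
texture = [[0, 6, 1], [6, 0, 2], [1, 2, 0]]
arcs = nnk_network([colour, texture], steps=4)
```

Because every image has at least one outgoing arc and the weights of its outgoing arcs sum to 1, the arc weights can be read as how often each neighbour wins across the sampled feature weightings.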
Lecture Notes in Computer Science, 2005
Given a collection of images and a set of image features, we can build what we have previously termed NN^k networks by representing images as vertices of the network and by establishing arcs between any two images if and only if one is most similar to the other for some weighted combination of features. An earlier analysis of their structural properties revealed that the networks exhibit small-world properties, that is, a small distance between any two vertices and a high degree of local structure. This paper extends our analysis. In order to provide a theoretical explanation of these remarkable properties, we investigate explicitly how images belonging to the same semantic class are distributed across the network. Images of the same class correspond to subgraphs of the network. We propose and motivate three topological properties which we expect these subgraphs to possess and which can be thought of as measures of their compactness. Measurements of these properties on two collections indicate that these subgraphs tend indeed to be highly compact.
ACM SIGIR Forum, 2015