Small Sample Scene Categorization from Perceptual Relations
Related papers
Small Sample Scene Categorization from Perceptual Relations — Ilan Kadar and Ohad Ben-Shahar
2012
This paper addresses the problem of scene categorization while arguing that better and more accurate results can be obtained by endowing the computational process with perceptual relations between scene categories. We first describe a psychophysical paradigm that probes human scene categorization, extracts perceptual relations between scene categories, and suggests that these perceptual relations do not always conform to the semantic structure between categories. We then incorporate the obtained perceptual findings into a computational classification scheme, which takes inter-class relationships into account to obtain better scene categorization regardless of the particular descriptors with which scenes are represented. We present such improved classification results using several popular descriptors, we discuss why the contribution of inter-class perceptual relations is particularly pronounced for under-sampled training sets, and we argue that this mechanism may explain the ability of...
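One simple way to picture the inter-class mechanism this abstract describes (a hypothetical sketch, not the authors' actual formulation) is to blend per-class classifier scores with the scores of perceptually related classes through a similarity matrix, so that a category with few training examples can borrow evidence from its perceptual neighbors:

```python
# Hypothetical sketch: re-score per-class classifier outputs with a
# perceptual similarity matrix S (S[i, j] in [0, 1], S[i, i] = 1).
# Not the authors' exact method -- just one way to let scarce classes
# borrow strength from perceptually related ones.
import numpy as np

def smooth_scores(raw_scores, S, alpha=0.3):
    """Blend raw per-class scores with scores of perceptually similar classes.

    raw_scores : (n_samples, n_classes), e.g. SVM decision values
    S          : (n_classes, n_classes) perceptual similarity matrix
    alpha      : how much to trust the perceptual relations
    """
    S_norm = S / S.sum(axis=1, keepdims=True)   # row-stochastic similarities
    return (1 - alpha) * raw_scores + alpha * raw_scores @ S_norm.T

# toy usage: 2 samples, 3 scene classes; classes 0 and 1 perceptually close
scores = np.array([[2.0, 1.8, -1.0],
                   [0.1, 0.0,  0.2]])
S = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
print(smooth_scores(scores, S).argmax(axis=1))
```

With a larger alpha the classifier leans more heavily on the perceptual relations, which is plausibly most helpful exactly when the raw scores are unreliable, i.e. for under-sampled training sets.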
Categorization of natural scenes: Local vs. global information
Proceedings - APGV 2006: Symposium on Applied Perception in Graphics and Visualization, 2006
Understanding the robustness and rapidness of human scene categorization has been a focus of investigation in the cognitive sciences over the last decades. At the same time, progress in the area of image understanding has prompted computer vision researchers to design computational systems that are capable of automatic scene categorization. Despite these efforts, a framework describing the processes underlying human scene categorization that would enable efficient computer vision systems is still missing. In this study, we present both psychophysical and computational experiments that aim to make a further step in this direction by investigating the processing of local and global information in scene categorization. In a set of human experiments, categorization performance is tested when only local or only global image information is present. Our results suggest that humans rely on local, region-based information as much as on global, configural information. In addition, humans seem to integrate both types of information for intact scene categorization. In a set of computational experiments, human performance is compared to two state-of-the-art computer vision approaches that model either local or global information.
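To make the local/global distinction concrete, here is a minimal sketch (not the paper's implementation) contrasting a global, configural descriptor with an orderless, region-based one computed from the same image; all parameters (grid size, patch size, bin count) are illustrative:

```python
# Global descriptor: coarse spatial layout (a crude stand-in for holistic
# descriptors such as GIST). Local descriptor: an orderless pool of patch
# histograms -- texture without layout. Image values assumed in [0, 1].
import numpy as np

def global_descriptor(img, grid=4):
    h, w = img.shape
    cells = img[:h // grid * grid, :w // grid * grid]
    cells = cells.reshape(grid, h // grid, grid, w // grid)
    return cells.mean(axis=(1, 3)).ravel()      # one mean per grid cell

def local_descriptor(img, patch=16, bins=8):
    h, w = img.shape
    hists = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            p = img[y:y + patch, x:x + patch]
            hists.append(np.histogram(p, bins=bins, range=(0, 1))[0])
    return np.mean(hists, axis=0)               # pooled -> layout-free

img = np.random.rand(64, 64)                    # stand-in scene image
print(global_descriptor(img).shape, local_descriptor(img).shape)
```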
Visual scenes are categorized by function
Journal of experimental psychology. General, 2016
How do we know that a kitchen is a kitchen by looking? Traditional models posit that scene categorization is achieved through recognizing necessary and sufficient features and objects, yet there is little consensus about what these may be. However, scene categories should reflect how we use visual information. Therefore, we test the hypothesis that scene categories reflect functions, or the possibilities for actions within a scene. Our approach is to compare human categorization patterns with predictions made by both functions and alternative models. We collected a large-scale scene category distance matrix (5 million trials) by asking observers to simply decide whether 2 images were from the same or different categories. Using the actions from the American Time Use Survey, we mapped actions onto each scene (1.4 million trials). We found a strong relationship between ranked category distance and functional distance (r = .50, or 66% of the maximum possible correlation). The function ...
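The core analysis the abstract reports can be sketched as a rank correlation between the upper triangles of two category-by-category distance matrices. The matrices below are random placeholders standing in for the perceptual and function-based distances:

```python
# Hedged sketch of the analysis: rank-correlate a perceptual category
# distance matrix with a function-based distance matrix over category pairs.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 10                                             # number of scene categories
perceptual = rng.random((n, n)); perceptual = (perceptual + perceptual.T) / 2
functional = rng.random((n, n)); functional = (functional + functional.T) / 2

iu = np.triu_indices(n, k=1)                       # each category pair once
rho, p = spearmanr(perceptual[iu], functional[iu])
print(f"rank correlation r = {rho:.2f} (p = {p:.3f})")
```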
Traditional models of recognition and categorization proceed from registering low-level features, perceptually organizing that input, and linking it with stored representations. Recent evidence, however, suggests that this serial model may not be accurate, with object and category knowledge affecting rather than following early visual processing. Here, we show that the degree to which an image exemplifies its category influences how easily it is detected. Participants performed a two-alternative forced-choice task in which they indicated whether a briefly presented image was an intact or phase-scrambled scene photograph. Critically, the category of the scene is irrelevant to the detection task. We nonetheless found that participants “see” good images better, more accurately discriminating them from phase-scrambled images than bad scenes, and this advantage is apparent regardless of whether participants are asked to consider category during the experiment or not. We then demonstrate that good exemplars are more similar to same-category images than bad exemplars, influencing behavior in two ways: First, prototypical images are easier to detect, and second, intact good scenes are more likely than bad to have been primed by a previous trial.
Learning natural scene categories by selective multi-scale feature extraction
2010
Natural scene categorization from images represents a very useful task for automatic image analysis systems. In the literature, several methods have been proposed addressing this issue with excellent results. Typically, features of several types are clustered so as to generate a vocabulary able to describe the considered image collection in a multifaceted way. This vocabulary is formed by a discrete set of visual codewords whose co-occurrence and/or composition allows the scene category to be classified. A common drawback of these methods is that features are usually extracted from the whole image, disregarding whether they derive from the natural scene to be classified or from foreground objects that may be present in it but are not characteristic of the scene. As perceptual studies suggest, objects present in an image are not useful for natural scene categorization; rather, depending on their size, they introduce an important source of clutter.
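The vocabulary step the abstract refers to is the standard bag-of-visual-words construction; a generic sketch (with random stand-ins for real local descriptors) looks like this:

```python
# Generic bag-of-visual-words sketch: cluster pooled local features into a
# codebook, then describe each image as a normalized codeword histogram.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
descriptors = rng.random((5000, 128))        # pooled local features, all images
vocab = KMeans(n_clusters=200, n_init=10, random_state=0).fit(descriptors)

def bow_histogram(image_descriptors, vocab):
    """Quantize one image's local features against the codebook."""
    words = vocab.predict(image_descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()                 # normalized codeword histogram

print(bow_histogram(rng.random((300, 128)), vocab)[:5])
```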
Unsupervised Learning of Semantics of Object Detections for Scene Categorization
Advances in Intelligent Systems and Computing, 2014
Classifying scenes (e.g. into "street", "home" or "leisure") is an important but complicated task, because images come with variability, ambiguity, and a wide range of illumination and scale conditions. Standard approaches build an intermediate representation of the global image and learn classifiers on it. Recently, it has been proposed to represent an image as an aggregation of the objects it contains: the representation on which classifiers are trained is composed of many heterogeneous feature vectors derived from various object detectors. In this paper, we propose to study different approaches to efficiently learn contextual semantics out of these object detections. We use the features provided by Object-Bank [24] (177 different object detectors producing 252 attributes each), and show on several scene categorization benchmarks that careful combinations, which take the structure of the data into account, greatly improve over the original results (from +5 to +11%) while drastically reducing the dimensionality of the representation by 97% (from 44,604 to 1,000). We also show that the uncertainty of the object detectors hampers the use of external semantic knowledge to improve detector combination, unlike our unsupervised learning approach.
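An illustrative sketch of the kind of pipeline this abstract outlines: aggregate high-dimensional Object-Bank-style detector responses, compress them to a compact representation, and train a scene classifier on top. The data are random placeholders, and PCA is used here only as a generic stand-in for the paper's combination strategies:

```python
# Aggregate detector responses, compress, then classify. The paper reduces
# 44,604 -> 1,000 dimensions; this toy uses 100 components because it only
# has 200 samples. All data are random placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.random((200, 44604)).astype(np.float32)   # 177 detectors x 252 attrs
y = rng.integers(0, 8, size=200)                  # 8 scene classes (toy)

compact = PCA(n_components=100).fit_transform(X)  # unsupervised compression
clf = LinearSVC().fit(compact, y)                 # scene classifier on top
print(compact.shape, clf.score(compact, y))
```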
SceneNet: A Perceptual Ontology for Scene Understanding
Scene recognition systems which attempt to deal with a large number of scene categories currently lack proper knowledge about the perceptual ontology of scene categories and would enjoy a significant advantage from a perceptually meaningful scene representation. In this work we perform a large-scale human study to create "SceneNet", an online ontology database for scene understanding that organizes scene categories according to their perceptual relationships. This perceptual ontology suggests that perceptual relationships do not always conform to the semantic structure between categories, and it entails a lower dimensional perceptual space with a "perceptually meaningful" Euclidean distance, where each embedded category is represented by a single prototype. Using the SceneNet ontology and database we derive a computational scheme for learning a non-linear mapping of scene images into the perceptual space, where each scene image is closer to its category prototype than to any other prototype by a large margin. We then demonstrate how this approach facilitates improvements in large-scale scene categorization over state-of-the-art methods and existing semantic ontologies, and how it reveals novel perceptual findings about the discriminative power of visual attributes and the typicality of scenes.
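The stated large-margin objective can be sketched directly: learn a mapping so that each embedded image lies closer to its own category prototype than to any other prototype by a margin. The following is a minimal numpy illustration with a linear map and hinge loss, not SceneNet's actual training code:

```python
# Minimal margin-to-prototype sketch: linear embedding trained with a hinge
# loss so the correct prototype beats every other prototype by `margin`.
import numpy as np

rng = np.random.default_rng(0)
D, K, P = 64, 16, 10                        # input dim, embed dim, #categories
W = rng.normal(scale=0.1, size=(K, D))      # learnable linear embedding
prototypes = rng.normal(size=(P, K))        # one prototype per category

def margin_loss_grad(x, label, W, prototypes, margin=1.0):
    """Hinge loss: own-prototype distance must beat every other by `margin`."""
    z = W @ x                                          # embed the image
    d = ((z - prototypes) ** 2).sum(axis=1)            # squared distances
    loss, grad_z = 0.0, np.zeros_like(z)
    for j in range(len(prototypes)):
        if j == label:
            continue
        violation = margin + d[label] - d[j]
        if violation > 0:                              # margin not yet met
            loss += violation
            # d(d[label] - d[j])/dz = 2 * (prototypes[j] - prototypes[label])
            grad_z += 2.0 * (prototypes[j] - prototypes[label])
    return loss, np.outer(grad_z, x)                   # dL/dW for z = W @ x

x, label = rng.normal(size=D), 3
for _ in range(500):                                   # plain gradient descent
    loss, gW = margin_loss_grad(x, label, W, prototypes)
    W -= 0.001 * gW
print("final hinge loss:", margin_loss_grad(x, label, W, prototypes)[0])
```

A real system would train over many images with a non-linear mapping; the hinge structure of the objective is the part this sketch is meant to show.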
Natural scenes categorization by hierarchical extraction of typicality patterns
2007
Natural scene categorization of images represents a very useful task for automatic image analysis systems in a wide variety of applications. In the literature, several methods have been proposed addressing this issue with excellent results. Typically, features of several types are clustered so as to generate a vocabulary able to efficiently represent the considered image collection. This vocabulary is formed by a discrete set of visual codewords whose co-occurrence or composition allows the scene category to be classified. A common drawback of these methods is that features are usually extracted from the whole image, disregarding whether they derive from the scene to be classified or from scene-independent objects that may be present in it. As perceptual studies suggest, features from objects present in an image are not useful for scene categorization; rather, depending on their size, they introduce an important source of clutter. In this paper, a novel, multiscale, statistical approach for image representation aimed at scene categorization is presented. The method is able to select, at different scales, sets of features that represent the scene exclusively, disregarding non-characteristic clutter elements. The proposed procedure, based on a generative model, is then able to produce a robust representation scheme useful for image classification. The obtained results are very convincing and demonstrate the effectiveness of the approach even when considering only simple features such as local color histograms.
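One possible reading of the generative selection idea (an assumption for illustration, not the paper's exact model): fit a mixture model to scene-typical local features and keep only the features of a new image that the model scores as typical, discarding likely clutter:

```python
# Typicality filtering sketch: score local features under a Gaussian mixture
# fit to scene-typical patches; keep high-likelihood features, drop the rest.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
scene_feats = rng.normal(0, 1, size=(1000, 16))        # typical scene patches
gmm = GaussianMixture(n_components=5, covariance_type="diag",
                      random_state=0).fit(scene_feats)

new_feats = np.vstack([rng.normal(0, 1, size=(50, 16)),   # scene-like
                       rng.normal(4, 1, size=(50, 16))])  # clutter-like
typicality = gmm.score_samples(new_feats)              # log-likelihood each
keep = new_feats[typicality > np.percentile(typicality, 50)]
print(f"kept {len(keep)} of {len(new_feats)} patches as scene-typical")
```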
Visual recognition and categorization on the basis of similarities to multiple class prototypes
1999
One of the difficulties of object recognition stems from the need to overcome the variability in object appearance caused by pose and other factors, such as illumination. The influence of these factors can be countered by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Difficulties of another kind arise in daily life situations that require categorization, rather than recognition, of objects.
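The multi-prototype scheme in the title can be sketched generically (this is not the authors' specific model): each class stores several prototypes, e.g. views under representative viewing conditions, and a new input is assigned the class of its most similar prototype:

```python
# Nearest-prototype categorization with several prototypes per class.
import numpy as np

def classify(x, prototypes, labels):
    """prototypes: (n_protos, d); labels: class index of each prototype."""
    sims = prototypes @ x / (np.linalg.norm(prototypes, axis=1)
                             * np.linalg.norm(x) + 1e-9)  # cosine similarity
    return labels[np.argmax(sims)]

rng = np.random.default_rng(0)
protos = rng.normal(size=(6, 32))          # e.g. 3 classes x 2 views each
labels = np.array([0, 0, 1, 1, 2, 2])
print(classify(rng.normal(size=32), protos, labels))
```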