Sven Dickinson | University of Toronto
Papers by Sven Dickinson
We present a model-based framework for incremental, adaptive object shape estimation and tracking in monocular image sequences. Parametric structure and motion estimation methods usually assume a fixed class of shape representation (splines, deformable superquadrics, etc.) that is initialized prior to tracking. Since the model shape coverage is fixed a priori, the incremental recovery of structure is decoupled from tracking, thereby limiting both processes in their scope and robustness.
The recognition community has typically avoided bridging the representational gap between traditional, low-level image features and generic models. Instead, the gap has been artificially eliminated by either bringing the image closer to the models using simple scenes containing idealized, textureless objects or by bringing the models closer to the images using 3D CAD model templates or 2D appearance model templates.
A blackboard model of problem solving is applied in the design of a vision system by which an autonomous land vehicle (ALV) navigates roads. The ALV vision task consists of hypothesizing objects in a scene model and verifying these hypotheses using the vehicle's sensors. Object hypothesis generation is based on an a priori map, a planned route through the map, and the current state of the scene model.
We propose a computer vision system, called a "visual supervisor", capable of visually monitoring an environment with the goal of safety monitoring and task verification. We demonstrate our system in a spaceborne setting in which, from monocular video sequences, it 1) tracks and recognizes objects, 2) detects workspace violations, and 3) supervises in-progress tasks for anomalous situations, failures, or satisfactory progress.
Detecting contour closure, i.e., finding a cycle of disconnected contour fragments that separates an object from its background, is an important problem in perceptual grouping. Searching the entire space of possible groupings is intractable, and previous approaches have adopted powerful perceptual grouping heuristics, such as proximity and co-curvilinearity, to manage the search.
We have been developing a theory for the generic representation of 2-D shape, where structural descriptions are derived from the shocks (singularities) of a curve evolution process acting on bounding contours. We now apply the theory to the problem of shape matching. The shocks are organized into a directed, acyclic shock graph, and complexity is managed by attending to the most significant (central) shape components first.
We quantify the observation by Kender and Freudenstein (1987) that degenerate views occupy a significant fraction of the viewing sphere surrounding an object. This demonstrates that systems for recognition must explicitly account for the possibility of view degeneracy. We show that view degeneracy cannot be detected from a single camera viewpoint.
Prior to the advent of appearance-based recognition in the early 1990s, object categorization researchers modeled the prototypical shape of an object, seeking models that were invariant to changes in color, texture, and minor within-class shape deformation. While these categorical models were well-motivated, they could not be reliably recovered from real images of real objects, and eventually gave way to models based on recurring, local, appearance-based features.
We describe a robot vision system that achieves complex object recognition with two layers of behaviors, performing the tasks of planning and object recognition, respectively. The recognition layer is a pipeline in which successive stages take in images from a stereo head, recover relevant features, build intermediate representations, and deposit 3-D objects into a world model. Each stage is an independent process that reacts automatically to output from the previous stage.
In this paper, we track changes in the appearance of the object as it moves from one frame to the next. At a symbolic level, an aspect graph clusters all the views of an object into a set of topologically distinct classes in terms of which surfaces of an object are visible from a given viewpoint (Koenderink and van Doorn, 1979).
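The view-clustering idea behind an aspect graph can be illustrated with a minimal sketch. It is not the paper's method: it assumes a convex object (an axis-aligned cube), orthographic viewing, and a coarse grid of viewing directions, and it builds only the view classes (the nodes), not the transitions between them. The names `CUBE_FACES`, `visible_faces`, and `cluster_views` are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Outward normals of an axis-aligned cube's six faces (illustrative object).
CUBE_FACES = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def visible_faces(view_dir, faces=CUBE_FACES):
    """Faces of a convex object seen from direction view_dir (orthographic).

    view_dir points from the object toward the viewer. A face is visible
    when its outward normal has a positive dot product with view_dir.
    """
    return frozenset(
        name for name, normal in faces.items()
        if sum(a * b for a, b in zip(normal, view_dir)) > 0
    )

def cluster_views(view_dirs):
    """Group viewing directions into classes that share a visible-face set."""
    classes = defaultdict(list)
    for d in view_dirs:
        classes[visible_faces(d)].append(d)
    return dict(classes)

# Sample directions on a coarse grid, skipping the zero vector.
views = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
aspects = cluster_views(views)
```

Each key of `aspects` is one view class. On this grid the cube yields one-face, two-face, and three-face classes; the one- and two-face classes correspond to axis- or edge-aligned viewing directions.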
We present an approach to the top-down recognition of volumetric part-based 3-D objects from a single 2-D image. In previous work [11, 10], we focused on the unexpected object recognition problem and required a complete and consistent interpretation of the image, in terms of a set of 3-D volumetric parts, before matching recovered parts to object models. This requirement was both time-consuming and ambitious, particularly with less constrained images of scenes containing less constrained objects.
We present an algorithm for automatically constructing a decompositional shape model from examples. Unlike current approaches to structural model acquisition, in which one-to-one correspondences among appearance-based features are used to construct an exemplar-based model, we search for many-to-many correspondences among qualitative shape features (multi-scale ridges and blobs) to construct a generic shape model.
The mainstream object categorisation community relies heavily on object representations consisting of local image features, due to their ease of recovery and their attractive invariance properties. Object categorisation is therefore formulated as finding, that is, "detecting", a one-to-one correspondence between image and model features. This assumption breaks down for categories in which two exemplars may not share a single local image feature.
In previous work, singular points (or top points) in the scale space representation of generic images have proven valuable for image matching. In this paper, we propose a construction that encodes the scale space description of top points in the form of a directed acyclic graph. This representation allows us to utilize graph matching algorithms for comparing images represented in terms of top point configurations instead of using solely the top points and their features in a point matching algorithm, as was done previously.
The problem of segmenting image sequences based on 2D motion has been under study for many years now. Most early approaches were either region-based, doing some sort of robust motion estimation, or boundary-based, preferring instead to track the bounding contours of the moving image region. In this paper, we explore an approach based on a synergy between these two previous approaches.
We consider the use of medial surfaces to represent symmetries of 3-D objects. This allows for a qualitative abstraction based on a directed acyclic graph of components and also a degree of invariance to a variety of transformations including the articulation of parts. We demonstrate the use of this representation for 3-D object model retrieval. Our formulation uses the geometric information associated with each node along with an eigenvalue labeling of the adjacency matrix of the subgraph rooted at that node.
In this paper, we propose a new approach to learn structured visual compound models from shape-based feature descriptions. We use captioned text in order to drive the process of grouping boundary fragments detected in an image. In the learning framework, we transfer several techniques from computational linguistics to the visual domain and build on previous work in image annotation. A statistical translation model is used in order to establish links between caption words and image elements.
In this paper, we outline the design and development of a clinical radiograph image database to be indexed by image content. User-identified regions, such as a lesion in a dental radiograph, are matched automatically to similar image regions in a large clinical database. Regions are described by modal shape descriptors, while other image features, including radiodensity, texture, and size, are described using commonly available techniques.
We describe a geometric-flow-based algorithm for computing a dense oversegmentation of an image, often referred to as superpixels. It produces segments that, on one hand, respect local image boundaries, while, on the other hand, limiting undersegmentation through a compactness constraint. It is very fast, with complexity that is approximately linear in image size, and can be applied to megapixel-sized images with high superpixel densities in a matter of minutes.
Graphs derived from medial representations have been used for 2D object matching and retrieval with considerable success (Pelillo et al., 1999; Siddiqi et al., 1999b; Sebastian et al., 2001). In this chapter we consider the use of graphs derived from medial surfaces for 3D object matching and retrieval. The medial representation allows for a qualitative abstraction based on a directed acyclic graph of components and also a degree of invariance to a variety of transformations including the articulation of parts.