Visual tracking Research Papers - Academia.edu
Visual tracking of unknown objects in unconstrained video-sequences is extremely
challenging due to a number of unsolved issues. This thesis explores several of these
and examines possible approaches to tackle them.
The unconstrained nature of real-world input sequences creates huge variation in
the appearance of the target object due to changes in pose and lighting. Additionally,
the object can be occluded by either parts of itself, other elements of the scene, or
the frame boundaries. Observations may also be corrupted due to low resolution,
motion blur, large frame-to-frame displacement, or incorrect exposure or focus of the
camera. Finally, some objects are inherently difficult to track due to their low texture,
specular or transparent surfaces, or non-rigid deformations.
Conventional trackers depend heavily on the texture of the target. This causes issues
with transparent or untextured objects. Edge points can be used in cases where
standard feature points are scarce; these however suffer from the aperture problem. To
address this, the first contribution of this thesis explores the idea of virtual corners,
using pairs of non-adjacent line correspondences, tangent to edges in the image. Furthermore,
the chapter investigates the possibility of long-term tracking, introducing a
re-detection scheme to handle occlusions while limiting drift of the object model. The
outcome of this research is an edge-based tracker that copes with untextured objects,
full occlusions and sequences of significant length. Besides reporting excellent
results on standard benchmarks, the tracker is demonstrated to successfully track the
longest sequence published to date.
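The geometric idea behind a virtual corner can be illustrated with a minimal sketch. It assumes that each edge tangent is available as a line through two image points, and it takes the virtual corner to be the intersection of two such lines in homogeneous coordinates. The function names `line_through` and `virtual_corner` and the tolerance `eps` are hypothetical, chosen for illustration; the thesis may construct and match its lines differently.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through two image points."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def virtual_corner(line_a, line_b, eps=1e-9):
    """Intersection of two homogeneous lines, or None if they are (near-)parallel.

    eps is an illustrative, scale-dependent tolerance, not a value from the thesis.
    """
    x, y, w = np.cross(line_a, line_b)
    if abs(w) < eps:
        return None  # parallel tangents meet at infinity: no usable corner
    return np.array([x / w, y / w])

# Two tangents taken from non-adjacent edges: the line y = 0 and the line x = 2.
tangent_a = line_through((0.0, 0.0), (1.0, 0.0))
tangent_b = line_through((2.0, -1.0), (2.0, 1.0))
corner = virtual_corner(tangent_a, tangent_b)
```

A single edge point can only be localised across its edge, not along it, which is the aperture problem. The intersection of two non-parallel tangents is constrained in both directions, so it can play the role of a corner even where the image contains no corner-like texture.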
Some of the issues in visual tracking are caused by suboptimal utilisation of the
image information. The object of interest can easily occupy as little as ten or even one
percent of the video frame area, so a tracker that looks only at the target ignores
most of the image. This causes difficulties in challenging scenarios such
as sudden camera shakes or full occlusions. To improve tracking in such cases, the next
major contribution of this thesis explores relationships within the context of visual
tracking, with a focus on causality. These include causal links between the tracked
object and other elements of the scene such as the camera motion or other objects.
Properties of such relationships are identified in a framework based on information
theory. The resulting technique can be employed as a causality-based motion model to
improve the results of virtually any tracker.
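The abstract does not state which information-theoretic quantity the framework uses. As a rough illustration only, the sketch below substitutes a simple stand-in: the mutual information between one motion signal and a time-shifted copy of another, estimated from a histogram. The names `mutual_information` and `lagged_mi`, the bin count and the synthetic signals are all assumptions. Lagged mutual information indicates directed statistical dependence, not causality in any strict sense.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X; Y) in bits from a 2D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def lagged_mi(cause, effect, lag=1, bins=8):
    """Mutual information between cause[t] and effect[t + lag]."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    return mutual_information(cause[:-lag], effect[lag:], bins)

# Synthetic example: the object's image motion follows the camera motion
# one frame later, plus a little noise.
rng = np.random.default_rng(0)
camera = rng.normal(size=2000)
target = np.roll(camera, 1) + 0.1 * rng.normal(size=2000)

forward = lagged_mi(camera, target)    # camera now vs. target one frame later
backward = lagged_mi(target, camera)   # target now vs. camera one frame later
```

In this synthetic setup `forward` is much larger than `backward`, which is the kind of asymmetry a causality-based motion model could exploit: when the target is occluded, a signal that reliably precedes its motion still carries information about where it will appear.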
Significant effort has previously been devoted to rapid learning of object properties
on the fly. However, state-of-the-art approaches still often fail in cases such as
rapid out-of-plane rotations, when the appearance changes suddenly. One of the major
contributions of this thesis is a radical rethinking of the traditional wisdom of modelling
3D motion as appearance change. Instead, 3D motion is modelled as 3D motion.
This intuitive but previously unexplored approach provides new possibilities in visual
tracking research.
Firstly, 3D tracking is more general: large out-of-plane motion is often fatal for 2D
trackers, but helps 3D trackers to build better models. Secondly, the tracker's internal
model of the object can be used in many different applications, and the model could even
become the main motivation, with tracking supporting reconstruction rather than vice versa.
This effectively bridges the gap between visual tracking and Structure from Motion.
The proposed method is capable of successfully tracking sequences with extreme out-of-plane
rotation, which poses a considerable challenge to 2D trackers. This is done by creating
realistic 3D models of the targets, which then aid in tracking.
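The difference between the two views of motion can be made concrete with a small sketch. It assumes a rigid target represented as a 3D point set and an ideal pinhole camera. The helper names `rotation_y` and `project`, the focal length and the toy point set are illustrative assumptions rather than details of the proposed method.

```python
import numpy as np

def rotation_y(angle):
    """Rotation matrix for a turn of `angle` radians about the camera's y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def project(points, rotation, translation, focal=500.0):
    """Pinhole projection of rigidly moved 3D points onto the image plane.

    focal is an arbitrary illustrative focal length in pixels.
    """
    cam = points @ rotation.T + translation    # rigid motion: X' = R X + t
    return focal * cam[:, :2] / cam[:, 2:3]    # perspective division by depth

# Toy rigid model: three points on the object, placed 5 units in front of the camera.
model = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [-1.0, 0.0, 0.0]])
offset = np.array([0.0, 0.0, 5.0])

frontal = project(model, rotation_y(0.0), offset)
turned = project(model, rotation_y(np.pi / 2), offset)
```

A 2D tracker has to explain the difference between `frontal` and `turned` as a change in appearance. With a 3D model, the same difference is explained by one rotation and one translation applied to a fixed set of points.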
In the majority of the thesis, the assumption is made that the target's 3D shape is
rigid. This is, however, a relatively strong limitation. In the final chapter, tracking and
dense modelling of non-rigid targets is explored, demonstrating results in even more
generic (and therefore challenging) scenarios. This final advancement truly generalises
the tracking problem with support for long-term tracking of low-texture and non-rigid
objects in sequences with camera shake, shot cuts and significant rotation.
Taken together, these contributions address some of the major sources of failure
in visual tracking. The presented research advances the field of visual tracking, facilitating
tracking in scenarios which were previously infeasible. Excellent results are
demonstrated in these challenging scenarios. Finally, this thesis demonstrates that 3D
reconstruction and visual tracking can be used together to tackle difficult tasks.