Super-Trajectory for Video Segmentation

Semi-Supervised Video Object Segmentation with Super-Trajectories

IEEE Transactions on Pattern Analysis and Machine Intelligence

We introduce a semi-supervised video segmentation approach based on an efficient video representation called "super-trajectory". A super-trajectory corresponds to a group of compact point trajectories that exhibit consistent motion patterns, similar appearances, and close spatiotemporal relationships. We generate the compact trajectories using a probabilistic model, which handles occlusions and drifts effectively. To reliably group point trajectories, we adopt a modified version of the density-peaks-based clustering algorithm that captures rich spatiotemporal relations among trajectories during clustering. We further incorporate two intuitive mechanisms into the segmentation, reverse-tracking and object re-occurrence, for robustness and improved performance. Building on the proposed video representation, our segmentation method is discriminative enough to accurately propagate the initial annotations in the first frame onto the remaining frames. Our extensive experimental analyses on three challenging benchmarks demonstrate that, given the annotation in the first frame, our method is capable of extracting the target objects from complex backgrounds, and even re-identifying them after prolonged occlusions, producing high-quality video object segments.
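The grouping step builds on the standard density peaks clustering of Rodriguez and Laio. Below is a minimal sketch of that base algorithm over a precomputed trajectory distance matrix, assuming each trajectory has been summarized by a simple descriptor (mean position and mean velocity in the toy example); the paper's modified variant and its spatiotemporal relation terms are not reproduced here.

```python
import numpy as np

def density_peaks_cluster(D, d_c, n_centers):
    """Plain density-peaks clustering (Rodriguez & Laio, 2014) on a
    precomputed distance matrix D; a sketch, not the paper's modified variant."""
    n = D.shape[0]
    rho = (D < d_c).sum(axis=1) - 1            # local density: neighbors within d_c
    order = np.argsort(-rho)                   # indices by decreasing density
    delta = np.zeros(n)                        # distance to nearest higher-density point
    nearest_higher = np.full(n, -1)
    delta[order[0]] = D[order[0]].max()
    for i, idx in enumerate(order[1:], start=1):
        higher = order[:i]
        j = higher[np.argmin(D[idx, higher])]
        delta[idx] = D[idx, j]
        nearest_higher[idx] = j
    # cluster centers: points that are both dense and far from any denser point
    centers = np.argsort(-(rho * delta))[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # assign the rest, in decreasing density, to their nearest higher-density point's cluster
    for idx in order:
        if labels[idx] != -1:
            continue
        ref = nearest_higher[idx]
        labels[idx] = labels[ref] if ref >= 0 else labels[centers[np.argmin(D[idx, centers])]]
    return labels

# toy trajectory descriptors: (mean x, mean y, mean dx, mean dy) for two motion groups
rng = np.random.default_rng(0)
descs = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(5.0, 1.0, (20, 4))])
D = np.linalg.norm(descs[:, None] - descs[None, :], axis=-1)
print(density_peaks_cluster(D, d_c=2.0, n_centers=2))
```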

Video Object Segmentation via Super-trajectories

2017

We introduce a novel semi-supervised video segmentation approach based on an efficient video representation called "super-trajectory". Each super-trajectory corresponds to a group of compact trajectories that exhibit consistent motion patterns, similar appearance, and close spatiotemporal relationships. We generate trajectories using a probabilistic model, which handles occlusions and drifts in a robust and natural way. To reliably group trajectories, we adopt a modified version of the density-peaks-based clustering algorithm that captures rich spatiotemporal relations among trajectories during clustering. The presented video representation is discriminative enough to accurately propagate the initial annotations in the first frame onto the remaining video frames. Extensive experimental analysis on challenging benchmarks demonstrates that our method is capable of distinguishing the target objects from complex backgrounds and even re-identifying them after long-term occlusions.

Superframes, A Temporal Video Segmentation

2018 24th International Conference on Pattern Recognition (ICPR), 2018

The goal of video segmentation is to turn video data into a set of concrete motion clusters that can be easily interpreted as building blocks of the video. There is prior work on related topics such as detecting scene cuts in a video, but little specific research on clustering video data into a desired number of compact segments. It is more intuitive, and more efficient, to work with perceptually meaningful entities obtained from a low-level grouping process, which we call 'superframes'. This paper presents a new, simple, and efficient technique to detect superframes of similar content patterns in videos. We calculate content-motion similarity to obtain the strength of change between consecutive frames. With the help of an existing deep-model-based optical flow technique, the proposed method performs accurate motion estimation efficiently. We also propose two criteria for measuring and comparing the performance of different algorithms on various databases. Experimental results on videos from benchmark databases demonstrate the effectiveness of the proposed method.
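As a rough illustration of the change-strength idea, the sketch below scores each pair of consecutive frames with a weighted sum of mean optical-flow magnitude and a histogram distance, then cuts at the strongest changes. The weights, the histogram feature, and the use of OpenCV's Farnebäck flow in place of a deep flow model are all assumptions for the example, not the paper's exact pipeline.

```python
import numpy as np
import cv2

def change_strength(frames, w_motion=0.5):
    """Per-boundary change score between consecutive grayscale frames:
    a weighted sum of mean flow magnitude (motion) and L1 histogram
    distance (content). Illustrative features and weights only."""
    scores = []
    for prev, cur in zip(frames[:-1], frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, cur, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        motion = np.linalg.norm(flow, axis=2).mean()
        h_prev = cv2.calcHist([prev], [0], None, [32], [0, 256]).ravel()
        h_cur = cv2.calcHist([cur], [0], None, [32], [0, 256]).ravel()
        h_prev /= h_prev.sum() + 1e-8
        h_cur /= h_cur.sum() + 1e-8
        content = 0.5 * np.abs(h_prev - h_cur).sum()
        scores.append(w_motion * motion + (1.0 - w_motion) * content)
    return np.asarray(scores)

def superframe_boundaries(scores, n_segments):
    """Cut at the (n_segments - 1) strongest changes; returns start indices."""
    return np.sort(np.argsort(-scores)[:n_segments - 1]) + 1

# toy example: 20 random 64x64 frames with a content jump at frame 10
rng = np.random.default_rng(1)
frames = [rng.integers(0, 100, (64, 64), dtype=np.uint8) for _ in range(10)] + \
         [rng.integers(150, 255, (64, 64), dtype=np.uint8) for _ in range(10)]
print(superframe_boundaries(change_strength(frames), n_segments=2))
```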

A Novel Trajectory Clustering Approach for Motion Segmentation

Lecture Notes in Computer Science, 2010

We propose a novel clustering scheme for spatio-temporal segmentation of sparse motion fields obtained from feature tracking. The approach allows for the segmentation of meaningful motion components in a scene, such as short- and long-term motion of single objects, groups of objects, and camera motion. The method has been developed within a project on the analysis of low-quality archive films. We qualitatively and quantitatively evaluate the performance and the robustness of the approach. Results show that our method successfully segments the motion components even in particularly noisy sequences.

Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions

2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

In this paper, we propose a novel approach to extract primary object segments in videos in the 'object proposal' domain. The extracted primary object regions are then used to build object models for optimized video segmentation. The proposed approach has several contributions: First, a novel layered Directed Acyclic Graph (DAG) based framework is presented for detection and segmentation of the primary object in video. We exploit the fact that, in general, objects are spatially cohesive and characterized by locally smooth motion trajectories, to extract the primary object from the set of all available proposals based on motion, appearance and predicted-shape similarity across frames. Second, the DAG is initialized with an enhanced object proposal set where motion based proposal predictions (from adjacent frames) are used to expand the set of object proposals for a particular frame. Last, the paper presents a motion scoring function for selection of object proposals that emphasizes high optical flow gradients at proposal boundaries to discriminate between moving objects and the background. The proposed approach is evaluated using several challenging benchmark videos and it outperforms both unsupervised and supervised state-of-the-art methods.
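The third contribution, a motion scoring function that emphasizes high optical-flow gradients at proposal boundaries, can be illustrated compactly. The sketch below scores a proposal mask by the mean flow-gradient magnitude along its boundary; the feature and the boundary definition are simplified assumptions, not the paper's exact scoring function.

```python
import numpy as np

def boundary_flow_gradient_score(mask, flow):
    """Score an object proposal by the mean optical-flow gradient magnitude
    along its boundary: moving objects tend to show strong flow
    discontinuities where they meet the background. Simplified stand-in.

    mask: HxW boolean proposal mask; flow: HxWx2 dense flow field."""
    # flow gradient magnitude per pixel (summed over u and v channels)
    gy_u, gx_u = np.gradient(flow[..., 0])
    gy_v, gx_v = np.gradient(flow[..., 1])
    grad_mag = np.sqrt(gx_u**2 + gy_u**2 + gx_v**2 + gy_v**2)
    # boundary = mask pixels with at least one non-mask 4-neighbor
    m = mask.astype(bool)
    interior = m.copy()
    interior[1:-1, 1:-1] &= m[:-2, 1:-1] & m[2:, 1:-1] & m[1:-1, :-2] & m[1:-1, 2:]
    interior[0, :] = interior[-1, :] = interior[:, 0] = interior[:, -1] = False
    boundary = m & ~interior
    return grad_mag[boundary].mean() if boundary.any() else 0.0

# toy example: square proposal over a flow field that moves only inside the square
flow = np.zeros((50, 50, 2)); flow[15:35, 15:35, 0] = 3.0
mask = np.zeros((50, 50), bool); mask[15:35, 15:35] = True
print(boundary_flow_gradient_score(mask, flow))   # high score: boundary sits on a flow edge
```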

Motion Trajectory Segmentation via Minimum Cost Multicuts

2015 IEEE International Conference on Computer Vision (ICCV), 2015

For the segmentation of moving objects in videos, the analysis of long-term point trajectories has recently become very popular. In this paper, we formulate the segmentation of a video sequence based on point trajectories as a minimum cost multicut problem. Unlike the commonly used spectral clustering formulation, the minimum cost multicut formulation naturally allows optimizing not only the cluster assignment but also the number of clusters, while allowing for varying cluster sizes. In this setup, we provide a method to create a long-term point trajectory graph with attractive and repulsive binary terms and outperform state-of-the-art methods based on spectral clustering on the FBMS-59 dataset and on the motion subtask of the VSB100 dataset.
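The multicut objective is NP-hard, so it is typically attacked with heuristics. The sketch below runs a greedy edge-contraction heuristic on a toy trajectory graph with attractive (positive) and repulsive (negative) edge costs; it shows how the number of clusters falls out of the optimization rather than being fixed in advance, but it is not the solver used in the paper.

```python
from collections import defaultdict

def greedy_multicut(n_nodes, edges):
    """Greedy additive edge contraction for minimum cost multicut: repeatedly
    merge the two components joined by the largest positive total cost, and
    stop when every remaining inter-component cost is non-positive.

    edges: list of (u, v, cost); positive = attractive, negative = repulsive."""
    label = list(range(n_nodes))                   # component id per node
    cost = defaultdict(float)                      # aggregated inter-component costs
    for u, v, c in edges:
        if u != v:
            cost[(min(u, v), max(u, v))] += c
    while cost:
        (a, b), best = max(cost.items(), key=lambda kv: kv[1])
        if best <= 0:
            break
        label = [a if l == b else l for l in label]    # merge component b into a
        new_cost = defaultdict(float)
        for (u, v), c in cost.items():
            u = a if u == b else u
            v = a if v == b else v
            if u != v:
                new_cost[(min(u, v), max(u, v))] += c
        cost = new_cost
    remap = {c: i for i, c in enumerate(sorted(set(label)))}
    return [remap[l] for l in label]

# toy trajectory graph: two groups with attractive edges inside, repulsive across
edges = [(0, 1, 2.0), (1, 2, 1.5), (3, 4, 2.5), (0, 3, -1.0), (2, 4, -2.0)]
print(greedy_multicut(5, edges))   # -> [0, 0, 0, 1, 1]
```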

Self-supervised Video Object Segmentation

ArXiv, 2020

The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking). We make the following contributions: (i) we propose to improve the existing self-supervised approach with a simple, yet more effective, memory mechanism for long-term correspondence matching, which resolves the challenge caused by the disappearance and reappearance of objects; (ii) by augmenting the self-supervised approach with an online adaptation module, our method successfully alleviates tracker drift caused by spatio-temporal discontinuities, e.g. occlusions or dis-occlusions and fast motion; (iii) we explore the efficiency of self-supervised representation learning for dense tracking; surprisingly, we show that a powerful tracking model can be trained with as few as 100 raw video clips (equivalent to a duration of 11 minutes), indicating that low-level statistics are already effective for tracking tasks; (iv) we de...
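The core matching mechanism in this line of work is label propagation through a feature memory: query-frame pixels copy mask probabilities from the memory pixels they match best. The sketch below shows that basic step with top-k softmax attention over L2-normalized features; the memory layout, temperature, and top-k value are illustrative assumptions, and the paper's specific memory update and online adaptation modules are not reproduced.

```python
import numpy as np

def propagate_labels(mem_feats, mem_labels, qry_feats, temperature=0.07, topk=5):
    """Label propagation via feature matching.

    mem_feats:  (N, C) L2-normalized features of memory pixels
    mem_labels: (N, K) one-hot (or soft) object masks for memory pixels
    qry_feats:  (M, C) L2-normalized features of query-frame pixels
    returns:    (M, K) propagated mask probabilities."""
    sim = qry_feats @ mem_feats.T / temperature            # (M, N) affinities
    # keep only the top-k matches per query pixel, as is common in dense tracking
    idx = np.argpartition(-sim, topk - 1, axis=1)[:, :topk]
    gathered = np.take_along_axis(sim, idx, axis=1)
    weights = np.exp(gathered - gathered.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum('mk,mkc->mc', weights, mem_labels[idx])

# toy example: 2 memory pixels (object / background), 3 query pixels
mem_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
mem_labels = np.array([[1.0, 0.0], [0.0, 1.0]])            # columns: object, background
qry = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
qry = qry / np.linalg.norm(qry, axis=1, keepdims=True)
print(propagate_labels(mem_feats, mem_labels, qry, topk=2))
```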

First International Workshop on Video Segmentation - Panel Discussion

Lecture Notes in Computer Science, 2015

Interest in video segmentation has grown significantly in recent years, resulting in a large body of works along with advances in both methods and datasets. Progress in video segmentation would enable new approaches to building 3D object models from video, understanding dynamic scenes, robot-object interaction and several other high-level vision tasks. The workshop brought together a broad and representative group of video segmentation researchers working on a wide range of topics. This paper summarizes the panel discussion at the workshop, which focused on three questions: (1) Why does video segmentation currently not meet the performance of image segmentation and what difficulties prevent it from leveraging motion? (2) Is video segmentation a stand-alone problem or should it rather be addressed in combination with recognition and reconstruction? (3) Which are the right video segmentation subtasks the field should focus on, and how can we measure progress?

Motion segmentation by model-based clustering of incomplete trajectories

2011

In this paper, we present a framework for visual object tracking based on clustering trajectories of image key points extracted from a video. The main contribution of our method is that the trajectories are automatically extracted from the video sequence and they are provided directly to a model-based clustering approach.
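As a rough illustration of model-based clustering of key-point trajectories, the sketch below summarizes each trajectory by a descriptor computed over whatever frames it covers and fits a Gaussian mixture with scikit-learn. This is a crude stand-in: the descriptor, the mixture model, and the way incompleteness is sidestepped are assumptions for the example, not the paper's actual model.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_trajectories(trajectories, n_components=2):
    """Cluster key-point trajectories of varying length by summarizing each
    with (mean position, mean velocity) and fitting a Gaussian mixture.

    trajectories: list of (T_i, 2) arrays of (x, y) positions."""
    feats = []
    for traj in trajectories:
        vel = np.diff(traj, axis=0) if len(traj) > 1 else np.zeros((1, 2))
        feats.append(np.concatenate([traj.mean(axis=0), vel.mean(axis=0)]))
    feats = np.asarray(feats)
    return GaussianMixture(n_components=n_components, random_state=0).fit_predict(feats)

# toy example: one group moving right, one group (shorter tracks) moving up
rng = np.random.default_rng(3)
right = [np.cumsum(rng.normal([1, 0], 0.1, (20, 2)), axis=0) for _ in range(10)]
up = [np.cumsum(rng.normal([0, 1], 0.1, (8, 2)), axis=0) for _ in range(10)]
print(cluster_trajectories(right + up))
```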

Video Segmentation with Just a Few Strokes

2015 IEEE International Conference on Computer Vision (ICCV), 2015

As the use of videos is becoming more popular in computer vision, the need for annotated video datasets increases. Such datasets are required either as training data or simply as ground truth for benchmark datasets. A particular challenge in video segmentation is due to disocclusions, which hamper frame-to-frame propagation, in conjunction with non-moving objects. We show that a combination of motion from point trajectories, as known from motion segmentation, along with minimal supervision can largely help solve this problem. Moreover, we integrate a new constraint that enforces consistency of the color distribution in successive frames. We quantify user interaction effort with respect to segmentation quality on challenging ego motion videos. We compare our approach to a diverse set of algorithms in terms of user effort and in terms of performance on common video segmentation benchmarks.
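One concrete ingredient here is the constraint that the color distribution of the segmented region stays consistent between successive frames. The sketch below computes a simple penalty for violating that idea, comparing per-region color histograms with a chi-squared-style distance; the histogram binning and the distance measure are illustrative assumptions, not the paper's energy term.

```python
import numpy as np

def color_consistency_penalty(img_prev, mask_prev, img_cur, mask_cur, bins=16):
    """Penalty for how much the color distribution inside the segmented
    region changes between successive frames. Images are HxWx3 uint8,
    masks are boolean arrays of shape HxW."""
    def hist(img, mask):
        h, _ = np.histogramdd(img[mask].astype(float),
                              bins=(bins,) * 3, range=[(0, 256)] * 3)
        return h.ravel() / max(h.sum(), 1.0)
    p, q = hist(img_prev, mask_prev), hist(img_cur, mask_cur)
    # chi-squared-style distance between the two normalized histograms
    return 0.5 * np.sum((p - q) ** 2 / (p + q + 1e-8))

# toy example: the same textured patch segmented in two frames -> near-zero penalty
rng = np.random.default_rng(2)
img = rng.integers(0, 256, (40, 40, 3), dtype=np.uint8)
mask = np.zeros((40, 40), bool); mask[10:30, 10:30] = True
print(color_consistency_penalty(img, mask, img, mask))
```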