Supplemental Material: Efficient Unsupervised Temporal Segmentation of Human Motion

Efficient Unsupervised Temporal Segmentation of Human Motion

This work introduces an efficient method for fully automatic temporal segmentation of human motion sequences and similar time series. The method relies on a neighborhood graph to partition a given data sequence into distinct activities and motion primitives according to self-similar structures in that input sequence. In particular, the fast detection of repetitions within the discovered activity segments is a crucial problem for any motion processing pipeline directed at motion analysis and synthesis. The same similarity information in the neighborhood graph is further exploited to cluster these primitives into larger entities of semantic significance. The elements subject to this classification are then used as a prior for estimating the same target values for entirely unknown streams of data. The technique makes no assumptions about the motion sequences at hand, and no user interaction is required for the segmentation or clustering. Tests of our techniques are conducted on the CMU and HDM05 motion capture databases, demonstrating that our system can handle motion segmentation, clustering, motion synthesis and transfer-of-label problems in practice; the latter is an optional step that relies on the preexistence of a small set of labeled data.
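The role of the neighborhood graph can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed radius, the rule "cut where consecutive frames are not neighbours", and the function names neighborhood_graph, split_activities and repetitions are all assumptions made for illustration. Each frame is a small feature vector, and frames are linked when they lie within a given Euclidean distance of each other.

```python
import math

def neighborhood_graph(frames, radius):
    """For each frame i, list the other frames within `radius` of it.

    `frames` is a sequence of equal-length feature tuples. The brute-force
    O(n^2) search is only for illustration.
    """
    graph = []
    for i, fi in enumerate(frames):
        near = [j for j, fj in enumerate(frames)
                if j != i and math.dist(fi, fj) <= radius]
        graph.append(near)
    return graph

def split_activities(graph):
    """Assumed heuristic: start a new segment wherever two consecutive
    frames are not neighbours. Returns half-open (start, end) ranges."""
    cuts = [i + 1 for i in range(len(graph) - 1) if i + 1 not in graph[i]]
    bounds = [0] + cuts + [len(graph)]
    return list(zip(bounds, bounds[1:]))

def repetitions(graph, min_lag):
    """Pairs (i, j) of similar frames at least `min_lag` frames apart,
    i.e. off-diagonal structure that hints at a repeated motion."""
    return [(i, j) for i, near in enumerate(graph)
            for j in near if j - i >= min_lag]

# Toy 1-D features: pose A, then pose B, then a return to pose A.
frames = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,), (0.1,)]
graph = neighborhood_graph(frames, radius=0.5)
segments = split_activities(graph)
repeats = repetitions(graph, min_lag=3)
```

On the toy data the sketch yields three segments, and the last frame is linked back to the first three, which is the kind of self-similarity the abstract refers to.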

Efficient Unsupervised Temporal Segmentation of Motion Data

IEEE Transactions on Multimedia, 2016

We introduce a method for automated temporal segmentation of human motion data into distinct actions and compositing motion primitives based on self-similar structures in the motion sequence. We use neighbourhood graphs for the partitioning and the similarity information in the graph is further exploited to cluster the motion primitives into larger entities of semantic significance. The method requires no assumptions about the motion sequences at hand and no user interaction is required for the segmentation or clustering. In addition, we introduce a feature bundling preprocessing technique to make the segmentation more robust to noise, as well as a notion of motion symmetry for more refined primitive detection. We test our method on several sensor modalities, including markered and markerless motion capture as well as on electromyograph and accelerometer recordings. The results highlight our system's capabilities for both segmentation and for analysis of the finer structures of motion data, all in a completely unsupervised manner.

Aligned cluster analysis for temporal segmentation of human motion

2008

Temporal segmentation of human motion into actions is a crucial step for understanding and building computational models of human motion. Several issues contribute to the challenge of this task. These include the large variability in the temporal scale and periodicity of human actions, as well as the exponential nature of all possible movement combinations. We formulate the temporal segmentation problem as an extension of standard clustering algorithms.

Skeleton-based temporal segmentation of human activities from video sequences

This paper presents a new multi-step, skeleton-based approach for the temporal segmentation of human activities from video sequences. Several signals are first extracted from a skeleton sequence. These signals are then segmented individually to localize their cyclic segments. Finally, all individual segmentations are merged with respect to the global set of signals. Our approach requires no prior knowledge of human activities and can use any generic stick-model. Two different techniques for signal segmentation and for the fusion of the individual segmentations are proposed and tested on a database of fifteen video sequences of varying complexity. (Footnote 1: A skeleton is a generic graph in which each node has spatial coordinates. These coordinates change over time, thus forming a skeleton sequence.)

Unsupervised Temporal Segmentation of Repetitive Human Actions Based on Kinematic Modeling and Frequency Analysis

2015 International Conference on 3D Vision, 2015

In this paper, we propose a method for temporal segmentation of repetitive human actions based on frequency analysis of kinematic parameters, zero-velocity crossing detection, and adaptive k-means clustering. Since human motion data may be captured with different modalities that have different temporal sampling rates and accuracies (e.g., optical motion capture systems vs. Microsoft Kinect), we first apply a generic full-body kinematic model with an unscented Kalman filter to convert the motion data into a unified representation that is robust to noise. Furthermore, we extract the most representative kinematic parameters via primary frequency analysis. The sequences are segmented based on zero-velocity crossings of the selected parameters, followed by adaptive k-means clustering to identify the repetition segments. Experimental results demonstrate that, for motion data captured by both the motion capture system and the Microsoft Kinect, our proposed algorithm obtains robust segmentation of repetitive action sequences.
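The zero-velocity crossing step can be sketched as follows. This is an illustrative assumption, not the paper's code: velocity is approximated by a first-order finite difference of one kinematic parameter, plateaus with zero velocity are skipped, and the function names are hypothetical. The Kalman filtering, frequency analysis and k-means stages are not shown.

```python
def zero_velocity_crossings(signal):
    """Return sample indices where the finite-difference velocity
    changes sign, i.e. local extrema of the signal."""
    velocity = [b - a for a, b in zip(signal, signal[1:])]
    crossings = []
    prev_sign = 0
    for i, v in enumerate(velocity):
        sign = (v > 0) - (v < 0)
        if sign == 0:
            continue  # flat stretch: wait for the next non-zero velocity
        if prev_sign != 0 and sign != prev_sign:
            crossings.append(i)  # velocity flips when leaving sample i
        prev_sign = sign
    return crossings

def candidate_segments(signal):
    """Cut the signal at every crossing; returns (start, end) sample pairs."""
    bounds = [0] + zero_velocity_crossings(signal) + [len(signal) - 1]
    return list(zip(bounds, bounds[1:]))

# A triangular wave standing in for a joint angle during two repetitions.
joint_angle = [0, 1, 2, 1, 0, 1, 2, 1, 0]
cuts = zero_velocity_crossings(joint_angle)
parts = candidate_segments(joint_angle)
```

Each candidate segment here is a half cycle (one rise or one fall). Grouping such segments into full repetitions is the job the abstract assigns to the clustering stage.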

Temporal Human Action Segmentation via Dynamic Clustering

2018

We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has wide-ranging applications in areas such as robotics, motion analysis, and patient monitoring. Our proposed algorithm is unsupervised, fast, generic enough to process various types of features, and applicable in both the online and offline settings. We perform extensive experiments on processing data streams, and show that our algorithm achieves state-of-the-art results in both online and offline settings.

Temporal Motion Recognition and Segmentation Approach

Separating or segmenting complex activities into basic action primitives is important for event recognition and other applications. In this article, simple approaches are presented for appearance-based action recognition, as well as motion segmentation into its action primitives. Optical flow is computed and split into four channels based on four directions, namely, up, down, left, and right. Based on these four motion vectors, motion history and the corresponding energy templates are generated. These are used for action recognition. Moreover, to segment sequential activity, the temporal motion segmentation (TMS) method is proposed based on the concept of history templates. Based on the total pixel volumes on these templates and their related variations, various directions of the action primitives are segmented temporally. This segmentation method can assist an intelligent system or robot to understand activities and take decisions afterwards. It is a simple and real-time approach. Based on the presented experiments, this approach can be very useful in various application areas.
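Splitting optical flow into four directional channels can be sketched as a half-wave rectification of the two flow components. The sketch assumes image coordinates in which positive horizontal flow points right and positive vertical flow points down; that convention, the function name split_flow, and the nested-list representation are assumptions for illustration, and the computation of the flow itself and of the history and energy templates is not shown.

```python
def split_flow(u, v):
    """Split a dense flow field into four non-negative channels.

    `u` and `v` are 2-D lists holding the horizontal and vertical flow
    component of every pixel. Assumed convention: u > 0 is rightward
    motion and v > 0 is downward motion (image coordinates).
    """
    def positive_part(x):
        return x if x > 0 else 0.0

    return {
        "right": [[positive_part(x) for x in row] for row in u],
        "left": [[positive_part(-x) for x in row] for row in u],
        "down": [[positive_part(y) for y in row] for row in v],
        "up": [[positive_part(-y) for y in row] for row in v],
    }

# Two pixels: one moving right, one moving left and down.
u = [[1.5, -2.0]]
v = [[0.0, 3.0]]
channels = split_flow(u, v)
```

Because every channel is non-negative, opposite motions no longer cancel when values are accumulated over time, which is one reason this kind of split is convenient before building per-direction templates.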

A framework for evaluating motion segmentation algorithms

2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), 2017

Many algorithms for segmenting human whole-body motion have been proposed in the literature. However, the wide range of use cases, datasets, and quality measures used for their evaluation makes comparing the algorithms challenging. In this paper, we introduce a framework that puts motion segmentation algorithms on a unified testing ground and makes it possible to compare them. The testing ground features both a set of quality measures known from the literature and a novel approach tailored to the evaluation of motion segmentation algorithms, termed the Integrated Kernel approach. Datasets of motion recordings, provided with a ground truth, are included as well. They are labelled in a new way, which hierarchically organises the ground truth, to cover the different use cases that segmentation algorithms can address. The framework and datasets are publicly available and are intended as a service to the community for the comparison and evaluation of existing and new motion segmentation algorithms.

Human body motion segmentation in a complex scene

Pattern Recognition, 1987

Abstract: In this paper, a new technique for partitioning a human body, when it is in motion, into meaningful parts is presented. The technique is based on classifying coincidence edges, which are edges in both the difference picture and the current frame. Each class of edges has a specific voting scheme which can then be used for the identification of regions of interest. Experimental results show that the technique can eliminate a lot of stationary regions and thus can reduce the amount of processing required in the interpretation process.
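The definition of a coincidence edge (an edge present in both the difference picture and the current frame) can be sketched as below. The edge detector is a deliberately crude forward-difference threshold, chosen as an assumption to keep the example short; the function names and the threshold are hypothetical, and the classification of edges and the voting scheme from the abstract are not attempted.

```python
def edge_mask(img, thresh):
    """Mark a pixel as an edge when the summed absolute forward
    differences to its right and lower neighbours exceed `thresh`."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = img[r][c + 1] - img[r][c] if c + 1 < w else 0
            gy = img[r + 1][c] - img[r][c] if r + 1 < h else 0
            mask[r][c] = abs(gx) + abs(gy) > thresh
    return mask

def coincidence_edges(prev, curr, thresh):
    """Edges found in both the difference picture |curr - prev| and
    the current frame."""
    diff = [[abs(a - b) for a, b in zip(rp, rc)]
            for rp, rc in zip(prev, curr)]
    in_diff = edge_mask(diff, thresh)
    in_curr = edge_mask(curr, thresh)
    return [[a and b for a, b in zip(rd, rc)]
            for rd, rc in zip(in_diff, in_curr)]

# One image row: a stationary bright block (columns 0-1) and a block
# that moves from columns 3-4 to columns 6-7.
prev = [[9, 9, 0, 5, 5, 0, 0, 0, 0, 0]]
curr = [[9, 9, 0, 0, 0, 0, 5, 5, 0, 0]]
mask = coincidence_edges(prev, curr, thresh=2)
```

In this toy row only the two borders of the moving block at its new position survive. The stationary block's edge is absent from the difference picture, and the ghost edges at the old position are absent from the current frame.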

PREVIOUS WORK ON TEMPORAL VIDEO SEGMENTATION The work by Koprinska and Carrato

2014

Temporal segmentation of videos into meaningful image sequences containing particular activities is an interesting problem in computer vision. We present a novel algorithm to achieve this semantic video segmentation. The segmentation task is accomplished through event detection in a frame-by-frame processing setup. We propose using one-class classification (OCC) techniques to detect events that indicate a new segment, since they have proven successful in object classification and allow for unsupervised event detection in a natural way. Various OCC schemes have been tested and compared, and additionally, an approach based on temporal self-similarity maps (TSSMs) is presented. The testing was done on a challenging publicly available thermal video dataset. The results are promising and show the suitability of our approaches for the task of temporal video segmentation.