Modified Time Flexible Kernel for Video Activity Recognition using Support Vector Machines

A Time Flexible Kernel Framework for Video-Based Activity Recognition

Image and Vision Computing, 2016

This work deals with the challenging task of activity recognition in unconstrained videos. Standard methods encode low-level video features using Fisher Vectors or Bag of Features. However, these approaches represent every sequence as a single fixed-dimensional vector that lacks any long-term temporal information, which may be important for recognition, especially of complex activities. This work proposes a novel framework with two main technical novelties: first, a video encoding method that maintains the temporal structure of sequences, and second, a Time Flexible Kernel that allows comparison of sequences of different lengths and random alignment. Results on challenging benchmarks and comparison to previous work demonstrate the applicability and value of our framework.
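
To make the idea of comparing sequences of different lengths more concrete, the sketch below shows one simple temporally weighted kernel. It is an illustrative assumption and not the Time Flexible Kernel as defined in the paper: every pair of segments is compared with a linear base kernel, and each pair is weighted by a Gaussian of the difference between the segments' normalised temporal positions. The function name `time_weighted_kernel` and the parameter `sigma` are hypothetical.

```python
import numpy as np

def time_weighted_kernel(X, Y, sigma=0.2):
    """Compare two sequences of segment vectors that may differ in length.

    X: (n, d) array, Y: (m, d) array; each row encodes one video segment.
    sigma: width of the temporal weighting (hypothetical parameter).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Normalised temporal position of each segment centre, in (0, 1).
    tx = (np.arange(len(X)) + 0.5) / len(X)
    ty = (np.arange(len(Y)) + 0.5) / len(Y)
    # Gaussian weight: segments at similar relative times count more.
    W = np.exp(-((tx[:, None] - ty[None, :]) ** 2) / (2.0 * sigma ** 2))
    # Linear base kernel between every pair of segment vectors.
    B = X @ Y.T
    # Weighted sum over all pairs, normalised by both sequence lengths.
    return float(np.sum(W * B) / (len(X) * len(Y)))

# Usage: two sequences with 5 and 8 segments of 4-dimensional encodings.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = rng.normal(size=(8, 4))
k_xy = time_weighted_kernel(X, Y)
```

Because the weight is a product-style combination of two positive semi-definite kernels (a Gaussian on time and a linear kernel on features) summed over all pairs, a Gram matrix built this way should itself be positive semi-definite and could be passed to an SVM that accepts precomputed kernels.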

Video Activity Recognition Using Sequence Kernel Based Support Vector Machines

Lecture Notes in Computer Science, 2019

This paper addresses issues in performing video activity recognition using support vector machines (SVMs). A video comprises a sequence of sub-activities, where each sub-activity corresponds to a segment of the video. To build an activity recognizer, each segment is encoded into a feature vector, so a video is represented as a sequence of feature vectors. In this work, we propose to explore a GMM-based encoding scheme to encode a video segment into a bag-of-visual-words vector representation. We also propose to use the Fisher score vector as an encoded representation for a video segment. To build an SVM-based activity recognizer, it is necessary to use a suitable kernel that matches sequences of feature vectors. Such kernels are called sequence kernels. In this work, we propose several sequence kernels, namely the modified time flexible kernel, segment level pyramid match kernel, segment level probability sequence kernel and segment level Fisher kernel, for matching videos when segments are represented using an encoded feature vector representation. The effectiveness of the proposed sequence kernels in SVM-based activity recognition is studied using benchmark datasets.
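
The segment encoding step described above can be sketched as follows, under one common reading of GMM-based bag-of-visual-words encoding: each local descriptor is softly assigned to the GMM components, and the posteriors are averaged over the segment. This is an assumption made for illustration; the abstract does not give the exact scheme. The function names, the diagonal-covariance GMM, and the equal-length splitting of a video into segments are all hypothetical choices, and the GMM parameters are taken as given rather than trained here.

```python
import numpy as np

def gmm_posteriors(F, weights, means, variances):
    """Posterior probability of each GMM component for each descriptor.

    F: (T, d) local descriptors; weights: (K,); means: (K, d);
    variances: (K, d) diagonal covariances (assumed, not from the source).
    """
    F = np.asarray(F, dtype=float)
    diff = F[:, None, :] - means[None, :, :]                      # (T, K, d)
    log_prob = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                             # (T, K)
    log_joint = log_prob + np.log(weights)
    # Subtract the row-wise maximum for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def encode_segment(F, weights, means, variances):
    """Soft-assignment bag-of-visual-words vector: mean posterior per component."""
    return gmm_posteriors(F, weights, means, variances).mean(axis=0)

def encode_video(F, n_segments, weights, means, variances):
    """Split descriptors into n_segments chunks and encode each one.

    Assumes n_segments does not exceed the number of descriptors.
    """
    chunks = np.array_split(np.asarray(F, dtype=float), n_segments)
    return np.stack(
        [encode_segment(c, weights, means, variances) for c in chunks]
    )

# Usage: a toy 2-component GMM and a video whose two halves differ.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
F = np.vstack([np.zeros((4, 2)), np.full((4, 2), 10.0)])
V = encode_video(F, 2, weights, means, variances)  # one row per segment
```

Each row of `V` is a K-dimensional vector that sums to one, so a video becomes a sequence of such vectors that a sequence kernel can then compare.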