Event recognition using object-motion context
Related papers
Features Preserving Video Event Detection using Relative Motion Histogram of Bag of Visual Words
2016
Event detection in video is a research area that aims to build computer systems capable of automatically interpreting video and locating events in image sequences. There is currently strong demand for event detection based on motion relativity and feature selection. In this paper, we propose a predictive approach based on motion relativity and feature selection for video event detection. First, we propose a new motion feature, the Expanded Relative Motion Histogram of Bag-of-Visual-Words (ERMH-BoW), for motion relativity and event detection. In ERMH-BoW, the what aspect of an event is represented with a Bag-of-Visual-Words (BoW), and relative motion histograms between different visual words capture object activities, the how aspect of the event. ERMH-BoW thus integrates both the what and how aspects for a complete event description. Meanwhile, we show that by employing motion relativity, it is invaria...
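The core idea of pairing visual words by their relative motion can be sketched as follows. This is an illustrative simplification, not the paper's ERMH-BoW implementation: it assumes keypoints have already been quantised into visual-word labels and assigned motion vectors, and it bins the direction of the pairwise motion difference into a fixed number of orientation bins.

```python
import numpy as np

def relative_motion_histogram(words, motions, n_words=4, n_bins=8):
    """Accumulate a histogram of relative motion directions between
    every ordered pair of visual words (simplified ERMH-BoW-style feature).

    words   : (N,) int array, visual-word label per keypoint
    motions : (N, 2) float array, motion vector (dx, dy) per keypoint
    returns : (n_words, n_words, n_bins) histogram
    """
    hist = np.zeros((n_words, n_words, n_bins))
    for i in range(len(words)):
        for j in range(len(words)):
            if i == j:
                continue
            rel = motions[i] - motions[j]        # relative motion vector
            angle = np.arctan2(rel[1], rel[0])   # direction in [-pi, pi]
            b = int((angle + np.pi) / (2 * np.pi) * n_bins) % n_bins
            hist[words[i], words[j], b] += 1
    return hist
```

Because the feature depends only on motion differences, adding the same camera-motion vector to every keypoint leaves the histogram unchanged, which is the invariance property the abstract alludes to.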
Estimation and representation of accumulated motion characteristics for semantic event detection
Image Processing, IEEE International Conference, 2008
In this paper, a motion-based approach for detecting high-level semantic events in video sequences is presented. Its main characteristic is its generic nature, i.e. it can be directly applied to any possible domain of concern without the need for domain-specific algorithmic modifications or adaptations. For realizing event detection, the examined video sequence is initially segmented into
Semantic Video Model for Description, Detection and Retrieval of Visual Events
2010
This thesis explores the use of tools that support data semantics in the multimedia domain. The first contribution concerns the generation of high-level descriptions. We propose a high-level description language that allows events and objects to be defined from low-level features. The second contribution concerns the exploration of certain types of reasoning in the context of semantic multimedia. We propose a semantic language (based on fuzzy conceptual graphs) for describing videos and define the underlying reasoning mechanisms. The third contribution concerns semantic indexing and retrieval in multimedia databases. We propose a query language derived from deductive databases for expressing spatio-temporal and semantic queries.
2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014
Automatically segmenting and recognizing human activities from observations typically requires a very complex and sophisticated perception algorithm. Such systems are unlikely to be implemented on-line in a physical system, such as a robot, due to the pre-processing steps that those vision systems usually demand. In this work, we demonstrate that an appropriate semantic representation of the activity, without such complex perception systems, is sufficient to infer human activities from videos. First, we present a method to extract semantic rules based on three simple hand motions: move, not move, and tool use. Additionally, the object properties ObjectActedOn and ObjectInHand are used; such properties encapsulate the information of the current context. The above data is used to train a decision tree to obtain the semantic rules employed by a reasoning engine. This means we extract lower-level information from videos and reason about the intended human behaviors (high-level). The advantage of the abstract representation is that it allows more generic models of human behavior to be obtained, even when the information comes from different scenarios. The results show that our system correctly segments and recognizes human behaviors with an accuracy of 85%. Another important aspect of our system is its scalability and adaptability toward new activities, which can be learned on demand. Our system has been fully implemented on a humanoid robot, the iCub, to experimentally validate its performance and robustness during on-line execution.
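Rules of the kind the abstract describes (three hand-motion primitives plus object-context properties mapped to an activity label) can be written down directly. The rule bodies and activity names below are illustrative placeholders, not the rules the paper's decision tree actually learned:

```python
def infer_activity(hand_motion, object_acted_on, object_in_hand):
    """Toy reasoning over the three hand-motion primitives ('move',
    'not_move', 'tool_use') and two object-context booleans, in the
    spirit of the paper's semantic rules. Rule outcomes are invented
    for illustration.
    """
    if hand_motion == "not_move":
        return "idle"
    if hand_motion == "tool_use" and object_in_hand:
        return "using_tool"
    if hand_motion == "move" and object_in_hand:
        return "transporting"    # carrying the object somewhere
    if hand_motion == "move" and object_acted_on:
        return "reaching"        # moving toward an object
    return "unknown"
```

In the paper this mapping is learned by a decision tree rather than hand-coded, but the input space is the same: a discrete motion primitive plus the ObjectActedOn/ObjectInHand context.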
Improved Semantic-Based Human Interaction Understanding Using Context-Based Knowledge
Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on, 2013
This paper proposes a descriptive approach for context-based human activity analysis through a hierarchical framework in a scene understanding application. Each human movement, with respect to the person himself, others, and the scene, can give rise to different layers of human activity analysis, which are usually investigated separately depending on the application. Human behaviour cannot be analysed properly if all these layers of information are not considered. This study presents the effect of using the different layers of information to increase the accuracy of the analysis. The contributions are: using different information layers, such as human body-part movement and human-object interaction in 3D space, to improve human activity analysis; and proposing a probabilistic and descriptive model based on a well-known human movement descriptor and a Bayesian Network (BN) approach. Based on this framework, the model is generalizable and flexible, which is necessary for an applicable system. The capability of the proposed approach is presented in the experiments section.
Semantic Object Detection for Human Activity Monitoring System
Journal of Telecommunication, Electronic and Computer Engineering, 2018
Semantic object detection is significant for activity monitoring systems. Any abnormality occurring in a monitored area can be detected by applying semantic object detection, which determines any displaced objects in the monitored area. Many approaches towards better semantic object detection are being developed, but they are either resource-consuming, such as using costly sensors, or restricted to certain scenarios and backgrounds. We assume that scale structures and velocity can be estimated to define different states of activity. This project proposes the Histogram of Oriented Gradients (HOG) technique to extract feature points of semantic objects in the monitored area, while the Histogram of Oriented Optical Flow (HOOF) technique is used to annotate the current state of a semantic object involved in human-and-object interaction. Both passive and active objects are extracted using HOG, and the HOOF descriptor indicates the time-series status of the spatial a...
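The HOOF descriptor mentioned above has a compact standard form: bin optical-flow vectors by direction, weight each vector by its magnitude, and normalise the histogram. A minimal sketch of one common HOOF variant, assuming a dense flow field has already been computed (e.g. by an optical-flow estimator), not the paper's exact descriptor:

```python
import numpy as np

def hoof(flow, n_bins=8):
    """Histogram of Oriented Optical Flow.

    flow : (H, W, 2) array of per-pixel flow vectors (dx, dy)
    Bins flow vectors by direction, weights by magnitude, and
    normalises the histogram to sum to 1.
    """
    dx, dy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(dx, dy)                     # per-pixel flow magnitude
    ang = np.arctan2(dy, dx)                   # direction in [-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, weights=mag, minlength=n_bins)
    s = hist.sum()
    return hist / s if s > 0 else hist
```

Computing this per frame yields the time series of object-state descriptors that the abstract refers to.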
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2000
Understanding video events, the translation of low-level content in video sequences into high-level semantic concepts, is a research topic that has received much interest in recent years. Important applications of this work include smart surveillance systems, semantic video database indexing, and interactive systems. This technology can be applied to several video domains including: airport terminal, parking lot, traffic, subway stations, aerial surveillance, and sign language data. In this work we survey the two main components of the event understanding process: abstraction and event modeling. Abstraction is the process of molding the data into informative units to be used as input to the event model. Event modeling is devoted to describing events of interest formally and enabling recognition of these events as they occur in the video sequence. Event modeling can be further decomposed into the categories of pattern recognition methods, state event models, and semantic event models. In this survey we discuss this proposed taxonomy of the literature, offer a unifying terminology, and discuss popular abstraction schemes (e.g. Motion History Images) and event modeling formalisms (e.g. Hidden Markov Models) and their use in video event understanding, using extensive examples from the literature. Finally we consider the application domain of video event understanding in light of the proposed taxonomy, and propose future directions for research in this field.
A cognitive vision system for action recognition in office environments
2004
The emerging cognitive vision paradigm is concerned with vision systems that evaluate, gather and integrate contextual knowledge for visual analysis. In reasoning about events and structures, cognitive vision systems should rely on multiple computations in order to perform robustly even in noisy domains. Action recognition in an unconstrained office environment thus provides an excellent testbed for research on cognitive computer vision. In this contribution, we present a system that consists of several computational modules for object and action recognition. It applies attention mechanisms, visual learning and contextual as well as probabilistic reasoning to fuse individual results and verify their consistency. Database technologies are used for information storage and an XML based communication framework integrates all modules into a consistent architecture.
A context‐awareness model for activity recognition in robot‐assisted scenarios
Expert Systems, 2019
Context awareness in ambient assisted living programmes for the elderly is a cornerstone in the current scenario of noncustomized service robots distributed around the world. This research proposes a context-awareness system for human-robot scene interpretation based on seven primary contexts and the American Occupational Therapy Association framework. The context-awareness system defined here proposes an inference mechanism for activity recognition supported by hierarchical Bayesian networks. However, when the information from sensors increases, the associated computational cost also increases. Thus, an evaluation of different Bayesian network models is necessary to decrease the impact on robot performance. Two topological models have been built and tested using the OpenMarkov application: a two-level approach with an input-observations layer and an activity-recognition layer, and a three-layer model setting apart a primary-contexts layer, the input-observations layer, and the activity-recognition layer. The qualitative and quantitative results presented here show better performance in terms of memory in the three-layer model. Besides, its effect on a hybrid architecture of a robotic platform is presented.
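The bottom inference step of such a layered model can be illustrated with a minimal naive-Bayes update: activities at the top, conditionally independent sensor observations below. This is a drastic simplification of the paper's hierarchical Bayesian networks (the three-layer variant would insert a primary-contexts layer between the two), and all probabilities here are made up for illustration:

```python
import numpy as np

def activity_posterior(evidence, prior, cpt):
    """Posterior over activities given binary sensor evidence, assuming
    sensors are conditionally independent given the activity.

    evidence : list of observed sensor values (0 or 1), one per sensor
    prior    : (K,) prior over K activities
    cpt      : list of (K,) arrays, cpt[s][k] = P(sensor s = 1 | activity k)
    """
    post = np.array(prior, dtype=float)
    for s, e in enumerate(evidence):
        lik = cpt[s] if e else 1.0 - cpt[s]  # likelihood of this reading
        post *= lik
    return post / post.sum()                 # normalise to a distribution
```

The memory trade-off the abstract measures comes from how such conditional probability tables are factored across layers: more layers mean more, but smaller, tables.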