Multiple Human Objects Tracking in Crowded Scenes

Tracking multiple humans in crowded environment

Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

Tracking of humans in dynamic scenes has been an important topic of research. Most techniques, however, are limited to situations where humans appear isolated and occlusion is small. Typical methods rely on appearance models that must be acquired when the humans enter the scene and are not occluded. We present a method that can track humans in crowded environments with significant and persistent occlusion by making use of human shape models in addition to camera models, the assumption that humans walk on a plane, and acquired appearance models. Experimental results and a quantitative evaluation are included.

Segmentation and tracking of multiple humans in crowded environments

2008

Segmentation and tracking of multiple humans in crowded situations is made difficult by interobject occlusion. We propose a model-based approach to interpret the image observations by multiple partially occluded human hypotheses in a Bayesian framework. We define a joint image likelihood for multiple humans based on the appearance of the humans, the visibility of the body obtained by occlusion reasoning, and foreground/background separation. The optimal solution is obtained by using an efficient sampling method, data-driven Markov chain Monte Carlo (DDMCMC), which uses image observations for proposal probabilities. Knowledge of various aspects, including human shape, camera model, and image cues, is integrated in one theoretically sound framework. We present experimental results and quantitative evaluation, demonstrating that the resulting approach is effective for very challenging data.

Online multiple people tracking-by-detection in crowded scenes

7th International Symposium on Telecommunications (IST'2014), 2014

Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we present an online multiple people tracking-by-detection approach with a single camera. We detect objects with deformable part models and a visual background extractor. In the tracking phase we use a combination of support vector machine (SVM) person-specific classifiers, similarity scores, the Hungarian algorithm and inter-object occlusion handling. Detections are used to train the person-specific classifiers and to guide the trackers: a similarity score is computed from the classifiers and spatial information, and detections are assigned to the trackers with the Hungarian algorithm. To handle inter-object occlusion we use explicit occlusion reasoning. The proposed method does not require prior training and does not impose any constraints on environmental conditions. Our evaluation showed that the proposed method outperformed the state-of-the-art approaches by 10% and 15% or achieved comparable performance.
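To make the assignment step in this abstract concrete, here is a minimal sketch of matching detections to trackers by maximizing the total similarity score. It is not the authors' implementation: the `similarity` function (centre distance only, with an arbitrary 50-pixel scale), the `min_score` threshold and all names are assumptions, and the optimum is found by brute force over permutations, which gives the same result as the Hungarian algorithm for small inputs but does not scale the way that algorithm does.

```python
from itertools import permutations
import math


def similarity(track_centre, det_centre):
    """Hypothetical score in (0, 1]: decays with centre distance (assumed 50 px scale)."""
    tx, ty = track_centre
    dx, dy = det_centre
    return math.exp(-math.hypot(tx - dx, ty - dy) / 50.0)


def assign(tracks, detections, min_score=0.1):
    """Return (track_index, detection_index) pairs with maximal total similarity.

    Brute-force stand-in for the Hungarian algorithm; only suitable for a
    handful of tracks and detections.
    """
    scores = [[similarity(t, d) for d in detections] for t in tracks]
    n, m = len(tracks), len(detections)
    if n <= m:
        # Every track picks a distinct detection.
        candidates = (list(zip(range(n), p)) for p in permutations(range(m), n))
    else:
        # Every detection picks a distinct track.
        candidates = (list(zip(p, range(m))) for p in permutations(range(n), m))
    best_total, best_pairs = -1.0, []
    for pairs in candidates:
        total = sum(scores[t][d] for t, d in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    # Drop weak matches so unmatched detections can start new tracks.
    return [(t, d) for t, d in best_pairs if scores[t][d] >= min_score]


tracks = [(10, 10), (100, 100)]
detections = [(102, 98), (12, 9), (300, 300)]
matches = assign(tracks, detections)
```

Here the third detection stays unmatched, which is the case where a tracker would typically be initialized for a newly appearing person.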

Tracking Groups of People

Computer Vision and Image Understanding, 2000

A computer vision system for tracking multiple people in relatively unconstrained environments is described. Tracking is performed at three levels of abstraction: regions, people and groups. A novel, adaptive background subtraction method that combines color and gradient information is used to cope with shadows and unreliable color cues. People are tracked through mutual occlusions as they form groups and separate from one another. Strong use is made of color information to disambiguate occlusions and to provide qualitative estimates of depth ordering and position during occlusion. Simple interactions with objects can also be detected. The system is tested using both indoor and outdoor sequences. It is robust and should provide a useful mechanism for bootstrapping and reinitialization of tracking using more specific but less robust human models.

Improved Tracking of Multiple Humans with Trajectory Prediction and Occlusion Modeling

1998

A combined 2D, 3D approach is presented that allows for robust tracking of moving bodies in a given environment as observed via a single, uncalibrated video camera. Low-level features are often insufficient for detection, segmentation, and tracking of non-rigid moving objects. Therefore, an improved mechanism is proposed that combines low-level (image processing) and mid-level (recursive trajectory estimation) information obtained during the tracking process. The resulting system can segment and maintain the tracking of moving objects before, during, and after occlusion. At each frame, the system also extracts a stabilized coordinate frame of the moving objects. This stabilized frame can be used as input to motion recognition modules. The approach enables robust tracking without constraining the system to know the shape of the objects being tracked beforehand, although some assumptions are made about the characteristics of the shape of the objects and how they evolve with time. Experiments in tracking moving people are described.

Detection and matching of multiple occluded moving people for human tracking in colour video sequences

International Journal of …, 2011

The proposed approach aims to track multiple moving people in a colour video acquired with a single camera. The first phase of the approach consists in precisely detecting multiple humans inside moving foregrounds. The input to this phase is the foreground pixels extracted from the scene using any background subtraction technique. These moving foregrounds are then further segmented into multiple moving people using region segmentation and shape-based occlusion handling. The second phase assigns the detected human blobs to tracks using a robust matching process based on both an appearance model and a motion model. For this, we use a Kalman filter to predict future locations and sizes for dynamic persons and fuse this information with appearance-based comparison in order to assign each blob to a track. The preliminary experiments on several representative sequences have shown that this unsupervised approach can robustly detect and track multiple occluded moving persons, even at lower temporal resolution.
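The Kalman prediction mentioned in this abstract can be illustrated with a minimal constant-velocity filter for a single coordinate. This is a sketch under assumptions, not the paper's model: the state is position and velocity only, the process noise is a simplified diagonal `q`, and the class name and default noise values are hypothetical. A tracker would run one such filter per quantity it predicts (for example x, y, width and height).

```python
class ConstantVelocityKalman:
    """1D constant-velocity Kalman filter with state (position, velocity)."""

    def __init__(self, position, velocity=0.0, q=1e-2, r=1.0):
        self.p, self.v = position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance (assumed initial value)
        self.q = q  # process noise added to each diagonal term (simplification)
        self.r = r  # measurement noise variance

    def predict(self, dt=1.0):
        """Advance the state by dt and return the predicted position."""
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a position measurement z."""
        (a, b), (c, d) = self.P
        s = a + self.r            # innovation variance
        k0, k1 = a / s, c / s     # Kalman gain for position and velocity
        y = z - self.p            # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# A blob moving 2 pixels per frame along one axis.
kf = ConstantVelocityKalman(position=0.0)
for t in range(1, 21):
    kf.predict()
    kf.update(2.0 * t)
next_position = kf.predict()
```

The predicted position gives the region in which the next blob is expected, and it can then be combined with an appearance comparison before a blob is assigned to the track.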

Tracking people in crowded scenes across multiple cameras

2004

We present a novel approach for continuous detection and tracking of moving objects observed by multiple stationary cameras. We address the tracking problem by simultaneously modeling the motion and appearance of the moving objects. The object's appearance is represented using a color distribution model invariant to 2D rigid and scale transformations. It provides an efficient blob similarity measure for tracking. The motion models are obtained using a Kalman Filter process, which predicts the position of the moving object in both 2D and 3D. The tracking is performed by the maximization of a joint probability model reflecting the objects' motion and appearance. The novelty of our approach consists in integrating multiple cues and multiple views in a Joint Probability Data Association Filter for tracking a large number of moving people with partial and total occlusions. We demonstrate the performance of the proposed method on a soccer game captured by two stationary cameras.
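One simple way to see why a color distribution can serve as a blob similarity measure that tolerates rotation and scale is to compare normalized color histograms. The sketch below assumes a joint RGB histogram and the Bhattacharyya coefficient as the similarity; the abstract does not say which distribution model or measure the authors use, so both choices, along with the function names and bin count, are illustrative assumptions.

```python
from math import sqrt


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of (r, g, b) tuples with values in 0..255."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def bhattacharyya(h1, h2):
    """Similarity in [0, 1]; 1 means identical distributions."""
    return sum(sqrt(a * b) for a, b in zip(h1, h2))


# The same player seen small and large: pixel counts differ, proportions do not.
small_blob = [(200, 10, 10)] * 6 + [(10, 10, 200)] * 4
large_blob = [(200, 10, 10)] * 24 + [(10, 10, 200)] * 16
other_blob = [(10, 200, 10)] * 10

same_score = bhattacharyya(color_histogram(small_blob), color_histogram(large_blob))
diff_score = bhattacharyya(color_histogram(small_blob), color_histogram(other_blob))
```

Because the histogram discards pixel positions and is normalized by the pixel count, the score is unchanged when the blob is rotated, translated or rescaled, provided the color proportions stay the same.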

Detection and Tracking of Multiple Objects in Cluttered Backgrounds with Occlusion Handling

The Fourth International Conference on Computer Science Engineering and Applications, 2014

Segmentation and tracking are two important aspects of visual surveillance systems. Many barriers, such as cluttered backgrounds, camera movements, and occlusion, make robust detection and tracking a difficult problem, especially in the case of multiple moving objects. Object detection in the presence of camera noise and with variable or unfavourable luminance conditions is still an active area of research. This paper proposes a framework which can effectively detect the moving objects and track them despite occlusion and without a priori knowledge of the objects in the scene. The segmentation step uses a robust threshold decision algorithm which uses a multi-background model. The video object tracking is able to track multiple objects along with their trajectories based on Continuous Energy Minimization. In this work, an effective formulation of multi-target tracking as minimization of a continuous energy is combined with multi-background registration. Unlike recent approaches, it focuses on making use of an energy that corresponds to a more complete representation of the problem, rather than one that is amenable to global optimization. Besides the image evidence, the energy function considers physical constraints, such as target dynamics, mutual exclusion, and track persistence. The proposed tracking framework is able to track multiple objects despite occlusions under dynamic background conditions.

Tracking multiple people with recovery from partial and total occlusion

Pattern Recognition, 2005

Robust tracking of multiple people in video sequences is a challenging task. In this paper, we present an algorithm for tracking faces of multiple people even in cases of total occlusion. Faces are detected first; then a model for each person is built. The models are handed over to the tracking module, which is based on the mean shift algorithm, where each face is represented by the non-parametric distribution of the colors in the face region. The mean shift tracking algorithm is robust to partial occlusion and rotation, and is computationally efficient, but it does not deal with the problem of total occlusion. Our algorithm overcomes this problem by detecting the occlusion using an occlusion grid, and uses a non-parametric distribution of the color of the occluded person's clothing to distinguish that person after the occlusion ends. Our algorithm uses the speed and the trajectory of each occluded person to predict the locations that should be searched after occlusion ends. It integrates multiple features to handle tracking multiple people in cases of partial and total occlusion. Experiments on a large set of video clips demonstrate the robustness of the algorithm, and its capability to correctly track multiple people even when faces are temporarily occluded by other faces or by other objects in the scene.
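The two mechanisms named in this abstract, mean shift localization and motion-based prediction of where to search after an occlusion, can be sketched as follows. The sketch assumes a precomputed per-pixel weight map (in a real tracker the weights would come from comparing the face color model with the image), a square window and linear extrapolation of the last known velocity. The function names and parameters are hypothetical, and the occlusion grid itself is not modeled.

```python
def mean_shift(weights, cx, cy, half=2, max_iter=20):
    """Move a (2*half+1)-wide square window to the local weighted centroid.

    weights is a 2D list indexed as weights[y][x]; returns the final (x, y).
    """
    height, width = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sx = sy = 0.0
        for y in range(max(0, cy - half), min(height, cy + half + 1)):
            for x in range(max(0, cx - half), min(width, cx + half + 1)):
                w = weights[y][x]
                total += w
                sx += w * x
                sy += w * y
        if total == 0:
            break  # no support in the window, e.g. the target is fully occluded
        nx, ny = round(sx / total), round(sy / total)
        if (nx, ny) == (cx, cy):
            break  # converged
        cx, cy = nx, ny
    return cx, cy


def predict_search_centre(last_position, velocity, frames_occluded):
    """Linearly extrapolate the last known motion over the occlusion (assumption)."""
    (x, y), (vx, vy) = last_position, velocity
    return round(x + vx * frames_occluded), round(y + vy * frames_occluded)


# A 10x10 weight map with a 3x3 target centred at x=7, y=6.
weight_map = [[0.0] * 10 for _ in range(10)]
for y in (5, 6, 7):
    for x in (6, 7, 8):
        weight_map[y][x] = 1.0
weight_map[6][7] = 2.0

found = mean_shift(weight_map, 5, 5)
search_centre = predict_search_centre((5, 5), (1.0, 0.5), frames_occluded=4)
```

When the window contains no weight at all, the loop stops without moving, which is why a separate prediction of the search location is needed once the target is fully hidden.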