Real-time multisensor people tracking for human-robot spatial interaction

RGB-D, Laser and Thermal Sensor Fusion for People Following in a Mobile Robot

International Journal of Advanced Robotic Systems, 2013

Detecting and tracking people is a key capability for robots that operate in populated environments. In this paper, we present a multisensor fusion approach that combines three kinds of sensors mounted on a mobile platform in order to detect people: an RGB-D camera, a laser range finder and a thermal sensor. The Kinect sensor offers a rich data set at low cost; however, its use on a mobile platform is limited, mainly because the Kinect algorithms for people detection rely on images captured by a static camera. To cope with these limitations, this work combines the Kinect with a Hokuyo laser and a thermopile array sensor. A real-time particle filter merges the information provided by the sensors and estimates the position of the target using probabilistic leg and thermal patterns, image features and optical flow. Experiments carried out with a mobile platform in a science museum show that combining different sensory cues increases the reliability of the people-following system.
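
The abstract names the fusion architecture without giving the filter equations. Below is a minimal, hypothetical sketch of the sampling-importance-resampling cycle such a system implies, where each sensory cue (legs, thermal pattern, image features, optical flow) is assumed to contribute an independent likelihood function; the motion noise and cue interfaces are illustrative, not the paper's values:

```python
import numpy as np

def particle_filter_step(particles, weights, cues, motion_std=0.05):
    """One predict-weight-resample cycle of a multi-cue particle filter.

    particles : (N, 2) array of candidate target positions (x, y) in metres.
    cues      : list of likelihood functions, one per sensory cue (e.g. leg
                pattern, thermal blob, image features); each maps an (N, 2)
                array of positions to an (N,) array of likelihoods.
    """
    n = len(particles)
    # Predict: diffuse particles with a simple Gaussian motion model.
    particles = particles + np.random.normal(0.0, motion_std, particles.shape)
    # Update: fuse cues by multiplying likelihoods (independence assumption).
    for likelihood in cues:
        weights = weights * likelihood(particles)
    weights = weights / weights.sum()
    # Resample (systematic) to concentrate particles on probable positions.
    positions = (np.arange(n) + np.random.uniform()) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), positions), n - 1)
    particles, weights = particles[idx], np.full(n, 1.0 / n)
    estimate = particles.mean(axis=0)   # fused target position
    return particles, weights, estimate
```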

Fast RGB-D people tracking for service robots

Autonomous Robots, 2014

Service robots have to robustly follow and interact with humans. In this paper, we propose a very fast multi-people tracking algorithm designed for mobile service robots. Our approach exploits RGB-D data and runs in real time at a very high frame rate on a standard laptop, without the need for a GPU implementation. It also features a novel depth-based sub-clustering method that makes it possible to detect people within groups or even standing near walls. Moreover, to limit drift and track ID switches, we propose an online-learned appearance classifier featuring a three-term joint likelihood. We compared the performance of our system with a number of state-of-the-art tracking algorithms on two public datasets, acquired with three static Kinects and a moving stereo pair, respectively. In order to validate the 3D accuracy of our system, we created a new dataset in which RGB-D data are acquired by a moving robot. We made this dataset publicly available; it is not only annotated by hand, but the ground-truth positions of people and robot are also acquired with a motion capture system, so that tracking accuracy and precision can be evaluated in 3D coordinates. Results of experiments on these datasets show that, even without a GPU, our approach achieves state-of-the-art accuracy and superior speed.

[Fig. 1: Example of the system output. (a) A 3D bounding box is drawn for every tracked person on the RGB image; (b) the corresponding 3D point cloud is shown together with the estimated people trajectories.]
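
The depth-based sub-clustering step is only named in the abstract, not specified. Below is a rough sketch of one common way to realize the idea, splitting a merged point-cloud cluster at head-height local maxima; all thresholds and the assignment rule are illustrative assumptions, not the paper's method:

```python
import numpy as np

def depth_subcluster(points, bin_size=0.1, min_height=1.3, head_radius=0.3):
    """Split one merged point-cloud cluster into per-person sub-clusters.

    points : (N, 3) array (x, y, z) with z up; the cluster may contain
             several people standing close together or near a wall.
    Returns a list of index arrays, one per detected person candidate.
    """
    # Project onto the ground plane and keep the max height per 2-D bin.
    xy = np.floor(points[:, :2] / bin_size).astype(int)
    keys, inv = np.unique(xy, axis=0, return_inverse=True)
    top = np.full(len(keys), -np.inf)
    np.maximum.at(top, inv, points[:, 2])
    # Head candidates: bins tall enough for a standing person, locally maximal.
    heads = []
    centres = (keys + 0.5) * bin_size
    for i in np.argsort(-top):
        if top[i] < min_height:
            break
        if all(np.linalg.norm(centres[i] - h) > head_radius for h in heads):
            heads.append(centres[i])
    if not heads:
        return [np.arange(len(points))]
    # Assign every point to the nearest head candidate in the ground plane.
    d = np.linalg.norm(points[:, None, :2] - np.asarray(heads)[None], axis=2)
    labels = d.argmin(axis=1)
    return [np.where(labels == k)[0] for k in range(len(heads))]
```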

Real-Time Multiple Human Perception with Color-Depth Cameras on a Mobile Robot

The ability to perceive humans is an essential requirement for safe and efficient human-robot interaction. In real-world applications, the need for a robot to interact in real time with multiple humans in a dynamic, 3-D environment presents a significant challenge. The recent availability of commercial color-depth cameras allows for the creation of a system that makes use of the depth dimension, enabling a robot to observe its environment and perceive in 3-D space. Here we present a system for 3-D multiple human perception in real time from a moving robot equipped with a color-depth camera and a consumer-grade computer. Our approach reduces computation time to achieve real-time performance through a unique combination of new ideas and established techniques. We remove the ground and ceiling planes from the 3-D point cloud input to separate candidate point clusters. We introduce a novel information concept, depth of interest, which we use to identify candidates for detection while avoiding the computationally expensive scanning-window methods of other approaches. We utilize a cascade of detectors to distinguish humans from objects, intelligently reusing intermediary features in successive detectors to reduce computation. Because some methods carry a high computational cost, we represent our candidate tracking algorithm with a decision directed acyclic graph, which allows us to use the most computationally intense techniques only where necessary. We detail the successful implementation of our novel approach on a mobile robot and examine its performance in scenarios with real-world challenges, including occlusion, robot motion, non-upright humans, humans leaving and re-entering the field of view (i.e., the re-identification challenge), and human-object and human-human interaction. We conclude that by incorporating depth information and using modern techniques in new ways, we are able to create an accurate system for real-time 3-D perception of humans by a mobile robot.
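
As a rough illustration of the pipeline front end described above, the sketch below removes the ground and ceiling and separates candidate clusters. It substitutes a crude percentile band in z for proper plane fitting and a 2-D occupancy grid for full 3-D clustering, so it approximates the idea rather than reproducing the paper's implementation:

```python
import numpy as np
from scipy import ndimage

def candidate_clusters(points, plane_tol=0.08, cell=0.15):
    """Remove ground/ceiling bands, then separate candidate clusters.

    points : (N, 3) array (x, y, z) with z up. Assumes a roughly level
             floor and ceiling, so both appear as narrow bands in z.
    """
    z = points[:, 2]
    floor, ceiling = np.percentile(z, 1), np.percentile(z, 99)
    keep = (z > floor + plane_tol) & (z < ceiling - plane_tol)
    pts = points[keep]
    # Occupancy grid on the ground plane; connected components = candidates.
    ij = np.floor(pts[:, :2] / cell).astype(int)
    ij -= ij.min(axis=0)
    grid = np.zeros(ij.max(axis=0) + 1, dtype=bool)
    grid[ij[:, 0], ij[:, 1]] = True
    labels, n = ndimage.label(grid)                 # 4-connectivity
    point_labels = labels[ij[:, 0], ij[:, 1]]
    return [pts[point_labels == k] for k in range(1, n + 1)]
```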

2D Laser and 3D Camera Data Integration and Filtering for Human Trajectory Tracking

2021 IEEE/SICE International Symposium on System Integration (SII), 2021

This paper presents a robust human trajectory tracking method based on the data integration of a 2D laser scanner and a 3D camera. The main data streams of the human tracking system are deep learning-based human detections from the 3D camera, mapped onto the point cloud of the depth information to build 3D bounding boxes around humans, and state-of-the-art 2D laser-based leg detections. A human-oriented global nearest neighbour (HOGNN) data association, inspired by Hall's proxemics, was developed to improve both the 3D camera-based and the 2D laser-based human detection techniques. Dual Kalman filters are employed to track the human trajectory in parallel. The integration of the 3D camera-based and 2D laser-based human tracking is the key function of the system, providing real-time feedback both for the HOGNN, to reduce false positives of the camera-based and laser-based human detection, and for the Kalman filter, to enhance the quality of the human trajectory tracking un...
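
The abstract does not detail the HOGNN beyond its proxemics inspiration, so the sketch below shows only the generic gated global-nearest-neighbour pattern it builds on; the 1.2 m gate is a hypothetical stand-in for a proxemics-derived threshold:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

PERSONAL_SPACE = 1.2  # metres; gate radius loosely inspired by Hall's proxemics

def associate(tracks, detections, gate=PERSONAL_SPACE):
    """Global nearest neighbour association with a proxemics-inspired gate.

    tracks, detections : (T, 2) and (D, 2) arrays of (x, y) positions.
    Returns (matches, unmatched_tracks, unmatched_detections); detections
    farther than `gate` from every track are treated as false positives
    or new people rather than forced matches.
    """
    if len(tracks) == 0 or len(detections) == 0:
        return [], list(range(len(tracks))), list(range(len(detections)))
    cost = np.linalg.norm(tracks[:, None] - detections[None], axis=2)
    rows, cols = linear_sum_assignment(cost)      # optimal 1-to-1 assignment
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= gate]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_t = [t for t in range(len(tracks)) if t not in matched_t]
    unmatched_d = [d for d in range(len(detections)) if d not in matched_d]
    return matches, unmatched_t, unmatched_d
```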

People Detection and Tracking Using LIDAR Sensors

Robotics, 2019

The tracking of people is an indispensable capability in almost any robotic application. A relevant case is the @home robotic competitions, where service robots have to demonstrate skills that allow them to interact with the environment and the people who occupy it; for example, receiving the people who knock at the door and attending to them as appropriate. Many of these skills are based on the ability to detect and track a person. It is a challenging problem, particularly when implemented using low-definition sensors, such as Laser Imaging Detection and Ranging (LIDAR) sensors, in environments where several people are interacting. This work describes a solution based on a single LIDAR sensor to maintain a continuous identification of a person in time and space. The system described is based on the People Tracker package, a.k.a. PeTra, which uses a convolutional neural network to identify people's legs in complex environments. A new feature has been includ...
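
PeTra's network architecture is not described in this excerpt, but its input is a laser scan rendered as an image for the CNN. The sketch below shows a plausible rasterisation step for that kind of pipeline; the grid size and resolution are assumptions, not the package's actual parameters:

```python
import numpy as np

def scan_to_grid(ranges, angle_min, angle_inc, size=256, resolution=0.02):
    """Rasterise a 2-D laser scan into an occupancy image.

    A PeTra-style leg detector feeds a CNN with images like this one;
    the grid parameters here are illustrative only.
    """
    ranges = np.asarray(ranges, dtype=float)
    angles = angle_min + angle_inc * np.arange(len(ranges))
    valid = np.isfinite(ranges) & (ranges > 0)
    x = ranges[valid] * np.cos(angles[valid])
    y = ranges[valid] * np.sin(angles[valid])
    grid = np.zeros((size, size), dtype=np.float32)
    # Centre the sensor in the image; one cell = `resolution` metres.
    i = (x / resolution + size / 2).astype(int)
    j = (y / resolution + size / 2).astype(int)
    inside = (i >= 0) & (i < size) & (j >= 0) & (j < size)
    grid[i[inside], j[inside]] = 1.0
    return grid  # ready to be fed to a leg-segmentation CNN
```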

Long-Term Estimation of Human Spatial Interactions Through Multiple Laser Ranging Sensors

For robots to naturally cohabit human spaces, as well as to interact with single humans or groups of humans, they should be able to navigate in ways that are human-friendly and appropriate to the social norms of human spatial interaction, such as keeping personal spaces. For that purpose, we are developing an extension of path planning, which we call social path planning, that admits humans or groups as obstacles or goals. In order to tune our simulation results, we are acquiring a natural human interaction dataset through measurements from multiple laser ranging sensors positioned at an indoor crossroads space. We describe our system, consisting of spatial and temporal alignment algorithms for multiple laser sensors, as well as foreground detection, sensor data fusion, segmentation, tracking, two-legged position and pose estimation, and event detection. The method presented can easily be extended to larger spaces and applied to many other application domains beyond our main goal of learning optimal spatial interaction behaviours for human-robot interaction.
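
The foreground-detection stage lends itself to a compact illustration. The sketch below maintains a per-beam background range for a static laser sensor and flags noticeably shorter returns as people; the margin and adaptation rate are illustrative, and the paper's actual method may differ:

```python
import numpy as np

class RangeBackground:
    """Per-beam background model for a static laser ranging sensor.

    Beams returning ranges noticeably shorter than the learned background
    are foreground (people); a slow update absorbs drift. Parameter values
    are illustrative assumptions.
    """
    def __init__(self, n_beams, margin=0.15, alpha=0.001):
        self.bg = np.full(n_beams, np.inf)
        self.margin = margin   # metres a return must undercut the background
        self.alpha = alpha     # slow exponential background adaptation

    def update(self, ranges):
        # Initialise uninitialised beams from the first scan that hits them.
        first = ~np.isfinite(self.bg)
        self.bg[first] = ranges[first]
        foreground = ranges < self.bg - self.margin
        # Adapt the background only where no foreground is present.
        still = ~foreground & np.isfinite(ranges)
        self.bg[still] = (1 - self.alpha) * self.bg[still] + self.alpha * ranges[still]
        return foreground  # boolean mask over the scan's beams
```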

Laser-Based Tracking of Human Position and Orientation Using Parametric Shape Modeling

Advanced Robotics, 2009

Robots designed to interact socially with people require reliable estimates of human position and motion. Additional pose data such as body orientation may enable a robot to interact more effectively by providing a basis for inferring contextual social information such as people's intentions and relationships. To this end, we have developed a system for simultaneously tracking the position and body orientation of many people, using a network of laser range finders mounted at torso height. An individual particle filter is used to track the position and velocity of each human, and a parametric shape model representing the person's cross-sectional contour is fit to the observed data at each step. We demonstrate the system's tracking accuracy quantitatively in laboratory trials, and we present results from a field experiment observing subjects walking through the lobby of a building. The results show that our method can closely track torso and arm movements even with noisy and incomplete sensor data, and we present examples of social information observable from this orientation and positioning information that may be useful for social robots.
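
The parametric shape model itself is not reproduced in the abstract. As a simplified stand-in, the sketch below fits the principal axes of a torso-height cross-section and reads the facing direction off the minor axis; the paper's model is richer (it also captures arm positions), so this conveys only the core geometric idea:

```python
import numpy as np

def torso_orientation(points):
    """Estimate body orientation from a torso-height laser cross-section.

    points : (N, 2) array of (x, y) hits on one person's contour. A torso
    cross-section is roughly elliptical, wider shoulder-to-shoulder than
    front-to-back, so the minor principal axis approximates the facing
    direction (up to a 180-degree ambiguity that velocity can resolve).
    """
    centred = points - points.mean(axis=0)
    cov = centred.T @ centred / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    minor_axis = eigvecs[:, 0]               # short axis = front-back direction
    theta = np.arctan2(minor_axis[1], minor_axis[0])
    return theta  # radians, modulo pi
```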

Human Detection & Following by a Mobile Robot using 3D Features

Human-robot interaction is one of the most basic requirements for service robots. In order to provide the desired service, these robots are required to detect and track human beings in the environment. This paper presents a novel approach for classifying a target person in a crowded environment. The system performs human detection and following through multi-sensor data fusion of a stereo camera and a laser range finder (LRF). Our system tracks a human being by gathering features of the human upper body and face in 3D space from the stereo camera, and uses the laser range finder to obtain leg data. Using these data, our system distinguishes the target person from other human beings in the environment. We use Haar cascade classifiers for the detection of the upper body and face, and the stereo camera for obtaining dimensions in 3D space. The approach for gathering leg data is based on the recognition of leg patterns extracted from the laser scan. Tracking of the target person is done using the CamShift algorithm. Combining all these techniques, we present a novel approach for target person classification and tracking. Our approach is feasible for mobile robots with an identical device arrangement.
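
As a minimal sketch of the detect-then-track loop described above, the code below seeds OpenCV's CamShift tracker from a Haar upper-body detection. The laser leg data and stereo 3D features that the paper fuses are omitted, and the cascade file and parameters are standard OpenCV defaults rather than the authors' configuration:

```python
import cv2

# OpenCV ships this cascade with its data package.
upper_body = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_upperbody.xml")

def detect_and_track(cap):
    """Detect an upper body with a Haar cascade, then follow it with CamShift."""
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
    track_window, hist = None, None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if track_window is None:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            boxes = upper_body.detectMultiScale(gray, 1.1, 4)
            if len(boxes) == 0:
                continue
            x, y, w, h = boxes[0]
            track_window = (x, y, w, h)
            # Hue histogram of the detected region seeds the CamShift tracker.
            roi = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
            hist = cv2.calcHist([roi], [0], None, [180], [0, 180])
            cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
        else:
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
            rot_box, track_window = cv2.CamShift(prob, track_window, term)
            yield rot_box  # rotated rectangle around the tracked person
```

A caller would simply iterate, e.g. `for box in detect_and_track(cv2.VideoCapture(0)): ...`; in the paper's full system, the CamShift estimate would be fused with the LRF leg detections rather than used alone.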

Person Detection, Tracking and Identification by Mobile Robots Using RGB-D Images

2015

This dissertation addresses the use of RGB-D images for six important tasks of mobile robots: face detection, face tracking, face pose estimation, face recognition, person detection and person tracking. These topics have been widely researched in recent years because they provide mobile robots with abilities necessary to communicate with humans in natural ways. RGB-D images from a Microsoft Kinect camera are expected to play an important role in improving both the accuracy and the computational cost of the proposed algorithms for mobile robots. We contribute several applications of the Microsoft Kinect camera for mobile robots and show their effectiveness through realistic experiments on our mobile robots. An important component for mobile robots to interact with humans in a natural way is real-time multiple face detection. Various face detection algorithms for mobile robots have been proposed; however, almost none of them meet the requirements of accuracy and speed needed to run in real time on a robot platform. Within the scope of our research, we have developed a method that combines the color and depth images provided by a Kinect camera with navigation information for face detection on mobile robots. We demonstrate several experiments on challenging datasets. Our results show that this method improves accuracy and computational cost, and that it runs in real time in indoor environments. Tracking faces in uncontrolled environments remains a challenging task because both the face and the background change quickly over time, and the face often moves through different illumination conditions. RGB-D images are beneficial for this task because the mobile robot can easily estimate the face size and improve the performance of face tracking at different distances between the mobile robot and the human. In this dissertation, we present a real-time algorithm for mobile robots to track human faces accurately, even though humans can move freely and far away from the camera or pass through different illumination conditions in uncontrolled environments. We combine an adaptive correlation filter (Bolme et al. (2010)) with Viola-Jones object detection (Viola and Jones (2001b)) to track the face. Furthermore, we introduce a new technique for face pose estimation, which is applied after tracking the face. On the tracked face, the adaptive correlation filter with Viola-Jones object detection is also applied to reliably track the facial features, including the two external eye corners and the nose. These facial features provide geometric cues for estimating the face pose robustly. We carefully analyze the accuracy of these approaches on different datasets and show how they can robustly run on a mobile robot in uncontrolled environments.
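
The face-tracking combination described here (an adaptive correlation filter corrected by Viola-Jones detections) can be approximated with off-the-shelf OpenCV parts. In the sketch below, the MOSSE tracker from opencv-contrib's `legacy` module stands in for the Bolme et al. filter, and the periodic re-detection interval is an assumption, not the dissertation's scheme:

```python
import cv2

# Viola-Jones detector (ships with OpenCV) plus a MOSSE correlation tracker
# (available in opencv-contrib's `legacy` module in recent versions).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def track_face(cap, redetect_every=30):
    """Track a face with MOSSE, re-detecting periodically to correct drift."""
    tracker, frame_idx = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if tracker is None or frame_idx % redetect_every == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = face_cascade.detectMultiScale(gray, 1.2, 5)
            if len(faces):
                tracker = cv2.legacy.TrackerMOSSE_create()
                tracker.init(frame, tuple(int(v) for v in faces[0]))
        if tracker is not None:
            found, box = tracker.update(frame)
            if found:
                yield box  # (x, y, w, h) of the tracked face
            else:
                tracker = None  # lost: fall back to detection next frame
        frame_idx += 1
```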

People Detection from Laser Range Finder Data

2019

The integration of robots into our daily lives has become a reality over the last few years. The quality of this integration depends on the perception those robots have of the environments where they are placed. In a social setting, understanding their surroundings therefore requires identifying the people present in the environment under analysis. This dissertation addresses that purpose, people detection, more specifically the detection of their legs. In contrast to some prior work in this context, the working principle of the method presented in this dissertation seeks, above all, to estimate the disposition of the legs during the various stages of a normal walk. Those dispositions, performed by any passer-by, are represented by geometric patterns that slide along all the data collected by a Laser Range Finder. The main goal of this sliding process is to check whether there is some correspondence between the patterns and the data collected, which ref...
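
The sliding geometric patterns are not specified further in this excerpt. The sketch below illustrates the general idea in its simplest form: segment the scan at range discontinuities, keep leg-width segments, and pair nearby legs into person hypotheses; all thresholds are illustrative guesses, not the dissertation's tuned values:

```python
import numpy as np

def find_leg_pairs(ranges, angles, jump=0.12, leg_width=(0.05, 0.25), max_gap=0.45):
    """Look for two-leg geometric patterns in a single LRF scan.

    ranges, angles : 1-D arrays describing one scan. Splits the scan into
    segments at range discontinuities, keeps segments whose chord width
    matches a leg, and pairs legs closer together than a walking step.
    """
    ranges, angles = np.asarray(ranges, float), np.asarray(angles, float)
    pts = np.column_stack([ranges * np.cos(angles), ranges * np.sin(angles)])
    breaks = np.where(np.abs(np.diff(ranges)) > jump)[0] + 1
    segments = np.split(pts, breaks)
    legs = []
    for seg in segments:
        if len(seg) < 3:
            continue
        width = np.linalg.norm(seg[-1] - seg[0])   # chord across the segment
        if leg_width[0] <= width <= leg_width[1]:
            legs.append(seg.mean(axis=0))
    people = []
    for i in range(len(legs)):
        for j in range(i + 1, len(legs)):
            if np.linalg.norm(legs[i] - legs[j]) <= max_gap:
                people.append((legs[i] + legs[j]) / 2)  # midpoint between legs
    return people
```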