Viewpoint-free Video Synthesis with an Integrated 4D System
Related papers
RECOVER3D: A Hybrid Multi-View System for 4D Reconstruction of Moving Actors
Proceedings of the 4th International Conference on 3D Body Scanning Technologies, Long Beach, CA, USA, 19-20 November 2013
4D multi-view reconstruction of moving actors has many applications in the entertainment industry, and although studios providing such services are becoming more accessible, effort is still required to improve the underlying technology and to produce high-quality 3D content. The RECOVER3D project aims to build an integrated virtual video system for the broadcast and motion picture markets. In particular, we present a hybrid acquisition system coupling mono and multiscopic video cameras, in which an actor's performance is captured as a 4D data set: a sequence of 3D volumes over time. The visual quality of the software solutions being implemented relies on "silhouette-based" techniques and (multi-)stereovision, following several hybridization scenarios that integrate GPU-based processing. We then transform this sequence of independent 3D volumes into a single dynamic mesh. Our approach is based on a motion estimation procedure: an adaptive signed volume distance function serves as the principal shape descriptor, and an optical flow algorithm is adapted to the surface setting with a modification that minimizes interference between unrelated surface regions.
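The signed volume distance function mentioned in the abstract can be illustrated with a minimal sketch: a signed distance field over a binary voxel grid, negative inside the reconstructed volume and positive outside. This is a brute-force toy stand-in, not the paper's adaptive descriptor; the function name and grid setup are assumptions for illustration only.

```python
import numpy as np

def signed_distance_volume(occupancy):
    """Brute-force signed distance over a small binary voxel grid:
    negative inside the object, positive outside. A toy stand-in for
    an adaptive signed volume distance function descriptor; assumes the
    grid contains both occupied and free voxels."""
    occ = occupancy.astype(bool)
    inside_pts = np.argwhere(occ).astype(float)    # occupied voxel centres
    outside_pts = np.argwhere(~occ).astype(float)  # free voxel centres
    sdf = np.empty(occ.shape, float)
    for idx in np.ndindex(occ.shape):
        p = np.array(idx, float)
        # distance to the nearest voxel of the opposite occupancy
        other = outside_pts if occ[idx] else inside_pts
        d = np.sqrt(((other - p) ** 2).sum(axis=1)).min()
        sdf[idx] = -d if occ[idx] else d
    return sdf
```

A production system would use a fast distance transform rather than this O(N²) loop; the sketch only fixes the sign convention used by such descriptors.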
Multi-view 4D Reconstruction of Human Action for Entertainment Applications
Visual Analysis of Humans, 2011
Multi-view 4D reconstruction of human action has a number of applications in entertainment. This chapter describes a selection of application areas of interest to the broadcast, movie and gaming industries; in particular, free-viewpoint video techniques for special effects and sport post-match analysis are discussed. The appearance of human action is captured as 4D data, represented by 3D volumetric or surface data over time. A review of recent approaches identifies two major classes: 4D reconstruction and model-based tracking. The second part of the chapter describes aspects of a practical implementation of a 4D reconstruction pipeline, including implementations of the popular visual hull, a building block in many free-viewpoint video systems.
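The visual hull named in the abstract is the intersection of the silhouette cones from all cameras: a voxel survives only if it projects inside every silhouette. A minimal sketch, assuming a caller-supplied projection function per camera (real systems use calibrated projection matrices, which are not shown here):

```python
import numpy as np

def visual_hull(silhouettes, projections, grid_points):
    """Carve a voxel visual hull: keep the points whose projection
    lands inside every binary silhouette. `projections[i]` maps an
    (N, 3) array of points to integer (row, col) pixel coordinates
    for silhouettes[i] (a hypothetical simplified camera model)."""
    keep = np.ones(len(grid_points), dtype=bool)
    for sil, proj in zip(silhouettes, projections):
        uv = proj(grid_points)                      # (N, 2) pixel coords
        h, w = sil.shape
        in_bounds = (uv[:, 0] >= 0) & (uv[:, 0] < h) & \
                    (uv[:, 1] >= 0) & (uv[:, 1] < w)
        vals = np.zeros(len(grid_points), dtype=bool)
        vals[in_bounds] = sil[uv[in_bounds, 0], uv[in_bounds, 1]]
        keep &= vals                                # intersect silhouette cones
    return grid_points[keep]
```

With orthographic projections along two axes, carving a half-full silhouette against a full one keeps exactly the voxels inside both, which is the defining property of the hull.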
A Framework for Fast Prototyping of Photo-realistic Environments with Multiple Pedestrians
2023 IEEE International Conference on Robotics and Automation (ICRA)
Robotic applications involving people often require advanced perception systems to better understand complex real-world scenarios. To address this challenge, photorealistic and physics simulators are gaining popularity as a means of generating accurate data labeling and designing scenarios for evaluating generalization capabilities, e.g., lighting changes, camera movements or different weather conditions. We develop a photo-realistic framework built on Unreal Engine and AirSim to easily generate scenarios with pedestrians and mobile robots. The framework can generate random and customized trajectories for each person and provides up to 50 ready-to-use people models, along with an API for retrieving their metadata. We demonstrate the usefulness of the proposed framework with a use case of multi-target tracking, a popular problem in real pedestrian scenarios, and present and evaluate the notable feature variability in the obtained perception data.
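The abstract does not show the framework's trajectory API, but the random-trajectory idea can be sketched as a bounded random walk over 2-D waypoints. Function name, parameters and bounds convention are all assumptions for illustration, not the framework's actual interface:

```python
import random

def random_trajectory(n_waypoints, bounds, step=1.0, seed=None):
    """Generate a random 2-D trajectory as a list of (x, y) waypoints
    inside `bounds` = (xmin, ymin, xmax, ymax). A toy stand-in for a
    per-pedestrian trajectory generator: each waypoint is a bounded
    random step from the previous one."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
    pts = [(x, y)]
    for _ in range(n_waypoints - 1):
        # take a random step, clamped to the walkable area
        x = min(max(x + rng.uniform(-step, step), xmin), xmax)
        y = min(max(y + rng.uniform(-step, step), ymin), ymax)
        pts.append((x, y))
    return pts
```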
Tracking and Modeling People in Video Sequences
Computer Vision and Image Understanding, 2001
Tracking and modeling people from video sequences has become an increasingly important research topic, with applications including animation, surveillance and sports medicine. In this paper, we propose a model-based 3-D approach to recovering both body shape and motion. It takes advantage of a sophisticated animation model to achieve both robustness and realism. Stereo sequences of people in motion serve as input to our system. From these, we extract a 2.5-D description of the scene and, optionally, silhouette edges. We propose an integrated framework to fit the model and to track the person's motion. The environment does not have to be engineered. We recover not only the motion but also a full animation model closely resembling the subject. We present results of our system on real sequences, and we show the generic model adjusting to the person and following various kinds of motion.
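The fitting step described above (adjusting a generic model to observed data) can be sketched in its simplest form: a closed-form least-squares fit of a scale and translation mapping template points onto observed 2.5-D points. This is a drastically simplified stand-in for fitting an articulated animation model; the function and its parameterization are illustrative assumptions:

```python
import numpy as np

def fit_scale_translation(template, observed):
    """Least-squares fit of a scale s and translation t such that
    s * template + t best matches observed (both (N, D) arrays).
    A toy version of model-to-data fitting: the articulated model in
    the paper has many more degrees of freedom."""
    mt, mo = template.mean(axis=0), observed.mean(axis=0)
    ct, co = template - mt, observed - mo      # centre both point sets
    s = (ct * co).sum() / (ct * ct).sum()      # optimal scale
    t = mo - s * mt                            # optimal translation
    return s, t
```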
Viewpoint Independent Human Motion Analysis in Man-made Environments
Procedings of the British Machine Vision Conference 2006, 2006
This work addresses the problem of human motion analysis in video sequences of a scene observed by a single fixed camera with a strong perspective effect. The goal is to make a 2D-Model (made of Shape and Stick figure) viewpoint-insensitive and to preprocess the input image to remove the perspective effect. Our methodology uses the 3D principal directions of man-made environments, together with the direction of motion, to transform both the 2D-Model and the input images to a common frontal view (parallel or orthogonal to the direction of motion) before the fitting process. The inverse transformation is then applied to the resulting human features, yielding a segmented silhouette and a pose estimate in the original input image. Preliminary results are very promising, since the proposed algorithm locates head and feet with better precision than previous approaches.
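The frontal-view transform above is a planar homography. The paper builds it from the scene's principal directions, but the underlying machinery can be sketched with the standard Direct Linear Transform from point correspondences (function names here are illustrative):

```python
import numpy as np

def homography(src, dst):
    """Direct Linear Transform: 3x3 homography H mapping each src
    point to the matching dst point (4 or more correspondences).
    A textbook sketch of the kind of plane-to-plane warp used to
    rectify an observed plane to a frontal view."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the null vector of A, i.e. the last right-singular vector
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pt):
    """Apply H to a 2-D point in homogeneous coordinates."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]
```

Mapping the unit square onto a scaled square, for instance, recovers a pure scaling, and interior points transform consistently.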
2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vols 1-3, Conference Proceedings, 2008
This paper presents a novel people detection and tracking method based on a combined multimodal sensor approach that utilizes 2D and 3D laser range and camera data. Laser data points are clustered and classified with a set of geometrical features using an SVM AdaBoost method. The clusters define a region of interest in the image that is adjusted using the ground plane information extracted from the 3D laser. In these areas, a novel vision-based people detector based on the Implicit Shape Model (ISM) is applied. Each detected person is tracked using a greedy data association technique and multiple Extended Kalman Filters that use different motion models. This way, the filter can cope with a variety of different motion patterns. The tracker is asynchronously updated by the detections from the laser and the camera data. Experiments conducted in real-world outdoor scenarios with crowds of pedestrians demonstrate the usefulness of our approach.
Towards 4D Virtual City Reconstruction from Lidar Point Cloud Sequences
ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, 2013
In this paper we propose a joint approach to virtual city reconstruction and dynamic scene analysis based on point cloud sequences from a single car-mounted Rotating Multi-Beam (RMB) Lidar sensor. The aim of this work is to create 4D spatio-temporal models of large dynamic urban scenes containing various moving and static objects. Standalone RMB Lidar devices have been frequently applied in robot navigation tasks and have proved efficient in moving object detection and recognition. However, they have not yet been widely exploited for geometric approximation of ground surfaces and building facades, due to the sparseness and inhomogeneous density of the individual point cloud scans. We propose an automatic registration method for the consecutive scans that requires no additional sensor information such as an IMU, and introduce a process for simultaneously extracting reconstructed surfaces, motion information and objects from the registered dense point cloud, augmented with per-point time stamps.
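Scan-to-scan registration without IMU data is commonly built on iterative closest point (ICP). The paper's own registration method is not reproduced here; as a generic sketch, one rigid 2-D ICP iteration pairs each source point with its nearest destination point and solves the alignment in closed form (Kabsch/SVD):

```python
import numpy as np

def icp_step(src, dst):
    """One rigid 2-D ICP iteration: nearest-neighbour matching, then a
    closed-form SVD (Kabsch) alignment of src onto its matches.
    Returns (aligned_src, R, t). A textbook sketch of scan-to-scan
    registration, not the paper's method."""
    # nearest neighbour in dst for every src point (brute force)
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # closed-form rigid alignment of src onto its matches
    ms, md = src.mean(axis=0), matched.mean(axis=0)
    H = (src - ms).T @ (matched - md)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = md - R @ ms
    return src @ R.T + t, R, t
```

Real pipelines iterate this step until convergence and use a spatial index instead of the brute-force distance matrix; a single step already recovers the exact transform when the nearest-neighbour matches are correct.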
Probabilistic Spatio-temporal 2D-Model for Pedestrian Motion Analysis in Monocular Sequences
2006
This paper addresses the problem of probabilistic modelling of human motion by combining several 2D views. The method takes advantage of 3D information while avoiding the use of a complex 3D model. Considering that the main disadvantage of 2D models is their restriction to the camera angle, this paper proposes a solution to that limitation. A multi-view Gaussian Mixture Model (GMM) is fitted to a feature space of manually labelled Shapes and Stick figures. Temporal and spatial constraints are considered to build a probabilistic transition matrix. During fitting, this matrix restricts the feature space to only the most probable models from the GMM. Preliminary results demonstrate the ability of this approach to adequately estimate postures independently of the direction of motion during the sequence.
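The transition-matrix gating described above can be sketched as follows: given the previous posture index, only the GMM components most probable under the transition matrix are evaluated. This is a toy version with isotropic Gaussians; the names, the `top_k` pruning rule and the scoring are illustrative assumptions, not the paper's exact formulation:

```python
import math

def gated_posture_scores(x, models, trans, prev_idx, top_k=2):
    """Score feature vector x against only the posture models that the
    transition matrix `trans` makes most probable from `prev_idx`.
    `models` is a list of (mean, var) isotropic-Gaussian components.
    Returns {model_idx: log p(x | model) + log p(transition)}."""
    # keep the top_k most probable transitions from the previous posture
    cand = sorted(range(len(models)),
                  key=lambda j: trans[prev_idx][j], reverse=True)[:top_k]
    scores = {}
    for j in cand:
        mean, var = models[j]
        d2 = sum((a - b) ** 2 for a, b in zip(x, mean))
        logp = -0.5 * d2 / var - 0.5 * len(x) * math.log(2 * math.pi * var)
        scores[j] = logp + math.log(trans[prev_idx][j])
    return scores
```

Restricting evaluation to the gated candidates is what keeps the fitting tractable while still enforcing temporal coherence between consecutive postures.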