Magnus Burenius - Academia.edu
Papers by Magnus Burenius
2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013
We consider the problem of automatically estimating the 3D pose of humans from images taken from multiple calibrated views. We show that it is possible and tractable to extend the pictorial structures framework, popular for 2D pose estimation, to 3D. We discuss how to use this framework to impose view, skeleton, joint angle and intersection constraints in 3D. The 3D pictorial structures are evaluated on multiple view data from a professional football game. The evaluation is focused on computational tractability, but we also demonstrate how a simple 2D part detector can be plugged into the framework.
2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011
We present an extension to the scaled orthographic camera model. It deals with dynamic cameras looking at far away objects. The camera is allowed to change focal length and translate and rotate in 3D. The model we derive says that this motion can be treated as scaling, translation and rotation in a 2D image plane. It is valid if the camera and its target move around in two separate regions that are small compared to the distance between them.
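The validity condition above can be illustrated numerically. The following is a minimal sketch of the standard scaled orthographic approximation only, not the extended model derived in the paper; the function names, the cube-shaped target and the focal length of 1000 pixels are all assumptions made for illustration. It compares pinhole projection with a single shared scale factor f / Z0 and shows that the gap shrinks as the target moves far away relative to its own size.

```python
import math


def perspective_project(point, focal_length):
    """Pinhole projection of a camera-frame point (x, y, z) with z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def scaled_orthographic_project(point, focal_length, reference_depth):
    """Scaled orthographic projection: every point shares the scale f / Z0."""
    x, y, _ = point
    scale = focal_length / reference_depth
    return (scale * x, scale * y)


def max_approximation_error(points, focal_length):
    """Largest image-plane distance between the two projections.

    The reference depth Z0 is taken as the mean depth of the points
    (an assumption of this sketch).
    """
    reference_depth = sum(p[2] for p in points) / len(points)
    worst = 0.0
    for p in points:
        u1, v1 = perspective_project(p, focal_length)
        u2, v2 = scaled_orthographic_project(p, focal_length, reference_depth)
        worst = max(worst, math.hypot(u1 - u2, v1 - v2))
    return worst


def make_target(distance):
    """Hypothetical target: corners of a 2-unit cube centred at `distance`."""
    offsets = (-1.0, 1.0)
    return [(sx, sy, distance + sz)
            for sx in offsets for sy in offsets for sz in offsets]


# Same object, same focal length, two different viewing distances.
near_error = max_approximation_error(make_target(5.0), 1000.0)
far_error = max_approximation_error(make_target(100.0), 1000.0)
print(near_error, far_error)
```

With the target close to the camera the two projections disagree by tens of pixels, while at a distance that is large compared to the target's extent the disagreement falls well below one pixel, which is the regime the abstract describes.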
Lecture Notes in Computer Science, 2011
This paper focuses on how the accuracy of marker-less human motion capture is affected by the number of camera views used. Specifically, we compare the 3D reconstructions calculated from single and multiple cameras. We perform our experiments on data consisting of video from multiple cameras synchronized with ground truth 3D motion, obtained from a motion capture session with a professional footballer. The error is compared for the 3D reconstructions, of diverse motions, estimated using the manually located image joint positions from one, two or three cameras. We also present a new bundle adjustment procedure using regression splines to impose weak prior assumptions about human motion, temporal smoothness and joint angle limits, on the 3D reconstruction. The results show that even under close to ideal circumstances the monocular 3D reconstructions contain visual artifacts not present in the multiple view case, indicating accurate and efficient marker-less human motion capture requires multiple cameras.
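One way to see why a second view helps is that a single camera only constrains a joint to lie on a viewing ray, whereas two rays from different camera centres pin down a point. The sketch below uses plain midpoint triangulation of two rays, which is a simple stand-in chosen for illustration and is not the spline-based bundle adjustment described in the paper; the function name and the example camera positions are assumptions.

```python
def triangulate_midpoint(origin1, direction1, origin2, direction2):
    """Return the midpoint of the shortest segment between two 3D rays.

    Each ray is given by a camera centre (origin) and a viewing direction.
    Raises ValueError if the rays are parallel, since depth is then
    unconstrained.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = tuple(a - b for a, b in zip(origin1, origin2))
    a = dot(direction1, direction1)
    b = dot(direction1, direction2)
    c = dot(direction2, direction2)
    d = dot(direction1, w)
    e = dot(direction2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")

    # Parameters of the closest points along each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = tuple(o + s * v for o, v in zip(origin1, direction1))
    p2 = tuple(o + t * v for o, v in zip(origin2, direction2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Hypothetical setup: a joint at (1, 2, 10) seen from two camera centres.
joint = (1.0, 2.0, 10.0)
cam1 = (0.0, 0.0, 0.0)
cam2 = (5.0, 0.0, 0.0)
ray1 = tuple(j - c for j, c in zip(joint, cam1))
ray2 = tuple(j - c for j, c in zip(joint, cam2))

estimate = triangulate_midpoint(cam1, ray1, cam2, ray2)
print(estimate)
```

With noise-free rays the estimate coincides with the true joint position; with only one ray, every point along it is equally consistent with the image, which is the depth ambiguity that monocular reconstruction has to resolve with priors.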
Proceedings of the British Machine Vision Conference 2013, 2013
This paper addresses the problem of human pose estimation, given images taken from multiple dynamic but calibrated cameras. We consider solving this task using a part-based model and focus on the part appearance component of such a model. We use a random forest classifier to capture the variation in appearance of body parts in 2D images. The results of these 2D part detectors are then aggregated across views to produce consistent 3D hypotheses for parts. We solve correspondences across views for mirror symmetric parts by introducing a latent variable. We evaluate our part detectors qualitatively and quantitatively on a dataset gathered from a professional football game.