Predicting People's 3D Poses from Short Sequences

Abstract

We propose an efficient approach to exploiting motion information from consecutive frames of a video sequence to recover the 3D pose of people. Instead of computing candidate poses in individual frames and then linking them, as is often done, we regress directly from a spatio-temporal block of frames to the 3D pose in the central one. We demonstrate that this approach allows us to overcome ambiguities effectively and to improve upon the state of the art on challenging sequences.
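
The idea of regressing from a block of frames to the pose in the central frame can be illustrated with a toy sketch. The abstract does not say which image features or which regressor are used, so everything below is an assumption made for illustration: raw random arrays stand in for per-frame features, closed-form ridge regression stands in for the regressor, and the names `temporal_blocks`, `fit_ridge`, and `half_window` are hypothetical.

```python
import numpy as np

def temporal_blocks(video, half_window):
    """Stack each frame with its neighbors into one flattened feature vector.

    video: array of shape (T, H, W). Returns (features, centers), where
    features[i] covers frames centers[i]-half_window .. centers[i]+half_window.
    """
    T = video.shape[0]
    centers = np.arange(half_window, T - half_window)
    feats = np.stack([
        video[c - half_window:c + half_window + 1].ravel() for c in centers
    ])
    return feats, centers

def fit_ridge(X, Y, lam=1e-3):
    """Closed-form ridge regression: W = (X^T X + lam I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
T, H, W, J = 200, 4, 4, 5            # toy sizes (assumed, not from the paper)
video = rng.normal(size=(T, H, W))   # stand-in for per-frame image features
poses = rng.normal(size=(T, J * 3))  # stand-in for 3D joint positions

X, centers = temporal_blocks(video, half_window=2)
Y = poses[centers]                   # target: the pose in the central frame
weights = fit_ridge(X, Y)
pred = X @ weights                   # one 3D pose vector per block
```

Each training pair couples a whole block of frames with a single pose, so no per-frame candidate poses are computed or linked afterwards. Frames closer than `half_window` to either end of the sequence have no complete block and are skipped in this sketch.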
