Jose Alonso Ybáñez Zepeda - Academia.edu

Papers by Jose Alonso Ybáñez Zepeda

Proxy Clouds for RGB-D Stream Processing: An Insight

HAL (Le Centre pour la Communication Scientifique Directe), Jul 1, 2017

Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics, through gaming. Nevertheless, their use is limited by their low resolution, with frames often corrupted with noise, missing data and temporal inconsistencies. In order to cope with all these issues, we present Proxy Clouds, a multiplanar superstructure for unified real-time processing of RGB-D data. By generating and updating through time a single set of rich statistics parameterized over planar proxies from raw RGB-D data, several processing primitives can be applied to improve the quality of the RGB-D stream on-the-fly or lighten further operations. We illustrate the use of Proxy Clouds on several applications, including noise and temporal flickering removal, hole filling, resampling, color processing and compression. We present experiments performed with our framework in indoor scenes of different natures captured with a consumer depth sensor.

Proxy clouds for RGB-D stream processing

ACM SIGGRAPH 2017 Talks, 2017

Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics, through gaming. Nevertheless, their use is limited by their low resolution, with frames often corrupted with noise, missing data and temporal inconsistencies. In order to cope with all these issues, we present Proxy Clouds, a multiplanar superstructure for unified real-time processing of RGB-D data. By generating and updating through time a single set of rich statistics parameterized over planar proxies from raw RGB-D data, several processing primitives can be applied to improve the quality of the RGB-D stream on-the-fly or lighten further operations. We illustrate the use of Proxy Clouds on several applications, including noise and temporal flickering removal, hole filling, resampling, color processing and compression. We present experiments performed with our framework in indoor scenes of different natures captured with a consumer depth sensor.

A Survey of Simple Geometric Primitives Detection Methods for Captured 3D Data

Computer Graphics Forum, 2018

The amount of captured 3D data is continuously increasing, with the democratization of consumer depth cameras, the development of modern multi‐view stereo capture setups and the rise of single‐view 3D capture based on machine learning. The analysis and representation of this ever growing volume of 3D data, often corrupted with acquisition noise and reconstruction artefacts, is a serious challenge at the frontier between computer graphics and computer vision. To that end, segmentation and optimization are crucial analysis components of the shape abstraction process, which can themselves be greatly simplified when performed on lightened geometric formats. In this survey, we review the algorithms which extract simple geometric primitives from raw dense 3D data. After giving an introduction to these techniques, from the acquisition modality to the underlying theoretical concepts, we propose an application‐oriented characterization, designed to help select an appropriate method based on ...

Proxy Clouds for RGB-D Stream Processing: A Preview

Plane Pair Matching for Efficient 3D View Registration

ArXiv, 2020

We present a novel method to estimate the motion matrix between overlapping pairs of 3D views in the context of indoor scenes. We use the Manhattan world assumption to introduce lightweight geometric constraints in the form of planes into the problem, which reduces complexity by taking into account the structure of the scene. In particular, we define a stochastic framework to categorize planes as vertical or horizontal and parallel or non-parallel. We leverage this classification to match pairs of planes in overlapping views with point-of-view agnostic structural metrics. We propose to split the motion computation using the classification and estimate separately the rotation and translation of the sensor, using a quadric minimizer. We validate our approach on a toy example and present quantitative experiments on a public RGB-D dataset, comparing against recent state-of-the-art methods. Our evaluation shows that planar constraints only add low computational overhead while improvin...

Geometric Proxies for Live RGB-D Stream Enhancement and Consolidation

ArXiv, 2020

We propose a geometric superstructure for unified real-time processing of RGB-D data. Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics, through augmented reality. Nevertheless, their use is limited by their low resolution, with frames often corrupted with noise, missing data and temporal inconsistencies. Our approach consists in generating and updating through time a single set of compact local statistics parameterized over detected geometric proxies, which are fed from raw RGB-D data. Our proxies provide several processing primitives, which improve the quality of the RGB-D stream on the fly or lighten further operations. Experimental results confirm that our lightweight analysis framework copes well with embedded execution as well as moderate memory and computational capabilities compared to state-of-the-art methods. Processing RGB-D data with our proxies allows noise and temporal flickering removal, hole filling and re...

Proxy Clouds for Live RGB-D Stream Processing and Consolidation

Computer Vision – ECCV 2018, 2018

We propose a new multiplanar superstructure for unified real-time processing of RGB-D data. Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics, through augmented reality. Nevertheless, their use is limited by their low resolution, with frames often corrupted with noise, missing data and temporal inconsistencies. Our approach, named Proxy Clouds, consists in generating and updating through time a single set of compact local statistics parameterized over detected planar proxies, which are fed from raw RGB-D data. Proxy Clouds provide several processing primitives, which improve the quality of the RGB-D stream on-the-fly or lighten further operations. Experimental results confirm that our lightweight analysis framework copes well with embedded execution as well as moderate memory and computational capabilities compared to state-of-the-art methods. Processing of RGB-D data with Proxy Clouds includes noise and temporal flickering removal, hole filling and resampling. As a substitute for the observed scene, our proxy cloud can additionally be applied to compression and scene reconstruction. We present experiments performed with our framework in indoor scenes of different natures within a recent open RGB-D dataset.

Estimation linéaire de la pose tridimensionnelle d'un visage et de ses actions faciales

The objective of this thesis is the tracking of faces and facial animations in video sequences. After introducing the subject, we propose a first method for tracking the 3D pose and facial animations of a face detected at the beginning of a video sequence. We then present a method for initializing the face tracking by estimating the pose and shape of an unknown face from a database of faces seen from the front. To achieve these two goals (initialization and tracking), several approaches are described, using a geometric face model and a mapping between two sets of variables: the perturbations of the geometric model in terms of 3D pose and deformations, and the corresponding residuals (errors between the current observation and the appearance model of the face being tracked). The dependency between the two sets of variables is described by means of a canon...

Virtual Sensor Systems and Methods

A linear estimation method for 3D pose and facial animation tracking

2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007

This paper presents an approach that incorporates Canonical Correlation Analysis (CCA) for monocular 3D face pose and facial animation estimation. The CCA is used to find the dependency between texture residuals and 3D face pose and facial gesture. The texture residuals are obtained from observed raw brightness shape-free 2D image patches that we build by means of a parameterized 3D geometric face model. This method is used to correctly estimate the pose of the face and the model's animation parameters controlling the lip, eyebrow and eye movements (encoded in 15 parameters). Extensive experiments on tracking faces in long real video sequences show the effectiveness of the proposed method and the value of using CCA in the tracking context.

Local or Global 3D Face and Facial Feature Tracker

2007 IEEE International Conference on Image Processing, 2007

We present in this paper a solution for 3D face and facial feature tracking using canonical correlation analysis and a 3D geometric model. This model is controlled with 17 parameters (6 for the 3D pose, and 11 for facial animation), and is used to crop out reference 2D shape-free texture maps from the incoming input frames. Model parameters are updated via image registration in the texture map space. For registration, we use CCA to learn and exploit the dependency between texture residuals and model parameter corrections. We compare tracking results using two kinds of texture maps: one local (image patches around selected vertices of the 3D model), and one global (the whole image patch under the 3D model). Experiments evaluating the effectiveness of the approaches are reported.

Estimation linéaire de la pose tridimensionnelle d'un visage et de ses actions faciales

... Estimation linéaire de la pose tridimensionnelle d'un visage et de ses actions faciales. José Alonso Ybanez Zepeda 1. (17/06/2008). ... Submitted on: Friday, April 10, 2009, 08:00:00. Last modified on: Friday, May 13, 2011, 13:00:00.

Face tracking using canonical correlation analysis

This paper presents an approach that incorporates canonical correlation analysis for monocular 3D face tracking as a rigid object. It also provides the comparison between the linear and the non-linear (kernel) version of the CCA. The 3D pose of the face is estimated from observed raw brightness shape-free 2D image patches. A parameterized geometric face model is adopted to crop out and to normalize the shape of patches of interest from video frames. Starting from a face model fitted to an observed human face, the relation between a set of perturbed pose parameters of the face model and the associated image patches is learned using CCA or KCCA. This knowledge is then used to estimate the correction to be added to the pose of the face from an observed patch in the current frame. Experimental results on tracking faces in long video sequences show the effectiveness of the two proposed methods.

Linear tracking of pose and facial features

Proceedings of the IAPR Conference on …

We present an approach for simultaneous monocular 3D face pose and facial animation tracking. The pose and facial features are estimated from observed raw brightness shape-free 2D image patches. A parameterized 3D face model is adopted to crop out and to normalize the shape of patches from video frames. Starting from the face model aligned on an observed human face, we learn the relation between a set of perturbed parameters of the face model and the associated image patches using a Canonical Correlation Analysis. This knowledge, obtained from an observed patch in the current frame, is used to estimate the correction to be added to the pose of the face and to the animation parameters controlling the lips, eyebrows and eyes. Ground truth data is used to evaluate both the pose and facial animation tracking efficiency in long real video sequences.

Local or Global 3D Face and Facial Feature Tracker

Image Processing, IEEE International Conference, 2007

We present in this paper a solution for 3D face and facial feature tracking using canonical correlation analysis and a 3D geometric model. This model is controlled with 17 parameters (6 for the 3D pose, and 11 for facial animation), and is used to crop out reference 2D shape free texture maps from the incoming input frames. Model
