Luis Puig - Academia.edu

Papers by Luis Puig

Navigation Assistance for the Visually Impaired Using RGB-D Sensor With Range Expansion

IEEE Systems Journal, 2016

Navigation Assistance for Visually Impaired (NAVI) refers to systems that assist or guide people with vision loss, ranging from partially sighted to totally blind, by means of sound commands. In this paper, a new system for NAVI is presented based on visual and range information. Instead of using several sensors, we choose one device, a consumer RGB-D camera, and take advantage of both range and visual information. In particular, the main contribution is the combination of depth information with image intensities, resulting in a robust expansion of the range-based floor segmentation. On the one hand, depth information, which is reliable but limited to a short range, is enhanced with long-range visual information. On the other hand, the difficult and error-prone image processing is eased and improved with depth information. The proposed system detects and classifies the main structural elements of the scene, providing the user with obstacle-free paths in order to navigate safely across unknown scenarios. The proposed system has been tested on a wide variety of scenarios and datasets, giving successful results and showing that the system is robust and works in challenging indoor environments.
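
To make the range-plus-intensity idea concrete, here is a minimal sketch of a depth-based floor fit followed by an intensity-driven expansion; the function names, thresholds, and simple flood-fill criterion are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np
from collections import deque

def fit_floor_plane(points, iters=200, tol=0.03):
    """RANSAC plane fit on 3D points (N x 3) from the depth map.
    Returns (normal, d) with normal . p + d = 0 for floor inliers."""
    best_count, best_model = 0, None
    rng = np.random.default_rng(0)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        n /= norm
        d = -n.dot(sample[0])
        count = int((np.abs(points @ n + d) < tol).sum())
        if count > best_count:
            best_count, best_model = count, (n, d)
    return best_model

def expand_floor(intensity, seed_mask, thresh=10.0):
    """Grow the short-range, depth-based floor mask into the long-range
    image region by flood-filling pixels whose intensity stays close to
    the seed region's mean (a deliberately crude expansion criterion)."""
    h, w = intensity.shape
    mean = float(intensity[seed_mask].mean())
    visited = seed_mask.copy()
    frontier = deque(zip(*np.nonzero(seed_mask)))
    while frontier:
        y, x = frontier.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not visited[ny, nx] \
               and abs(float(intensity[ny, nx]) - mean) < thresh:
                visited[ny, nx] = True
                frontier.append((ny, nx))
    return visited
```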

RGB-D Based Tracking of Complex Objects

Tracking the pose of objects is a relevant topic in computer vision, which potentially allows the recovery of meaningful information for other applications such as task supervision, robot manipulation, or activity recognition. In recent years, RGB-D cameras have been widely adopted for this problem, with impressive results. However, there are certain objects whose surface properties or complex shapes prevent the depth sensor from returning good depth measurements, so that only color-based methods can be applied. In this work, we show how the depth information of the surroundings of the object can still be useful for object pose tracking with RGB-D even in this situation. Specifically, we propose using the depth information to handle occlusions in a state-of-the-art region-based object pose tracking algorithm. Experiments with recordings of humans naturally interacting with difficult objects have been performed, showing the advantages of our contribution in several image sequences.
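
A minimal sketch of how surrounding depth can gate a color-based tracker: pixels whose measured depth lies clearly in front of the object's expected depth (e.g., obtained by rendering the model at the current pose estimate) are flagged as occluders and excluded from the color cost. The names and margin value are assumptions, not the paper's implementation.

```python
import numpy as np

def occlusion_mask(measured_depth, expected_depth, margin=0.02):
    """Boolean mask of pixels where the sensor sees a surface clearly in
    front of the tracked object's expected depth; these are treated as
    occluders and dropped from the region-based pose cost."""
    valid = measured_depth > 0  # depth sensor returned a reading
    return valid & (measured_depth < expected_depth - margin)
```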

Line Extraction In Central Hyper-Catadioptric Systems

robots.unizar.es

In central hyper-catadioptric systems, 3D lines are projected into conics in the image plane. The general representation of a conic has five parameters, but these conics can be represented by two parameters if the hyper-catadioptric camera calibration is known. In this paper we present a new approach to extract these projected lines, which we name catadioptric image lines (CILs). We propose an approximation to the geometric distance from a point to a conic that, combined with the two-point representation inside a RANSAC scheme, allows us to extract the CILs present in a hyper-catadioptric image. We also perform an exhaustive analysis of the elements that can affect CIL extraction accuracy. In particular, we analyze the effect of calibration errors of the omnidirectional system and the effect of the distribution of the image points on the projected line, i.e., the conic in the image and its length. We present simulations with different hyper-catadioptric systems and experiments with real images.
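
One common first-order approximation of the point-to-conic geometric distance is the Sampson distance, sketched below; the paper proposes its own approximation, so this is only indicative of how such a residual plugs into a RANSAC loop.

```python
import numpy as np

def sampson_dist_conic(C, pts):
    """First-order (Sampson) approximation of the geometric distance
    from 2D points to the conic x^T C x = 0.
    C: symmetric 3x3 conic matrix; pts: N x 2 image points."""
    x = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
    Cx = x @ C.T                                   # N x 3
    alg = np.einsum('ij,ij->i', x, Cx)             # algebraic residual x^T C x
    grad = 2.0 * Cx[:, :2]                         # gradient w.r.t. (x, y)
    return np.abs(alg) / np.maximum(np.linalg.norm(grad, axis=1), 1e-12)
```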

Hybrid Matching of Uncalibrated Omnidirectional and Perspective Images

Proceedings of the Fifth International Conference on Informatics in Control, Automation and Robotics, 2008

This work presents an automatic hybrid matching of central catadioptric and perspective images, based on the hybrid epipolar geometry. The goal is to obtain correspondences between an omnidirectional image and a conventional perspective image taken from different points of view. Mixing both kinds of images has multiple applications, since an omnidirectional image captures a great deal of information while perspective images are the simplest means of acquisition. Scale-invariant features combined with a simple unwrapping are used to obtain the initial putative matches. Then a robust technique estimates the hybrid fundamental matrix, rejecting outliers. Experimental results with real image pairs show the feasibility of this hybrid and difficult matching problem.
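
A minimal sketch of the kind of simple unwrapping that helps scale-invariant features on omnidirectional images: a polar-to-panoramic resampling. Nearest-neighbor sampling is used for brevity, and the center and radius parameters are assumptions, not values from the paper.

```python
import numpy as np

def unwrap_omni(img, center, r_min, r_max, out_w=720, out_h=160):
    """Map the annular omnidirectional image to a panoramic strip so a
    standard scale-invariant detector (e.g., SIFT) behaves more uniformly."""
    thetas = np.linspace(0, 2 * np.pi, out_w, endpoint=False)
    radii = np.linspace(r_min, r_max, out_h)
    tt, rr = np.meshgrid(thetas, radii)            # (out_h, out_w) grids
    xs = (center[0] + rr * np.cos(tt)).astype(int).clip(0, img.shape[1] - 1)
    ys = (center[1] + rr * np.sin(tt)).astype(int).clip(0, img.shape[0] - 1)
    return img[ys, xs]                             # nearest-neighbor lookup
```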

Estimating 3D trajectories from 2D projections via disjunctive factored four-way conditional restricted Boltzmann machines

Pattern Recognition, 2017

Estimation, recognition, and near-future prediction of 3D trajectories from their two-dimensional projections available from one camera source is an exceptionally difficult problem, due to uncertainty in the trajectories and environment, the high dimensionality of the trajectory states, the lack of sufficient labeled data, and so on. In this article, we propose a solution to this problem based on a novel deep learning model dubbed the disjunctive factored four-way conditional restricted Boltzmann machine (DFFW-CRBM). Our method improves on state-of-the-art deep learning techniques for high-dimensional time-series modeling by introducing a novel tensor factorization capable of driving fourth-order Boltzmann machines to considerably lower energy levels, at no additional computational cost. DFFW-CRBMs are capable of accurately estimating, recognizing, and performing near-future prediction of three-dimensional trajectories from their 2D projections while requiring a limited amount of labeled data. We evaluate our method on both simulated and real-world data, showing its effectiveness in predicting and classifying complex ball trajectories and human activities.
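
While the exact DFFW-CRBM energy is specific to the paper, the parameter saving from factoring a higher-order Boltzmann interaction tensor is easy to state. The following is a generic illustration in the spirit of factored gated RBMs, not the model's actual equations:

$$W_{ijkl} \;\approx\; \sum_{f=1}^{F} W^{(1)}_{if}\, W^{(2)}_{jf}\, W^{(3)}_{kf}\, W^{(4)}_{lf},$$

which replaces a fourth-order tensor with $O(N^4)$ entries by four factor matrices with $O(NF)$ entries each, sharing a common factor index $f$.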

Monocular 3D tracking of deformable surfaces

2016 IEEE International Conference on Robotics and Automation (ICRA), 2016

The problem of reconstructing deformable 3D surfaces has been studied in the non-rigid structure-from-motion context, where either tracked points over long sequences or an initial 3D shape are required, and also with piecewise methods, where the deformable surface is modeled as a triangulated mesh fitted to an initial estimate of the 3D surface computed from correspondences in two views. In this paper we present a new scheme to reconstruct deformable surfaces by tracking the relevant features that parametrize the deformation. Assuming that an initial 3D shape related to a reference frame is available, we first match the reference and current frames using visual information. These correspondences are then clustered into patches with common geometric characteristics in the image domain and in 3D space. To reduce the number of parameters to be estimated, we describe each cluster using thin-plate splines (TPS) with a minimal number of control points. The 3D coordinates of these control points on the deformed surface are then estimated using a non-linear least-squares approach, yielding the reconstruction of the full deformed patches. We perform experiments on synthetic and real monocular video sequences to validate our approach.
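
For intuition, fitting a 2D thin-plate spline to scattered control values reduces to one linear solve with the kernel U(r) = r² log r. This sketch covers the standard interpolation machinery only; the paper estimates the 3D control-point coordinates with nonlinear least squares on top of such a parametrization.

```python
import numpy as np

def tps_fit(ctrl, values, reg=1e-8):
    """Fit a thin-plate spline f over 2D control points ctrl (K x 2) so
    that f(ctrl[i]) ~= values[i]. Returns [w_1..w_K, a0, a1, a2]."""
    K = len(ctrl)
    d = np.linalg.norm(ctrl[:, None] - ctrl[None, :], axis=-1)
    U = np.where(d > 0, d**2 * np.log(d + 1e-20), 0.0)  # TPS kernel r^2 log r
    P = np.hstack([np.ones((K, 1)), ctrl])              # affine part
    A = np.zeros((K + 3, K + 3))
    A[:K, :K] = U + reg * np.eye(K)                     # kernel block
    A[:K, K:] = P
    A[K:, :K] = P.T                                     # side conditions
    b = np.concatenate([values, np.zeros(3)])
    return np.linalg.solve(A, b)
```

Evaluation then follows the usual form f(x) = a0 + a1·x + a2·y + Σᵢ wᵢ U(‖x − ctrlᵢ‖), so each patch is described by its few control coefficients rather than by every tracked point.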

PosterICPR10

Multi-Environment Stereo Visual Odometry using Points and Lines

Omnidirectional Vision Systems: Calibration, Feature Extraction and 3D Information

This work focuses on central catadioptric systems, from the early step of calibration to high-level tasks such as 3D information retrieval. The book opens with a thorough introduction to the sphere camera model, along with an analysis of the relation between this model and actual central catadioptric systems. Then, a new approach to calibrate any single-viewpoint catadioptric camera is described. This is followed by an analysis of existing methods for calibrating central omnivision systems, and a detailed examination of hybrid two-view relations that combine images acquired with uncalibrated central catadioptric systems and conventional cameras. In the remaining chapters, the book discusses a new method to compute the scale space of any omnidirectional image acquired with a central catadioptric system, and a technique for computing the orientation of a hand-held omnidirectional catadioptric camera.

Self-location from monocular uncalibrated vision using reference omniviews

In this paper we present a novel approach to indoor self-localization using reference omnidirectional images. We only need one omnidirectional image of the whole scene, stored in the robot's memory, and a conventional uncalibrated on-board camera. We match the omnidirectional image against the conventional images captured by the on-board camera and compute the hybrid epipolar geometry using lifted coordinates and robust techniques. We then map the epipole in the reference omnidirectional image to the ground plane through a homography, also in lifted coordinates, giving the position of the robot on the planar ground together with its uncertainty. We perform experiments with simulated and real data to show the feasibility of this new self-localization approach.
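
Concretely, the localization step can be written compactly in lifted coordinates. The matrix size below follows the usual second-order Veronese lifting and is stated as an assumption rather than taken from the paper:

$$\mathbf{x}_{g} \;\sim\; \mathbf{H}_{3\times 6}\,\hat{\mathbf{e}}, \qquad \hat{\mathbf{e}} = \left(e_1^2,\; e_1 e_2,\; e_2^2,\; e_1 e_3,\; e_2 e_3,\; e_3^2\right)^{\top},$$

where $\hat{\mathbf{e}}$ is the lifted epipole in the reference omniview and $\mathbf{x}_{g}$ the robot's position on the ground plane; propagating the epipole's covariance through this map yields the position uncertainty.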

Self-orientation of a hand-held catadioptric system in man-made environments

In central catadioptric systems, 3D lines are projected into conics, in fact degenerate conics. In this paper we present a new approach to extract the projected lines corresponding to straight lines in the scene and to compute vanishing points from them. Using the internal calibration and two image points, we are able to compute the catadioptric image lines analytically. We exploit the presence of parallel lines in man-made environments to compute the dominant vanishing points in the omnidirectional image. To obtain the intersection of two of these conics, and thus compute vanishing points, we analyze the self-polar triangle common to the pair. With the information contained in the vanishing points we are able to obtain the self-orientation of a hand-held catadioptric system. This can be used in a vertical stabilization system, as required by autonomous navigation, or to rectify images in applications where the vertical orientation of the catadioptric system is assumed. We test our approach by performing vertical and full rectifications on real image sequences.
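
For intuition on the conic-intersection step, one standard projective-geometry route (a sketch of the general idea, not necessarily the paper's exact construction) goes through the degenerate members of the pencil of the two conics:

$$\det\!\left(\mathbf{C}_1 + \lambda\,\mathbf{C}_2\right) = 0.$$

The roots $\lambda$ give degenerate conics (pairs of lines) whose lines pass through the four intersection points of $\mathbf{C}_1$ and $\mathbf{C}_2$, and the vertices of the self-polar triangle common to the pencil arise as the intersections of these line pairs. A vanishing point is then recovered as the common intersection of the catadioptric image lines of a bundle of parallel 3D lines.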

Full scaled 3D visual odometry from a single wearable omnidirectional camera

2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

In recent years, monocular SLAM has been widely used to obtain highly accurate maps and trajectory estimates of a moving camera. However, one issue with this approach is that, since depth cannot be measured from a single image, global scale is not observable and the scene and camera motion can only be recovered up to scale. This problem is aggravated in larger scenes, where scale drift is more likely to arise between different map portions and their corresponding motion estimates. To compute the absolute scale we need to know some dimension of the scene (e.g., the actual size of a scene element, the velocity of the camera, or the baseline between two frames) and integrate it into the SLAM estimation. In this paper, we present a method to recover the scale of the scene using an omnidirectional camera mounted on a helmet. The high precision of visual SLAM allows the vertical oscillation of the head during walking to be perceived in the trajectory estimate. By performing a spectral analysis of the camera's vertical displacement, we can measure the step frequency. We relate the step frequency to the speed of the camera through an empirical formula based on biomedical studies of human walking. This speed measurement is integrated in a particle filter to estimate the current scale factor and the 3D motion estimate at its true scale. We evaluated our approach using image sequences acquired while a person walks. Our experiments show that the proposed approach is able to cope with scale drift.
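
A minimal sketch of the scale-recovery idea, assuming a simple linear step-frequency-to-speed model with placeholder coefficients; the paper uses its own empirical formula and a particle filter rather than the direct ratio computed here.

```python
import numpy as np

def scale_from_steps(z_traj, vo_speed, fps, a=0.45, b=0.3):
    """Metric scale factor from head bobbing.
    z_traj: per-frame vertical camera position from up-to-scale VO;
    vo_speed: mean forward speed of the same trajectory (VO units/s);
    a, b: placeholder coefficients of an assumed linear
    step-frequency -> walking-speed relation (m/s)."""
    z = z_traj - np.mean(z_traj)
    spec = np.abs(np.fft.rfft(z))
    freqs = np.fft.rfftfreq(len(z), d=1.0 / fps)
    step_hz = freqs[1:][np.argmax(spec[1:])]  # dominant bobbing frequency
    speed_mps = a * step_hz + b               # assumed empirical formula
    return speed_mps / vo_speed               # metric units per VO unit
```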

Wearable omnidirectional vision system for personal localization and guidance

2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Autonomous navigation and recognition of the environment are fundamental abilities for people, extensively studied in the computer vision and robotics fields. The expansion of low-cost wearable sensing provides interesting opportunities for assistance systems that augment people's navigation and recognition capabilities. This work presents our wearable omnidirectional vision system and a novel two-phase localization approach running on it. It runs state-of-the-art real-time visual odometry adapted to catadioptric images and augmented with topological-semantic information. The presented approach benefits from wearable sensors to improve visual odometry results with a true-scaled solution. The wide field of view of the catadioptric vision system makes features last longer in the field of view and allows a more compact location representation, which facilitates topological place recognition. The experiments in this paper show promising ego-localization results in realistic settings, providing good true-scaled visual odometry estimates and recognition of indoor regions.

Comparison of Calibration Methods for Omnidirectional Cameras

Omnidirectional Vision Systems, 2013

The number of calibration methods for central catadioptric systems has increased in recent years. These methods are based on different camera models and can consider the central catadioptric system either as a whole or as a separate camera and mirror. Often the user requires a versatile calibration solution without spending valuable time implementing a particular method. In this chapter, we review the existing methods designed to calibrate any central omnivision system and analyze their advantages and drawbacks in a thorough comparison using simulated and real data. First we present a classification of such methods, showing the most relevant characteristics of each. Then we select the methods that are available as open source and do not require a complex pattern or scene. The evaluation protocol for calibration accuracy also considers 3D metric reconstruction combining omnidirectional images. Comparative results are shown and discussed in detail.

Orientation of a Hand-Held Catadioptric System in Man-Made Environments

SpringerBriefs in Computer Science, 2013

Two-View Relations Between Omnidirectional and Conventional Cameras

SpringerBriefs in Computer Science, 2013

In this chapter, we present a deep analysis of the hybrid two-view relations combining images acquired with uncalibrated central catadioptric systems and conventional cameras. We consider both hybrid fundamental matrices and hybrid planar homographies. These matrices contain useful geometric information. We study three different types of matrices, varying in complexity depending on their capacity to deal with a single type or multiple types of central catadioptric systems. The first and simplest one is designed to deal with paracatadioptric systems; the second, more complex one considers the combination of a perspective camera and any central catadioptric system. The last one is the complete and generic model, able to deal with any combination of central catadioptric systems. We show that the generic and most complex model is sometimes not the best option when dealing with real images. Simpler models are not as accurate as the complete model in the ideal case, but they behave better and more accurately in the presence of noise, while being simpler and requiring fewer correspondences to compute. Experiments with synthetic data and real images are performed. Using these approaches, we demonstrate successful hybrid matching between perspective images and hypercatadioptric images using SIFT descriptors.
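
In lifted coordinates the hybrid epipolar constraint keeps the familiar bilinear form. Denoting lifting with a hat, the mixed and fully generic cases look roughly like the following; the matrix sizes follow the usual second-order lifting and are stated here as an assumption, since the chapter's three models differ precisely in these dimensions:

$$\mathbf{x}_p^{\top}\,\mathbf{F}_{3\times 6}\,\hat{\mathbf{x}}_c = 0, \qquad \hat{\mathbf{x}}_p^{\top}\,\mathbf{F}_{6\times 6}\,\hat{\mathbf{x}}_c = 0,$$

where $\mathbf{x}_p$ is a perspective image point and $\hat{\mathbf{x}}_c$ the lifted catadioptric point. The larger generic matrix has more degrees of freedom, which is exactly why it needs more correspondences and degrades faster under noise than the specialized models.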

Scale space for central catadioptric systems: Towards a generic camera feature extractor

2011 International Conference on Computer Vision, 2011

In this paper we propose a new approach to compute the scale space of any omnidirectional image acquired with a central catadioptric system. Central catadioptric cameras are explained using the sphere camera model, which unifies conventional, paracatadioptric, and hypercatadioptric systems in a single model. Scale space is essential for the detection and matching of interest points, in particular scale-invariant points based on the Laplacian of Gaussian, like the well-known SIFT. We combine the sphere camera model with the framework of partial differential equations on manifolds to compute the Laplace-Beltrami (LB) operator, a second-order differential operator required to perform Gaussian smoothing on catadioptric images. We perform experiments with synthetic and real images to validate the generalization of our approach to any central catadioptric system.
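
For reference, the coordinate form of the Laplace-Beltrami operator that generalizes the Laplacian to a surface with metric $g$ is the standard textbook operator, not a detail specific to the paper:

$$\Delta_g f = \frac{1}{\sqrt{|g|}}\,\partial_i\!\left(\sqrt{|g|}\; g^{ij}\,\partial_j f\right),$$

where $|g|$ is the determinant of the metric induced by the catadioptric projection and $g^{ij}$ its inverse. Gaussian smoothing at increasing scales then amounts to running the heat flow $\partial_t u = \Delta_g u$ on the image.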

Modeling Omnidirectional Vision Systems

SpringerBriefs in Computer Science, 2013

In this chapter, different types of omnidirectional systems are briefly introduced. Then, we focus on central catadioptric systems and the model used to deal with them, the so-called sphere camera model. The projection of points and lines under this model is explained, as well as the relation between this model and actual catadioptric systems. Later, we introduce lifted coordinates, a tool used to deal with the nonlinearities present in the sphere camera model. We show two different ways to compute them: the former makes use of the G operator and the latter uses symmetric matrix equations. Finally, a useful representation of catadioptric systems as Riemannian manifolds is presented.
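
As a concrete anchor, the second-order lifting of a homogeneous image point is the Veronese map (shown in one common ordering convention; the chapter's G-operator and symmetric-matrix constructions are two routes to such coordinates):

$$\hat{\mathbf{x}} = \left(x_1^2,\; x_1 x_2,\; x_2^2,\; x_1 x_3,\; x_2 x_3,\; x_3^2\right)^{\top},$$

which turns quadratic expressions, such as a conic constraint $\mathbf{x}^{\top}\mathbf{C}\,\mathbf{x} = 0$, into expressions linear in $\hat{\mathbf{x}}$.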

Visual SLAM with an Omnidirectional Camera

2010 20th International Conference on Pattern Recognition, 2010

In this work we integrate the Spherical Camera Model for catadioptric systems in a Visual-SLAM application. The Spherical Camera Model is a projection model that unifies central catadioptric and conventional cameras. To integrate this model into Extended Kalman Filter-based SLAM, we need to linearize both the direct and the inverse projection. We have performed initial experiments with omnidirectional and conventional real sequences, including challenging trajectories. The results confirm that the omnidirectional camera gives much better orientation accuracy, improving the estimated camera trajectory.
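
The EKF integration needs the Jacobian of the nonlinear sphere-model projection for its update step. The paper linearizes the direct and inverse projections analytically; a finite-difference stand-in such as the following only makes the idea concrete and is not the derivation used in the paper.

```python
import numpy as np

def numeric_jacobian(project, x, eps=1e-6):
    """Finite-difference Jacobian of a projection function.
    project: R^n -> R^2, mapping a state (e.g., a 3D point in the
    camera frame) to catadioptric image coordinates; x: float array."""
    y0 = project(x)
    J = np.zeros((len(y0), len(x)))
    for i in range(len(x)):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (project(x + dx) - y0) / eps  # one column per state dim
    return J
```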

Calibration of Omnidirectional Cameras Using a DLT-Like Approach

SpringerBriefs in Computer Science, 2013

In this chapter, we present a new calibration technique that is valid for all single-viewpoint catadioptric cameras. We represent the projection of 3D points onto a catadioptric image linearly with a 6×10 projection matrix, which uses lifted coordinates for both image and 3D points. This projection matrix can be computed linearly from 3D-2D correspondences (a minimum of 20 points distributed in three different planes). We show how to decompose it to obtain the intrinsic and extrinsic parameters. Moreover, we use this parameter estimation, followed by a nonlinear optimization, to calibrate various types of cameras. Our results are based on the sphere camera model. We test our method with both simulations and real images, and we analyze the results by performing a 3D reconstruction from two omnidirectional images.
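
A sketch of the DLT-like estimation under the usual second-order liftings; the monomial orderings and the cross-product-style constraints are conventions assumed here, not necessarily the chapter's exact parameterization. With at least 20 well-distributed correspondences, the nullspace of the stacked constraints gives the 6×10 matrix up to scale.

```python
import numpy as np

def lift6(x):
    """Second-order Veronese lifting of a homogeneous 2D point (3-vector)."""
    x1, x2, x3 = x
    return np.array([x1*x1, x1*x2, x2*x2, x1*x3, x2*x3, x3*x3])

def lift10(X):
    """Second-order lifting of a homogeneous 3D point (4-vector):
    the 10 monomials X_i X_j with i <= j."""
    return np.array([X[i] * X[j] for i in range(4) for j in range(i, 4)])

def dlt_6x10(img_pts, world_pts):
    """Linear estimate of the 6x10 lifted projection matrix P from
    2D-3D correspondences satisfying lift6(x) ~ P lift10(X)."""
    rows = []
    for x, X in zip(img_pts, world_pts):
        lx, lX = lift6(x), lift10(X)
        # each pair (i, j) of lifted image components gives one linear
        # equation expressing proportionality of lift6(x) and P lift10(X)
        for i in range(6):
            for j in range(i + 1, 6):
                r = np.zeros(60)
                r[10*i:10*i + 10] = lx[j] * lX
                r[10*j:10*j + 10] = -lx[i] * lX
                rows.append(r)
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(6, 10)  # nullspace vector, reshaped to P
```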
