Felipe Lumbreras | Universitat Autònoma de Barcelona
Papers by Felipe Lumbreras
IEEE Journal of Selected Topics in Signal Processing, 2012
Proceedings of the 9th International Conference on Computer Vision Theory and Applications, 2014
This paper presents a novel feature point descriptor for the multispectral image case: Far-Infrared and Visible Spectrum images. It allows matching interest points on images of the same scene but acquired in different spectral bands. Initially, points of interest are detected on both images through a SIFT-like based scale space representation. Then, these points are characterized using an Edge Oriented Histogram (EOH) descriptor. Finally, points of interest from multispectral images are matched by finding nearest couples using the information from the descriptor. The provided experimental results and comparisons with similar methods show both the validity of the proposed approach as well as the improvements it offers with respect to the current state-of-the-art.
Lecture Notes in Computer Science, 2019
In this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches.
Expert Systems with Applications
Kidney stone formation is a common disease and the incidence rate is constantly increasing worldwide. It has been shown that the classification of kidney stones can lead to an important reduction of the recurrence rate. The classification of kidney stones by human experts on the basis of certain visual color and texture features is one of the most employed techniques. However, the knowledge of how to analyze kidney stones is not widespread, and the experts learn only after being trained on a large number of samples of the different classes. In this paper we describe a new device specifically designed for capturing images of expelled kidney stones, and a method to learn and apply the experts' knowledge with regard to their classification. We show that with off-the-shelf components, a carefully selected set of features and a state-of-the-art classifier it is possible to automate this difficult task to a good degree. We report results on a collection of 454 kidney stones, achieving an overall accuracy of 63% for a set of eight classes covering almost all of the kidney stones taxonomy. Moreover, for more than 80% of samples the real class is the first or the second most probable class according to the system, and the patient recommendations for the two top classes are similar. This is the first attempt towards the automatic visual classification of kidney stones, and based on the current results we foresee better accuracies with the increase of the dataset size.
This paper describes the ICAR system, an application for automatic reading of identity cards and passports. The system acquires the image of the document by a flatbed scanner and recognizes the type of the document among a set of predefined models using color information. Textual fields are located in the image by a connected component analysis and identified in terms
Image Analysis and Recognition, 2006
Several factorization techniques have been proposed for tackling the Structure from Motion problem. Most of them provide a good solution as long as the amount of missing and noisy data is within an acceptable ratio. Focussing on this problem, we propose to use an incremental multiresolution scheme with classical factorization techniques. Information recovered following a coarse-to-fine strategy is used both for filling in the missing entries of the input matrix and for denoising the original data.
Continuous innovations in automotive lighting technology pose the problem of how to assess new headlights systems. For car manufacturers, assessment is mostly relative: given a headlights system to be tested, how does it compare with another, maybe from a different supplier, in terms of features such as light intensity, homogeneity or reach? This comparison is best performed dynamically, asking experts actually to drive along a certain testing track and later write down the visual impressions that they remember. However, this procedure suffers from several drawbacks: comparisons cannot be repeated, are not retrospective, and cannot be properly shared with other people since the only record is a paper form. To overcome these, it is proposed to record, for each headlights system, a video sequence of what the driver sees with a camera attached to the windshield. The problem now becomes how to compare a pair of such sequences. Two issues must be addressed: the temporal alignment or synchronization of the two sequences, and then the spatial alignment or registration of all the corresponding frames. In this paper a semiautomatic but fast procedure for the former, and an automatic method for the latter are proposed. In addition, an alternative to the joint visualization of corresponding frames called the bird's-eye view transform is explored, and a simple fusion technique for better visualization of the headlights differences in two sequences is proposed. Results are provided for a number of headlights with different light sources and from several vehicle brands, in the form of both still images and video sequences.
Road detection is a vital task for the development of autonomous vehicles. The knowledge of the free road surface ahead of the target vehicle can be used for autonomous driving, road departure warning, as well as to support advanced driver assistance systems like vehicle or pedestrian detection. Using vision to detect the road has several advantages in front of other
This paper summarizes the recent work developed to detect and track vehicles in front of a mobile platform, estimating their 3D state from images provided by a monochrome camera hosted on the mobile platform. This task has required the study of different topics, namely the detection of vehicles from images, the spatio-temporal analysis of these detections to verify the presence of real vehicles and estimate their dynamics, and finally the posterior vehicle tracking along time. These three topics constitute the basic building blocks of a 3D model based vehicle tracking framework presented in this paper.
Journal of WSCG
This paper introduces a method to obtain a detailed 3D reconstruction of facial skin from a single RGB image. To this end, we propose the exclusive use of an input image without requiring any information about the observed material nor training data to model the wrinkle properties. They are detected and characterized directly from the image via a simple and effective parametric model, determining several features such as location, orientation, width, and height. With these ingredients, we propose to minimize a photometric error to retrieve the final detailed 3D map, which is initialized by current techniques based on deep learning. In contrast with other approaches, we only require estimating a depth parameter, making our approach fast and intuitive. Extensive experimental evaluation is presented in a wide variety of synthetic and real images, including different skin properties and facial expressions. In all cases, our method outperforms the current approaches regarding 3D reconstruction accuracy, providing striking results for both large and fine wrinkles.
Lecture Notes in Computer Science, 2016
In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavor to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion which is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach can reproduce performance equivalent to, if not better than, that of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications.
Proceedings ICIP International Conference on Image Processing, Oct 1, 2008
Photometric stereo aims at finding the surface normal and reflectance at every point of an object from a set of images obtained under different lighting conditions. The obtained intensity image data are stacked into a matrix that can be approximated by a low-dimensional linear subspace, under the Lambertian model. The current paper proposes to use an adaptation of the Alternation technique to tackle this problem when the images contain missing data, which correspond to pixels in shadow and saturated regions. Experimental results considering both synthetic and real images show the good performance of the proposed Alternation-based strategy.
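The idea of alternation on a matrix with missing entries can be illustrated with a minimal sketch. This is a generic alternating least-squares factorization, not the adaptation proposed in the paper, whose details are not given in the abstract. The function name `alternation`, its parameters, and the boolean `mask` convention (True marks an observed pixel) are hypothetical; rank 3 is the usual choice under the Lambertian model and is treated here as an assumption.

```python
import numpy as np

def alternation(W, mask, rank=3, n_iters=50, seed=0):
    """Approximate W by A @ B using only the entries where mask is True.

    Generic alternating least squares (illustrative, hypothetical helper):
    with A fixed, each column of B is a small least-squares problem over the
    observed rows; with B fixed, each row of A is solved the same way.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    A = rng.standard_normal((m, rank))   # random initial left factor
    B = np.zeros((rank, n))
    for _ in range(n_iters):
        # Update B column by column, ignoring missing entries.
        for j in range(n):
            rows = mask[:, j]
            B[:, j] = np.linalg.lstsq(A[rows], W[rows, j], rcond=None)[0]
        # Update A row by row, ignoring missing entries.
        for i in range(m):
            cols = mask[i]
            A[i] = np.linalg.lstsq(B[:, cols].T, W[i, cols], rcond=None)[0]
    return A, B
```

In a photometric stereo setting, `W` would hold one image per row (or per column) and `mask` would be False at shadowed or saturated pixels; the product `A @ B` then provides values for the masked entries. Each update can only lower the squared error on the observed entries, but convergence to the true factors is not guaranteed in general.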
Proceedings 15th International Conference on Pattern Recognition ICPR 2000, Feb 1, 2000
... Albert Pujol, Felipe Lumbreras, Xavier Varona, Juan José Villanueva, Computer Vision Center and Departament d'Informàtica, Edifici O, Universitat Autònoma de Barcelona, 08193 Cerdanyola ... This paper presents a robust architecture for solving this problem in static images. ...
Proceedings ICIP International Conference on Image Processing, Sep 1, 2010
This paper presents the combined use of gradient and mutual information for infrared and intensity template matching. We propose to join: (i) feature matching in a multiresolution context and (ii) information propagation through scale-space representations. Our method consists in combining mutual information with a shape descriptor based on gradient, and propagating them following a coarse-to-fine strategy. The main contributions of this work are: to offer a theoretical formulation towards multimodal stereo matching; to show that gradient and mutual information can be reinforced while they are propagated between consecutive levels; and to show that they are valid cost functions in multimodal template matching. Comparisons are presented showing the improvements and viability of the proposed approach.
Proceedings. 2005 IEEE Intelligent Transportation Systems, 2005
Detection of lane markings based on a camera sensor can be a low cost solution to lane departure warning and lateral control. However, reliable detection is difficult due to cast shadows, vehicles occluding the marks, wear, vehicle motion, etc. The contribution of this paper is twofold. Firstly, we propose to explore another low-level image descriptor, namely, the ridgeness, instead of the gradient magnitude with the aim of getting a more reliable lane marking detection under adverse circumstances. Besides, the proposed measure comes with an associated orientation which is less noisy than the gradient one. Secondly, we have adapted RANSAC, a generic robust estimation method, to fit a parametric model to the image lane lines using both ridgeness and orientation as input data. In short, in this paper a better feature type and a robust fitting method are proposed, which contribute to improving the reliability of lane line detection while still achieving real-time performance.
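For readers unfamiliar with RANSAC, the following minimal sketch shows the generic sample-and-score loop on the simplest possible model, a single straight line through 2D points. It is not the paper's method: the abstract describes fitting a parametric lane model using ridgeness and orientation, whereas this illustration uses point positions only. The function name `ransac_line` and its parameters are hypothetical.

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0, with a^2 + b^2 = 1, by plain RANSAC.

    Illustrative helper (hypothetical): repeatedly samples two points,
    counts the points within inlier_tol of the line through them, keeps
    the largest consensus set, and refits on that set.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)
        d = pts[j] - pts[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0.0:
            continue  # coincident points define no line
        normal = np.array([-d[1], d[0]]) / norm
        dist = np.abs((pts - pts[i]) @ normal)  # point-to-line distances
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set by total least squares (smallest singular vector).
    sel = pts[best_inliers]
    centroid = sel.mean(axis=0)
    _, _, vt = np.linalg.svd(sel - centroid)
    a, b = vt[-1]
    c = -(a * centroid[0] + b * centroid[1])
    return a, b, c, best_inliers
```

Points far from the dominant line, such as responses caused by shadows or occluding vehicles, end up outside the consensus set and do not influence the final fit, which is the property that makes this family of estimators attractive for lane marking detection.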
Lecture Notes in Computer Science, 2007
Detection of lane markings based on a camera sensor can be a low cost solution to lane departure and curve over speed warning. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue due to cast shadows, worn and occluded markings, variable ambient lighting conditions, etc. We focus on increasing the reliability of detection in two ways. Firstly, we employ a different image feature other than the commonly used edges: ridges, which we claim is better suited to this problem. Secondly, we have adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, this fitting is performed for the left and right lane lines simultaneously, thus enforcing a consistent result. We have quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.
International Journal of Automotive Technology, 2010
Detection of lane markings based on a camera sensor can be a low cost solution to lane departure and curve over speed warning. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue due to cast shadows, worn and occluded markings, variable ambient lighting conditions, etc. We focus on increasing the reliability of detection in two ways. Firstly, we employ a different image feature other than the commonly used edges: ridges, which we claim is better suited to this problem. Secondly, we have adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, this fitting is performed for the left and right lane lines simultaneously, thus enforcing a consistent result. Four measures of interest with regard to several driver assistance applications are directly computed from the fitted parametric model at each frame: vehicle yaw angle and lateral offset with regard to the lane medial axis, and lane width and curvature. We have qualitatively assessed our method in video sequences captured on several road types and under very different lighting conditions. Also, we have quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.