Depth from a Light Field Image with Learning-Based Matching Costs
Related papers
Benchmarking of several disparity estimation algorithms for light field processing
2019
A number of high-quality depth image-based rendering (DIBR) pipelines have been developed to reconstruct a 3D scene from several images taken from known camera viewpoints. Due to the specific limitations of each technique, their output is prone to artifacts, so quality cannot be guaranteed. To improve the quality of the most critical and challenging image areas, an exhaustive comparison is required. In this paper, we consider three questions when benchmarking the quality of eight DIBR techniques on light fields: First, how does the density of the original input views affect the quality of the rendered novel views? Second, how does the disparity range between adjacent input views impact the quality? Third, how does each technique behave for different object properties? We compared and evaluated the results visually as well as quantitatively (PSNR, SSIM, AD, and VDP2). The results show that some techniques outperform others in different disparity ranges. They also indicate that using more views does not necessarily result in visually higher quality for all critical image areas. Finally, we show a comparison for scenes of differing complexity, such as those containing non-Lambertian objects.
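As a quick illustration of the quantitative metrics mentioned above, PSNR between a reference view and a rendered novel view reduces to a log-scaled mean squared error. A minimal sketch, not the benchmark's actual evaluation code:

```python
import numpy as np

def psnr(reference, rendered, peak=255.0):
    # Peak signal-to-noise ratio in dB between two 8-bit views.
    mse = np.mean((reference.astype(np.float64) - rendered.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak**2 / mse)

ref = np.full((4, 4), 128.0)
noisy = ref + 2.0  # a constant error of 2 gray levels
print(round(psnr(ref, noisy), 2))  # 42.11
```

SSIM, AD, and VDP2 are more involved perceptual or difference metrics and are not sketched here.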
Robust and dense depth estimation for light field images
IEEE Transactions on Image Processing, 2017
We propose a depth estimation method for light field images. Light field images can be considered as a collection of 2D images taken from different viewpoints arranged in a regular grid. We exploit this configuration and compute the disparity maps between specific pairs of views. This computation is carried out by a state-of-the-art two-view stereo method that provides a non-dense disparity estimation. We propose a disparity interpolation method that increases the density and improves the accuracy of this initial estimate. Disparities obtained from several pairs of views are fused to obtain a unique and robust estimation. Finally, different experiments on synthetic and real images show that the proposed method outperforms state-of-the-art results.
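The fusion step described above, which combines disparities from several view pairs into one robust estimate, can be approximated by a pixel-wise median; the snippet below is an illustrative sketch, not the paper's actual fusion scheme:

```python
import numpy as np

def fuse_disparities(maps):
    # Pixel-wise median of per-pair estimates; NaN marks pixels where a
    # view pair produced no (or an unreliable) estimate.
    return np.nanmedian(np.stack(maps, axis=0), axis=0)

d1 = np.array([[1.0, np.nan], [2.0, 3.0]])
d2 = np.array([[1.2, 5.0], [2.1, np.nan]])
d3 = np.array([[0.9, 5.2], [1.9, 3.2]])
fused = fuse_disparities([d1, d2, d3])
print(fused)
```

The median keeps the fused estimate robust to an outlier from any single view pair, and NaN entries let sparse per-pair maps still contribute where they have data.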
Depth from Combining Defocus and Correspondence Using Light-Field Cameras
2013 IEEE International Conference on Computer Vision, 2013
Light-field cameras have recently become available to the consumer market. An array of micro-lenses captures enough information that one can refocus images after acquisition, as well as shift one's viewpoint within the subapertures of the main lens, effectively obtaining multiple views. Thus, depth cues from both defocus and correspondence are available simultaneously in a single capture. Previously, defocus could be achieved only through multiple image exposures focused at different depths, while correspondence cues needed multiple exposures at different viewpoints or multiple cameras; moreover, both cues could not easily be obtained together.
Accurate depth map estimation from a lenslet light field camera
2015
This paper introduces an algorithm that accurately estimates depth maps using a lenslet light field camera. The proposed algorithm estimates the multi-view stereo correspondences with sub-pixel accuracy using the cost volume. The foundation for constructing accurate costs is threefold. First, the sub-aperture images are displaced using the phase shift theorem. Second, the gradient costs are adaptively aggregated using the angular coordinates of the light field. Third, the feature correspondences between the sub-aperture images are used as additional constraints. With the cost volume, the multi-label optimization propagates and corrects the depth map in weak texture regions. Finally, the local depth map is iteratively refined by fitting a local quadratic function to estimate a non-discrete depth map. Because micro-lens images contain unexpected distortions, a method is also proposed to correct this error. The effectiveness of the proposed algorithm is demonstrated through challenging real-world examples, including comparisons with advanced depth estimation algorithms.
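The first ingredient above, displacing sub-aperture images via the phase shift theorem, amounts to multiplying the image spectrum by a linear phase ramp. A minimal FFT-based sketch, not the paper's exact pipeline:

```python
import numpy as np

def subpixel_shift(image, dx, dy):
    # Fourier phase-shift theorem: a spatial shift by (dx, dy) is a linear
    # phase ramp applied to the image spectrum.
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    ramp = np.exp(-2j * np.pi * (fx * dx + fy * dy))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * ramp))

# Sanity check: an integer shift must agree with np.roll.
img = np.random.default_rng(0).random((8, 8))
print(np.allclose(subpixel_shift(img, 1.0, 0.0), np.roll(img, 1, axis=1)))  # True
```

Unlike interpolation-based warping, the frequency-domain shift introduces no additional blur, which is why it is attractive for building sub-pixel-accurate cost volumes.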
Continuous Depth Map Reconstruction From Light Fields
In this paper, we investigate how the recently emerged photography technology, the light field, can benefit depth map estimation, a challenging computer vision problem. A novel framework is proposed to reconstruct continuous depth maps from light field data. Unlike many traditional methods for the stereo matching problem, the proposed method does not need to quantize the depth range. By making use of the structural information among the densely sampled views in light field data, we can obtain dense and relatively reliable local estimations. Starting from these initial estimations, we propose an optimization method based on iteratively solving a sparse linear system with a conjugate gradient method. Two different affinity matrices for the linear system are employed to balance the efficiency and quality of the optimization. A depth-assisted segmentation method is then introduced so that different segments can employ different affinity matrices. Experimental results on both synthetic and real light fields demonstrate that our continuous results are more accurate and efficient, and preserve more detail than discrete approaches.
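The optimization step, iteratively solving a sparse linear system with conjugate gradients, can be sketched on a toy 1D depth refinement problem. The system (I + λL)d = d0 below, with a Laplacian L standing in for the paper's affinity matrices, is an illustrative assumption, not the authors' formulation:

```python
import numpy as np

def conjugate_gradient(A, b, x0, iters=100, tol=1e-16):
    # Plain conjugate gradients for a symmetric positive-definite system.
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy 1D refinement: (I + lam * L) d = d0, where L is a 1D Laplacian
# (the smoothness/affinity term) and d0 a noisy initial depth estimate.
n = 5
lam = 1.0
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = np.eye(n) + lam * L
d0 = np.array([1.0, 1.1, 3.0, 1.0, 0.9])  # outlier at index 2
d = conjugate_gradient(A, d0, np.zeros(n))
print(d[2] < d0[2])  # True: the outlier is pulled toward its neighbours
```

Because conjugate gradients only needs matrix-vector products, the same loop scales to the large sparse systems that arise when every pixel of a depth map is an unknown.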
Census-Based Cost on Gradients for Matching under Illumination Differences
2015 International Conference on 3D Vision, 2015
Stereo matching is an indispensable process for dense 3D information extraction in a wide range of applications. Relevant methods rely on cost functions and optimization algorithms for estimating accurate disparities. This work analyses a novel cost for stereo matching under radiometric differences in the stereo pair, based on a modification of the widely used census transformation: the census is defined on the image x and y gradients. The modified census (MC) on gradients is evaluated as an independent matching cost in the presence of severe radiometric differences. For this, the original and the modified census transformation (CT) are implemented in three different aggregation schemes, namely fixed rectangular windows, adaptive cross-based support regions, and semi-global matching. It is shown that the MC can provide better results in cases of local radiometric differences, such as different illumination conditions. Thus, this approach can extend the inherent capability of the original CT to address global monotonic radiometric differences.
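The census transform and its Hamming-distance matching cost can be sketched as follows; applying the same transform to the x and y gradient images gives the modified census (MC) described above. This is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def census(window):
    # Census bit string: compare every pixel of the window to its centre.
    center = window[window.shape[0] // 2, window.shape[1] // 2]
    return (window > center).astype(np.uint8).ravel()

def census_cost(win_left, win_right):
    # Matching cost: Hamming distance between the two census strings.
    return int(np.sum(census(win_left) != census(win_right)))

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = a + 50  # global additive offset: the census is unchanged
print(census_cost(a, b))  # 0

# Modified census (MC): the same transform on the x/y gradient images.
gya, gxa = np.gradient(a.astype(float))
gyb, gxb = np.gradient(b.astype(float))
print(census_cost(gxa, gxb) + census_cost(gya, gyb))  # 0
```

The plain census is already invariant to monotonic radiometric changes; moving to gradients additionally discounts smooth local illumination variation, which is the point of the MC.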
Noise-Resilient Depth Estimation for Light Field Images Using Focal Stack and FFT Analysis
Sensors, 2022
Depth estimation for light field images is essential for applications such as light field image compression, reconstruction of perspective views, and 3D reconstruction. Previous depth map estimation approaches do not capture sharp transitions around object boundaries due to occlusions, making many of the current approaches unreliable at depth discontinuities. This is especially the case for light field images, because the pixels do not exhibit photo-consistency in the presence of occlusions. In this paper, we propose an algorithm to estimate the depth map for light field images using depth from defocus. Our approach compares defocus cues over a small patch of pixels in each focal stack image, allowing the algorithm to generate sharper depth boundaries. Then, in contrast to existing approaches that use defocus cues for depth estimation, we use frequency-domain image similarity analysis to generate the depth map. Processing in the frequency domain reduces the individual ...
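A depth-from-defocus pipeline of this kind can be sketched with a frequency-domain sharpness measure: for each patch, pick the focal slice with the most energy away from the DC term. The measure below is a generic assumption for illustration, not the paper's similarity check:

```python
import numpy as np

def focus_measure(patch):
    # Sharpness of a patch: energy of its 2D FFT away from the DC term.
    spectrum = np.abs(np.fft.fft2(patch))
    spectrum[0, 0] = 0.0  # drop the DC component (mean intensity)
    return float(np.sum(spectrum**2))

def depth_from_focal_stack(patches):
    # Depth index of a patch: the focal slice in which it is sharpest.
    return int(np.argmax([focus_measure(p) for p in patches]))

rng = np.random.default_rng(1)
sharp = rng.random((8, 8))            # textured patch, "in focus"
flat = np.full((8, 8), sharp.mean())  # featureless patch, "defocused"
print(depth_from_focal_stack([flat, sharp, flat]))  # 1
```

Using small patches, as the abstract notes, keeps the measure local so depth boundaries are not smeared across occlusions.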
Light field constancy within natural scenes
Applied Optics, 2007
The structure of the light fields of natural scenes is highly complex due to high frequencies in the radiance distribution function. However, it is the low-order properties of light that determine the appearance of common matte materials. We describe the local light field in terms of spherical harmonics and analyze the qualitative properties and physical meaning of the low-order components. We take a first step in the further development of Gershun's classical work on the light field by extending his description beyond the 3D vector field, toward a more complete description of the illumination using tensors. We show that the first three components, namely the monopole (density of light), the dipole (light vector), and the quadrupole (squash tensor), suffice to describe a wide range of qualitatively different light fields.
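The two lowest-order components can be illustrated with discrete direction samples standing in for the spherical-harmonic integrals; the function name and the sampling below are assumptions for illustration only:

```python
import numpy as np

def light_moments(directions, radiance):
    # Monopole: mean radiance over the sampled sphere ("density of light").
    # Dipole: radiance-weighted mean direction (the light vector).
    monopole = radiance.mean()
    dipole = (radiance[:, None] * directions).mean(axis=0)
    return monopole, dipole

dirs = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])  # up and down samples
rad = np.array([2.0, 1.0])                            # more light from above
mono, dip = light_moments(dirs, rad)
print(mono, dip[2])  # 1.5 0.5
```

The nonzero z-component of the dipole captures the net downward flow of light, which is the quantity Gershun's light vector formalizes.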
Performance of phase-based algorithms for disparity estimation
Machine Vision and Applications, 1997
Stereoscopic depth analysis by means of disparity estimation has been a classical topic of computer vision, from the biological models of stereopsis [1] to the widely used techniques based on correlation or sum of squared differences [2]. Most of the recent work on this topic has been devoted to phase-based techniques, developed because of their superior performance and better theoretical grounding [3, 4]. In this article we characterize the performance of phase-based disparity estimators, giving quantitative measures of their precision and their limits, and of how changes in contrast, imbalance, and noise in the two stereo images modify the attainable accuracy. We find that the theoretical range of measurable disparities, one period of the modulation of the filter, is not attainable: the actual range is approximately two-thirds of this value. We show that phase-based disparity estimators are robust to changes in contrast of 100% or more and tolerate luminosity imbalances of 400% between the images composing the stereo pair. Clearing the Gabor filter of its DC component has often been advocated as a means to improve the accuracy of the results. We give a quantitative measure of this improvement and show that using a DC-free Gabor filter leads to disparity estimators nearly insensitive to contrast and imbalance. Our tests show that the most critical source of error is noise: the error increases linearly with the noise level. We conclude by studying the influence of the spectra and the luminosity of the input images on the error surface, for both artificial and natural images, showing that the spectral structure of the images has little influence on the results, changing only the form of the error surface near the limits of the detectable disparity range. Overall, this study allows estimation of the expected accuracy of custom-designed phase-based stereo analyzers for a combination of the most common error sources.
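The core idea of phase-based disparity estimation can be sketched in 1D: the phase difference of the two Gabor responses, divided by the filter's centre frequency, estimates the shift. The filter parameters below (Gaussian width, centre frequency) are illustrative assumptions:

```python
import numpy as np

def phase_disparity(left, right, freq):
    # Disparity from the phase difference of complex Gabor responses,
    # divided by the filter's centre frequency (1D sketch).
    n = left.size
    x = np.arange(n) - n // 2
    sigma = n / 6.0
    gabor = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * freq * x)
    resp_l = np.sum(left * gabor)
    resp_r = np.sum(right * gabor)
    dphi = np.angle(resp_r * np.conj(resp_l))
    return dphi / (2 * np.pi * freq)

n = 64
x = np.arange(n) - n // 2
freq = 0.1
true_d = 1.5
left = np.cos(2 * np.pi * freq * x)
right = np.cos(2 * np.pi * freq * (x - true_d))  # left shifted by 1.5 px
print(abs(phase_disparity(left, right, freq) - true_d) < 0.05)  # True
```

Because the phase wraps at ±π, the measurable disparity is bounded by one filter period; the article's finding is that in practice only about two-thirds of that range is usable.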
Depth Estimation using Light-Field Cameras
2014
Plenoptic cameras, or light field cameras, are a recent type of imaging device that is starting to regain some popularity. These cameras are able to acquire the plenoptic function (4D light field) and, consequently, to output the depth of a scene by making use of the redundancy created by the multi-view geometry, in which a single 3D point is imaged several times. Despite the attention given in the literature to standard plenoptic cameras such as Lytro, due to their simplicity and lower price, we based our work on results obtained from a multi-focus plenoptic camera (Raytrix, in our case), due to its quality and higher-resolution images. In this master thesis, we present an automatic method to estimate the virtual depth of a scene. Since the capture is done using a multi-focus plenoptic camera, we are working with multi-view geometry and lenses with different focal lengths, and we can use that to back-trace the rays in order to obtain the depth. We start by finding salient poin...