Material and Lighting Reconstruction for Complex Indoor Scenes with Texture-space Differentiable Rendering
Related papers
Deep scene-scale material estimation from multi-view indoor captures
Computers & Graphics
The movie and video game industries have adopted photogrammetry as a way to create digital 3D assets from multiple photographs of a real-world scene. But photogrammetry algorithms typically output an RGB texture atlas of the scene that only serves as visual guidance for skilled artists to create material maps suitable for physically-based rendering. We present a learning-based approach that automatically produces digital assets ready for physically-based rendering, by estimating approximate material maps from multi-view captures of indoor scenes that are used with retopologized geometry. We base our approach on a material estimation Convolutional Neural Network (CNN) that we execute on each input image. We leverage the view-dependent visual cues provided by the multiple observations of the scene by gathering, for each pixel of a given image, the color of the corresponding point in the other images. This image-space CNN provides us with an ensemble of predictions, which we merge in texture space as the last step of our approach. Our results demonstrate that the recovered assets can be directly used for physically-based rendering and editing of real indoor scenes from any viewpoint and under novel lighting. Our method generates approximate material maps in a fraction of the time required by the closest previous solutions.
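To make the multi-view gathering step concrete, here is a minimal sketch of reprojecting each pixel of a reference view into another calibrated view to fetch the color it observes. Depth maps, intrinsics `K`, and camera poses are assumed available; all names are illustrative, not the paper's actual code.

```python
# Sketch: for each pixel of a reference view, look up the color seen by another view.
import numpy as np

def backproject(depth, K, cam_to_world):
    """Lift every pixel of a depth map to world-space 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    rays = np.stack([u, v, np.ones_like(u)], axis=-1) @ np.linalg.inv(K).T
    pts_cam = rays * depth[..., None]                        # camera-space points
    return pts_cam @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]   # (h, w, 3)

def gather_colors(pts_world, image, K, world_to_cam):
    """Sample another view's color at the projection of each world point."""
    pts_cam = pts_world @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    uv = pts_cam[..., :2] / np.clip(pts_cam[..., 2:3], 1e-6, None)
    uv = uv @ K[:2, :2].T + K[:2, 2]                          # pixel coordinates
    h, w = image.shape[:2]
    x = np.clip(np.round(uv[..., 0]).astype(int), 0, w - 1)   # nearest-neighbor lookup
    y = np.clip(np.round(uv[..., 1]).astype(int), 0, h - 1)
    valid = pts_cam[..., 2] > 0                               # point is in front of camera
    return image[y, x], valid
```

Stacking such gathered colors per pixel is one way the multi-view observations could be fed to an image-space CNN; occlusion handling is omitted here.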
Towards Scalable Multi-View Reconstruction of Geometry and Materials
arXiv (Cornell University), 2023
In this paper, we propose a novel method for joint recovery of camera pose, object geometry and spatially-varying Bidirectional Reflectance Distribution Function (svBRDF) of 3D scenes that exceed object-scale and hence cannot be captured with stationary light stages. The input is high-resolution RGB-D images captured by a mobile, hand-held capture system with point lights for active illumination. Compared to previous works that jointly estimate geometry and materials from a hand-held scanner, we formulate this problem using a single objective function that can be minimized using off-the-shelf gradient-based solvers. To facilitate scalability to large numbers of observation views and optimization variables, we introduce a distributed optimization algorithm that reconstructs 2.5D keyframe-based representations of the scene. A novel multi-view consistency regularizer effectively synchronizes neighboring keyframes such that the local optimization results allow for seamless integration into a globally consistent 3D model. We provide a study on the importance of each component in our formulation and show that our method compares favorably to baselines. We further demonstrate that our method accurately reconstructs various objects and materials and allows for expansion to spatially larger scenes. We believe that this work represents a significant step towards making geometry and material estimation from hand-held scanners scalable.
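As a rough illustration of casting the problem as a single objective handled by an off-the-shelf gradient solver, the sketch below optimizes per-keyframe albedo maps with a photometric term plus a simple consistency regularizer between neighboring keyframes. The `render` placeholder and the naive image-space overlap are assumptions, not the paper's formulation.

```python
# Sketch: one objective (photometric + keyframe consistency) minimized with Adam.
import torch

n_keyframes, h, w = 4, 64, 64
albedo = [torch.rand(3, h, w, requires_grad=True) for _ in range(n_keyframes)]
observed = [torch.rand(3, h, w) for _ in range(n_keyframes)]   # captured RGB stand-ins

def render(a):
    # Placeholder forward model; a real system would evaluate the svBRDF
    # under the scanner's point lights here.
    return a

opt = torch.optim.Adam(albedo, lr=1e-2)
for step in range(200):
    opt.zero_grad()
    photometric = sum(((render(a) - o) ** 2).mean() for a, o in zip(albedo, observed))
    # Consistency between neighboring keyframes (naive overlap assumed for brevity).
    consistency = sum(((albedo[i] - albedo[i + 1]) ** 2).mean()
                      for i in range(n_keyframes - 1))
    loss = photometric + 0.1 * consistency
    loss.backward()
    opt.step()
```

In a distributed setting, each keyframe's parameters could be optimized locally while the consistency term synchronizes overlapping neighbors.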
3D Scene Creation and Rendering via Rough Meshes: A Lighting Transfer Avenue
arXiv (Cornell University), 2022
This paper studies how to flexibly integrate reconstructed 3D models into practical 3D modeling pipelines such as 3D scene creation and rendering. Due to the technical difficulty, one can only obtain rough 3D models (R3DMs) for most real objects using existing 3D reconstruction techniques. As a result, physically-based rendering (PBR) would render low-quality images or videos for scenes that are constructed by R3DMs. One promising solution would be representing real-world objects as Neural Fields such as NeRFs, which are able to generate photo-realistic renderings of an object under desired viewpoints. However, a drawback is that the synthesized views through Neural Fields Rendering (NFR) cannot reflect the simulated lighting details on R3DMs in PBR pipelines, especially when object interactions in the 3D scene creation cause local shadows. To solve this dilemma, we propose a lighting transfer network (LighTNet) to bridge NFR and PBR, such that they can benefit from each other. LighTNet reasons about a simplified image composition model, remedies the uneven surface issue caused by R3DMs, and is empowered by several perceptually motivated constraints and a new Lab angle loss that enhances the contrast between lighting strength and colors. Comparisons demonstrate that LighTNet is superior in synthesizing impressive lighting, and is promising in pushing NFR further in practical 3D modeling workflows. Project page: https://3d-frontfuture.github.io/LighTNet.
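One plausible reading of the Lab angle loss, sketched below, compares a composited result and a reference by the angle between per-pixel Lab vectors, so lightness and chromaticity are penalized jointly. This is an assumption about the loss's form, not LighTNet's exact definition.

```python
# Sketch: angle between per-pixel Lab vectors as a color/lighting discrepancy measure.
import numpy as np
from skimage.color import rgb2lab

def lab_angle_loss(pred_rgb, target_rgb, eps=1e-6):
    """Mean angle (radians) between per-pixel Lab vectors of two RGB images in [0, 1]."""
    p = rgb2lab(pred_rgb).reshape(-1, 3)
    t = rgb2lab(target_rgb).reshape(-1, 3)
    cos = np.sum(p * t, axis=1) / (
        np.linalg.norm(p, axis=1) * np.linalg.norm(t, axis=1) + eps)
    return np.mean(np.arccos(np.clip(cos, -1.0, 1.0)))
```

Training a network with such a term would use a differentiable Lab conversion; the NumPy/scikit-image version here only illustrates the geometry of the comparison.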
Joint Texture and Geometry Optimization for RGB-D Reconstruction
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Due to inevitable noise and quantization error, 3D models reconstructed with RGB-D sensors are always affected by geometric error and camera drift, which lead to blurred and unnatural texture mapping results. Most 3D reconstruction methods focus on either geometry refinement or texture improvement alone, which artificially decouples the interrelationship between geometry and texture. In this paper, we propose a novel approach that jointly optimizes the camera poses, texture, and geometry of the reconstructed model, as well as the color consistency between keyframes. Instead of computing expensive Shape-From-Shading (SFS), our method directly optimizes the reconstructed mesh according to color and geometric consistency and high-boost normal cues, which effectively overcomes the texture-copy problem produced by SFS and achieves more detailed shape reconstruction. Because the joint optimization involves multiple correlated terms, we further introduce an iterative framework that interleaves their optimization. The experiments demonstrate that our method can recover not only fine-scale geometry but also high-fidelity texture.
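The high-boost cue mentioned above is standard image sharpening applied to a shading or normal signal; a minimal sketch, assuming a Gaussian blur as the low-pass step:

```python
# Sketch: high-boost filtering amplifies the detail layer of a signal.
import numpy as np
from scipy.ndimage import gaussian_filter

def high_boost(signal, sigma=2.0, k=1.5):
    """signal + k * (signal - blur): keeps the base layer, exaggerates fine detail."""
    blurred = gaussian_filter(signal, sigma=sigma)
    return signal + k * (signal - blurred)
```

Boosted detail in the shading or normal domain can then serve as a target cue when refining mesh geometry, instead of a full Shape-From-Shading solve.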
SIR: Multi-view Inverse Rendering with Decomposable Shadow for Indoor Scenes
arXiv (Cornell University), 2024
We propose SIR, an efficient method to decompose differentiable shadows for inverse rendering on indoor scenes using multi-view data, addressing the challenge of accurately decomposing materials and lighting conditions. Unlike previous methods that struggle with shadow fidelity in complex lighting environments, our approach explicitly learns shadows for enhanced realism in material estimation under unknown light positions. Using posed HDR images as input, SIR employs an SDF-based neural radiance field for comprehensive scene representation. SIR then integrates a shadow term with a three-stage material estimation approach to improve SVBRDF quality. Specifically, SIR is designed to learn a differentiable shadow, complemented by BRDF regularization, to optimize inverse rendering accuracy. Extensive experiments on both synthetic and real-world indoor scenes demonstrate the superior performance of SIR over existing methods in both quantitative metrics and qualitative analysis. SIR's decomposition ability enables sophisticated editing capabilities such as free-viewpoint relighting, object insertion, and material replacement. The code and data are available at https://xiaokangwei.github.io/SIR/.
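As a hedged sketch of how a shadow term folds into direct-lighting shading (rendering = shadow x BRDF x incident light x cosine), the snippet below uses a Lambertian BRDF and a single point light as placeholders for SIR's learned components.

```python
# Sketch: shadow-weighted direct lighting with a Lambertian placeholder BRDF.
import numpy as np

def shade(albedo, normals, points, light_pos, light_rgb, shadow):
    """albedo, normals, points: (..., 3); shadow in [0, 1] plays the role of the learned term."""
    wi = light_pos - points
    dist = np.linalg.norm(wi, axis=-1, keepdims=True)
    wi = wi / np.clip(dist, 1e-6, None)
    cos = np.clip(np.sum(normals * wi, axis=-1, keepdims=True), 0.0, None)
    brdf = albedo / np.pi                                   # Lambertian placeholder
    return shadow[..., None] * brdf * light_rgb * cos / np.clip(dist ** 2, 1e-6, None)
```

Making the shadow factor itself a learned, differentiable quantity is what lets it be optimized jointly with the material parameters.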
Inverse Path Tracing for Joint Material and Lighting Estimation
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Our Inverse Path Tracing algorithm takes as input a 3D scene and up to several RGB images, and estimates the material as well as the lighting parameters of the scene. The main contribution of our approach is the formulation of an end-to-end differentiable inverse Monte Carlo renderer, which is used within a nested stochastic gradient descent optimization.
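A minimal sketch of the nested idea, where each stochastic gradient step differentiates through a Monte Carlo render estimate; the toy one-bounce estimator below stands in for an actual differentiable path tracer, and all quantities are illustrative.

```python
# Sketch: SGD over material/emission parameters, with a stochastic render inside each step.
import torch

albedo = torch.full((3,), 0.5, requires_grad=True)      # one diffuse material
emission = torch.full((3,), 1.0, requires_grad=True)    # one area light

def render_mc(albedo, emission, n_samples=16):
    # Toy one-bounce estimator: random weights stand in for sampled path throughputs.
    weights = torch.rand(n_samples, 1)
    return (weights * emission * albedo / torch.pi).mean(dim=0)

target = torch.tensor([0.08, 0.05, 0.03])                # "observed" pixel color
opt = torch.optim.SGD([albedo, emission], lr=0.05)
for step in range(500):
    opt.zero_grad()
    loss = ((render_mc(albedo, emission) - target) ** 2).mean()
    loss.backward()
    opt.step()
```

The nesting comes from the renderer itself being a sampled estimator: gradients are taken through a stochastic estimate rather than an exact rendering integral.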
Fast environment extraction for lighting and occlusion of virtual objects in real scenes
… (MMSP), 2010 IEEE …, 2010
Augmented reality aims to insert virtual objects into real scenes. To obtain a coherent and realistic integration, these objects have to be relit according to their positions and the real lighting conditions. They also have to handle occlusion by the nearer parts of the real scene. To achieve this, we have to extract photometry and geometry from the real scene. In this paper, we adapt high dynamic range reconstruction and depth estimation methods to deal with real-time constraints and consumer devices. We present their limitations along with the significant parameters influencing computing time and image quality. We tune these parameters to accelerate computation and evaluate their impact on the resulting quality. To fit the augmented reality context, we propose a real-time, single-pass extraction of this information from video streams.
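The HDR reconstruction such a pipeline builds on can be sketched as the classic multi-exposure merge below, where each exposure contributes a radiance estimate weighted by how well exposed the pixel is; a linear camera response is assumed for brevity, and the function name is illustrative.

```python
# Sketch: weighted merge of multiple exposures into a single radiance map.
import numpy as np

def merge_hdr(images, exposure_times):
    """images: list of float arrays in [0, 1]; exposure_times: matching shutter times."""
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)        # hat weight: trust mid-tones most
        num += w * img / t                        # radiance estimate from this exposure
        den += w
    return num / np.clip(den, 1e-6, None)
```

Real-time variants mostly trade the number of exposures and the weighting/response model against quality, which is the parameter space the paper explores.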
Inverse global illumination: Recovering reflectance models of real scenes from photographs
1999
In this paper we present a method for recovering the reflectance properties of all surfaces in a real scene from a sparse set of photographs, taking into account both direct and indirect illumination. The result is a lighting-independent model of the scene's geometry and reflectance properties, which can be rendered with arbitrary modifications to structure and lighting via traditional rendering methods.
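A toy illustration of the underlying relation in a diffuse radiosity setting: if the observed radiosity B, emission E, and form factors F were known, reflectance would follow from B = E + rho (F B). Real scenes require photograph-based estimates and hierarchical solvers, so this is only the core identity with made-up values.

```python
# Sketch: recovering diffuse reflectance from observed radiosity in a tiny closed scene.
import numpy as np

n = 4                                                        # toy scene with 4 patches
F = np.full((n, n), 1.0 / (n - 1)); np.fill_diagonal(F, 0.0) # made-up form factors
rho_true = np.array([0.2, 0.5, 0.7, 0.3])
E = np.array([1.0, 0.0, 0.0, 0.0])                           # one emitting patch

# Simulate the "observed" radiosity: B = E + rho * (F @ B)  =>  (I - diag(rho) F) B = E
B = np.linalg.solve(np.eye(n) - rho_true[:, None] * F, E)

irradiance = F @ B                                           # indirect light arriving at each patch
rho_est = (B - E) / np.clip(irradiance, 1e-9, None)          # recovered reflectance
print(rho_est)                                               # matches rho_true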
Fast Extraction of BRDFs and Material Maps from Images
Graphics Interface, 2003
Modeling complex realistic objects is a difficult and time-consuming process. Nevertheless, with improvements in rendering speed and quality, more and more applications require such realistic complex 3D objects. We present an interactive modeling system that extracts 3D objects from photographs. Our key contribution lies in the tight integration of a point-based representation and user interactivity, by introducing a set of interactive tools to guide reconstruction. 3D color points are a flexible and effective representation for very complex objects; adding, moving, or removing points is fast and simple, facilitating easy improvement of object quality. Because images and depth maps can be very rapidly generated from points, testing the validity of point projections in several images is efficient and simple. These properties allow our system to rapidly generate a first approximate model, and allow the user to continuously and interactively guide the generation of points, both locally an...
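The projection-validity test described above can be sketched as follows: a candidate 3D color point is kept only if its projections into several calibrated photographs observe similar colors. Projection conventions and the threshold are illustrative assumptions, not the system's actual code.

```python
# Sketch: accept a 3D point if its reprojections into several views agree in color.
import numpy as np

def project(point, K, world_to_cam):
    p = world_to_cam[:3, :3] @ point + world_to_cam[:3, 3]
    uv = K @ (p / p[2])
    return int(round(uv[0])), int(round(uv[1])), p[2] > 0

def is_consistent(point, images, Ks, poses, tol=0.1):
    samples = []
    for img, K, pose in zip(images, Ks, poses):
        x, y, in_front = project(point, K, pose)
        if in_front and 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
            samples.append(img[y, x])
    if len(samples) < 2:
        return False                             # not enough observations to judge
    return np.max(np.std(np.stack(samples), axis=0)) < tol
```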
Fast Spatially-Varying Indoor Lighting Estimation
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We propose a real-time method to estimate spatially-varying indoor lighting from a single RGB image. Given an image and a 2D location in that image, our CNN estimates a 5th-order spherical harmonic representation of the lighting at that location in less than 20 ms on a laptop mobile graphics card. While existing approaches estimate a single, global lighting representation or require depth as input, our method reasons about local lighting without requiring any geometry information. We demonstrate, through quantitative experiments including a user study, that our results achieve lower lighting estimation errors and are preferred by users over the state of the art. Our approach can be used directly for augmented reality applications, where a virtual object is relit realistically at any position in the scene in real time.
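To show what a 5th-order spherical harmonic lighting representation means in use, the sketch below evaluates a 36-coefficient-per-channel expansion in a given direction; the coefficient array is random stand-in data, not the network's output.

```python
# Sketch: evaluating incoming radiance from a 5th-order real spherical harmonic expansion.
import numpy as np
from scipy.special import sph_harm

def real_sh(l, m, theta, phi):
    """Real spherical harmonic; theta = azimuth, phi = polar angle (scipy convention)."""
    if m > 0:
        return np.sqrt(2.0) * (-1) ** m * sph_harm(m, l, theta, phi).real
    if m < 0:
        return np.sqrt(2.0) * (-1) ** m * sph_harm(-m, l, theta, phi).imag
    return sph_harm(0, l, theta, phi).real

def eval_sh_lighting(coeffs, theta, phi, order=5):
    """coeffs: ((order+1)**2, 3) RGB coefficients; returns radiance from direction (theta, phi)."""
    radiance = np.zeros(3)
    i = 0
    for l in range(order + 1):
        for m in range(-l, l + 1):
            radiance += coeffs[i] * real_sh(l, m, theta, phi)
            i += 1
    return radiance

coeffs = np.random.rand(36, 3) * 0.1               # placeholder for predicted coefficients
print(eval_sh_lighting(coeffs, theta=1.0, phi=0.5))
```

A 5th-order expansion uses (5+1)^2 = 36 coefficients per color channel, which is compact enough to predict per image location and to relight inserted objects in real time.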