Akshat Dave - Academia.edu
Papers by Akshat Dave
ACM SIGGRAPH 2023 Courses
arXiv (Cornell University), Dec 8, 2022
Figure 1. Objects as radiance-field cameras. We convert everyday objects with unknown geometry (a) into radiance-field cameras by modeling multi-view reflections (b) as projections of the 5D radiance field of the environment. We convert the object surface into a virtual sensor to capture this radiance field (c), which enables depth and radiance estimation of the surrounding environment. We can then query this radiance field to perform beyond-field-of-view novel view synthesis of the environment (d).
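The core geometric step in treating a reflective object as a camera is simple to state: each camera ray that hits the object surface is mirror-reflected about the local normal, and the reflected ray samples the environment's radiance field. The following is a minimal sketch of that step only; `query_env_radiance` and `env_field` are hypothetical stand-ins for the paper's learned radiance-field representation, not its API.

```python
import numpy as np

def reflect(d, n):
    """Mirror-reflect ray direction d about unit surface normal n."""
    return d - 2.0 * np.dot(d, n) * n

def query_env_radiance(origin, direction, env_field):
    """Query a 5D environment radiance field (3D origin + 2D direction).
    `env_field` is a hypothetical callable standing in for a learned field."""
    return env_field(origin, direction)

# A camera ray hits the object surface at point p with normal n;
# the reflected ray turns that surface point into a 'virtual pixel'.
p = np.array([0.0, 0.0, 1.0])          # surface point
n = np.array([0.0, 0.0, 1.0])          # unit normal at p
d_cam = np.array([0.6, 0.0, -0.8])     # incoming view direction (unit)
d_ref = reflect(d_cam, n)              # direction that samples the environment

toy_env = lambda o, d: np.clip(d, 0.0, 1.0)   # toy stand-in environment field
radiance = query_env_radiance(p, d_ref, toy_env)
```

In the paper the object geometry is itself unknown and recovered jointly, so the surface point and normal above would come from the estimated geometry rather than being given.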
2022 IEEE International Symposium on Circuits and Systems (ISCAS)
Optics Express
We present a polarization-based approach to perform diffuse-specular separation from a single polarimetric image, acquired using a flexible, practical capture setup. Our key technical insight is that, unlike previous polarization-based separation methods that assume completely unpolarized diffuse reflectance, we use a more general polarimetric model that accounts for partially polarized diffuse reflections. We capture the scene with a polarimetric sensor and produce an initial analytical diffuse-specular separation that we further pass into a deep network trained to refine the separation. We demonstrate that our combination of analytical separation and deep network refinement produces state-of-the-art diffuse-specular separation, which enables image-based appearance editing of dynamic scenes and enhanced appearance estimation.
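For context, the classical analytical split that this work generalizes assumes the linearly polarized component of the light is entirely specular. A minimal NumPy sketch of that baseline, starting from four polarizer-angle captures; the paper's partially-polarized-diffuse model replaces the last step, and its trained network then refines the result:

```python
import numpy as np

def stokes_from_polarizer_images(I0, I45, I90, I135):
    """Linear Stokes parameters from four polarizer-angle captures."""
    s0 = 0.5 * (I0 + I45 + I90 + I135)   # total intensity
    s1 = I0 - I90
    s2 = I45 - I135
    return s0, s1, s2

def analytic_separation(I0, I45, I90, I135):
    """Classical split: treat the fully polarized component as specular.
    Under a partially-polarized-diffuse model this is only an
    initialization, to be refined (e.g., by a learned network)."""
    s0, s1, s2 = stokes_from_polarizer_images(I0, I45, I90, I135)
    polarized = np.sqrt(s1**2 + s2**2)    # linearly polarized intensity
    specular = polarized
    diffuse = np.clip(s0 - polarized, 0.0, None)
    return diffuse, specular
```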
2022 IEEE International Conference on Computational Photography (ICCP)
2017 IEEE International Conference on Image Processing (ICIP), 2017
We propose to use a deep generative model, RIDE [1], as an image prior for compressive signal recovery. Since RIDE models long-range dependencies in images using spatial LSTMs, it recovers images better than competing methods.
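Concretely, using a generative model as a prior amounts to MAP estimation. A standard formulation, with y the compressive measurements, Φ the sensing matrix, and λ a regularization weight (the exact objective and weighting used in the paper may differ):

```latex
\hat{x} \;=\; \arg\min_{x}\; \tfrac{1}{2}\,\lVert y - \Phi x \rVert_2^2 \;-\; \lambda\,\log p_{\mathrm{RIDE}}(x)
```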
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021
We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observation is that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (which can be accurately modeled using physics) and a scene-specific radiance flux (which is well represented using a deep network-based regularizer). DeepIR requires neither training data nor periodic ground-truth calibration with a known black-body target, making it well suited for practical computer vision tasks. We demonstrate the power of going DeepIR by developing new denoising and super-resolution algorithms that exploit multiple images of the scene captured with camera jitter. Simulated and real data experiments demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images, achieving a 10 dB PSNR improvement over competing approaches.
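The factorization can be written as a small optimization: each jittered capture is modeled as per-pixel gain and offset applied to a shifted crop of a network-represented scene. A hedged sketch of the data term, assuming a hypothetical `scene_net` interface; the paper's full sensor model and optimization details differ:

```python
import torch

def deepir_loss(captures, shifts, scene_net, gain, offset):
    """Data term of the factored forward model (a sketch, not the authors' code):
    each jittered capture y_i ~ gain * shift_i(scene) + offset, where `scene`
    comes from a deep image prior and (gain, offset) are per-pixel,
    scene-independent non-uniformities.
    captures: list of (H, W) tensors; shifts: list of integer (dy, dx)
    camera-jitter offsets; scene_net: callable returning a padded scene."""
    scene = scene_net()                         # (H_pad, W_pad) radiance estimate
    loss = 0.0
    for y, (dy, dx) in zip(captures, shifts):
        h, w = y.shape
        cropped = scene[dy:dy + h, dx:dx + w]   # apply the known jitter
        pred = gain * cropped + offset          # sensor non-uniformity model
        loss = loss + torch.mean((pred - y) ** 2)
    return loss
```

Jointly minimizing this over the network weights and the per-pixel (gain, offset) separates the scene-dependent flux from the slowly varying sensor non-uniformities.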
IITMSAT is a student-built nanosatellite mission of the Indian Institute of Technology Madras, Chennai, India. The objective is to study the precipitation of high-energy electrons and protons from the Van Allen radiation belts to lower altitudes of 600–900 km due to resonant interaction with low-frequency EM waves. The unique communications system design of IITMSAT evolves from the challenging downlink data requirement of 1 MB per day in the UHF band posed by the mission and the satellite's payload, SPEED (Space-based Proton and Electron Energy Detector). To ensure a continuous downlink data stream over the short low-Earth-orbit passes, a robust physical layer protocol was designed to counter time-varying aspects of a space-Earth telecom link. For the on-board communications system, two types of design alternatives exist for each module. The first option is a custom design wherein a module is developed from scratch using discrete components. The other option is an integrated design wherein ...
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
Non-line-of-sight (NLOS) imaging aims to reconstruct scenes outside the field of view of an imaging system. A common approach is to measure the so-called light transients, which facilitates reconstruction through ellipsoidal tomography, i.e., solving a linear least-squares problem. Unfortunately, the corresponding linear operator is very high-dimensional and lacks structure that would facilitate fast solvers, so the ensuing optimization is a computationally daunting task. We introduce a computationally tractable framework for solving the ellipsoidal tomography problem. Our main observation is that the Gram of the ellipsoidal tomography operator is convolutional, either exactly under certain idealized imaging conditions, or approximately in practice. This, in turn, allows us to obtain the ellipsoidal tomography solution by using efficient deconvolution procedures to solve a linear least-squares problem involving the Gram operator. The computational tractability of our approach also facilitates the use of various regularizers during the deconvolution procedure. We demonstrate the advantages of our framework in a variety of simulated and real experiments.
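To see why the convolutional Gram helps: when A^T A acts as convolution with some kernel k, the normal equations A^T A x = A^T y can be solved with FFTs instead of a huge generic least-squares solve. A minimal sketch with simple Tikhonov regularization; estimating the kernel and the richer regularizers discussed in the paper are not shown:

```python
import numpy as np

def solve_gram_deconv(b, k, lam=1e-3):
    """Solve (A^T A) x = b assuming A^T A acts as convolution with kernel k,
    via FFT-based Tikhonov-regularized deconvolution.
    b: backprojected measurements A^T y; k: the (estimated) Gram kernel,
    same shape as b, centered at index 0."""
    K = np.fft.fftn(k)
    B = np.fft.fftn(b)
    X = np.conj(K) * B / (np.abs(K) ** 2 + lam)   # Wiener/Tikhonov filter
    return np.real(np.fft.ifftn(X))
```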
2019 IEEE International Conference on Computational Photography (ICCP), 2019
Over the last decade, several techniques have been developed for looking around the corner by exploiting the round-trip travel time of photons. Typically, these techniques necessitate the collection of a large number of measurements with varying virtual source and virtual detector locations. This data is then processed by a reconstruction algorithm to estimate the hidden scene. As a consequence, even when the region of interest in the hidden volume is small and limited, the acquisition time needed is large, as the entire dataset has to be acquired and then processed. In this paper, we present the first example of a scanning-based non-line-of-sight imaging technique. The key idea is that if the virtual sources (pulsed sources) on the wall are delayed using a quadratic delay profile (much like the quadratic phase of a focusing lens), then these pulses arrive at the same instant at a single point in the hidden volume – the point being scanned. On the imaging side, applying quadratic delays to the virtual detectors before integration on a single gated detector allows us to ‘focus’ on and scan each point in the hidden volume. By changing the quadratic delay profiles, we can focus light at different points in the hidden volume. This allows us to concentrate our measurements only in the region of interest. We derive the theoretical underpinnings of ‘temporal focusing’, present simulations analyzing its performance, build a hardware prototype system and demonstrate real results.
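The delay profile that makes all pulses arrive simultaneously at a hidden focal point is just the one that compensates each source's travel distance; for sources on a plane and a focus near its axis, it is approximately quadratic in the source coordinates, which is the lens analogy in the abstract. A minimal sketch of computing such delays (illustrative geometry, not the paper's hardware pipeline):

```python
import numpy as np

C = 3e8  # speed of light, m/s

def focusing_delays(sources, focus):
    """Per-source emission delays so that pulses from virtual sources on the
    wall arrive at `focus` simultaneously (the 'temporal lens' idea).
    sources: (N, 3) array of virtual source positions; focus: (3,) point."""
    d = np.linalg.norm(sources - focus, axis=1)   # travel distance per source
    return (d.max() - d) / C                       # delay the nearer sources more

# Example: a 5x5 grid of virtual sources on the wall z = 0,
# focusing at a hidden point 1 m away.
xs, ys = np.meshgrid(np.linspace(-0.5, 0.5, 5), np.linspace(-0.5, 0.5, 5))
sources = np.stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)], axis=1)
delays = focusing_delays(sources, np.array([0.0, 0.0, 1.0]))
```

Applying the mirror-image delays to the virtual detectors before gated integration performs the corresponding focusing on the imaging side.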
IEEE Transactions on Computational Imaging, 2018
Signal reconstruction is a challenging aspect of computational imaging, as it often involves solving ill-posed inverse problems. Recently, deep feed-forward neural networks have led to state-of-the-art results in solving various inverse imaging problems. However, being task-specific, these networks have to be learned for each inverse problem. A more flexible approach is to learn a deep generative model once and then use it as a signal prior for solving various inverse problems. We show that among the various state-of-the-art deep generative models, autoregressive models are especially suitable for our purpose, for the following reasons. First, they explicitly model pixel-level dependencies and hence can reconstruct low-level details such as texture patterns and edges better. Second, they provide an explicit expression for the image prior, which can then be used for MAP-based inference along with the forward model. Third, they can model long-range dependencies in images, which makes them ideal for handling the global multiplexing encountered in various compressive imaging systems. We demonstrate the efficacy of our proposed approach in solving three computational imaging problems: Single Pixel Camera (SPC), LiSens and FlatCam. For both real and simulated cases, we obtain better reconstructions than the state-of-the-art methods in terms of perceptual and quantitative metrics.
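A gradient-based sketch of the MAP inference such a prior enables. Here `log_prior` is a stand-in for the autoregressive model's differentiable log-likelihood, and the solver and hyperparameters are illustrative assumptions, not the paper's:

```python
import torch

def map_recover(y, Phi, log_prior, lam=0.05, steps=500, lr=1e-2):
    """MAP recovery with a generative image prior (a sketch).
    y: (M,) measurements; Phi: (M, N) sensing matrix; log_prior: callable
    returning log p(x) for a flattened image x."""
    x = torch.zeros(Phi.shape[1], requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        data_term = 0.5 * torch.sum((Phi @ x - y) ** 2)  # forward-model fit
        loss = data_term - lam * log_prior(x)            # MAP objective
        loss.backward()
        opt.step()
    return x.detach()
```

The same routine serves SPC, LiSens and FlatCam by swapping in the appropriate forward operator Phi, which is the flexibility argument the abstract makes.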
IEEE Transactions on Computational Imaging, 2018
Reconstructing an object's geometry and appearance from multiple images, also known as inverse rendering, is a fundamental problem in computer graphics and vision. Inverse rendering is inherently ill-posed because the captured image is an intricate function of unknown lighting conditions, material properties and scene geometry. Recent progress in representing scene properties as coordinate-based neural networks has facilitated neural inverse rendering, resulting in impressive geometry reconstruction and novel-view synthesis. Our key insight is that polarization is a useful cue for neural inverse rendering, as polarization strongly depends on surface normals and is distinct for diffuse and specular reflectance. With the advent of commodity on-chip polarization sensors, capturing polarization has become practical. Thus, we propose PANDORA, a polarimetric inverse rendering approach based on implicit neural representations. From multi-view polarization images of an object, PANDORA jointly extracts the object's 3D geometry, separates the outgoing radiance into diffuse and specular components, and estimates the illumination incident on the object. We show that PANDORA outperforms state-of-the-art radiance decomposition techniques. PANDORA outputs clean surface reconstructions free from texture artefacts, models strong specularities accurately and estimates illumination under practical unstructured scenarios.
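The polarization cues in question are easy to compute from per-pixel Stokes maps, which on-chip polarization sensors provide directly. A small sketch; how PANDORA consumes these cues inside its implicit neural representation is not shown here:

```python
import numpy as np

def dolp_aop(s0, s1, s2):
    """Degree and angle of linear polarization from per-pixel Stokes maps.
    Both are strong inverse-rendering cues: the AoP relates to the azimuth
    of the surface normal, and the DoLP differs markedly between diffuse
    and specular reflections."""
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-8)
    aop = 0.5 * np.arctan2(s2, s1)
    return dolp, aop
```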
Existing non-line-of-sight imaging techniques suffer from a tradeoff between field of view and spatial resolution. We propose an imaging system that tackles this tradeoff by efficiently combining information from transient imaging and correlography subsystems.