Neal Wadhwa
About Me
I work on computer vision and machine learning problems, with a focus on applications in photography. My work at Google is the basis for Portrait Mode and Magic Eraser on the Google Pixel smartphone. Previously, I completed a Ph.D. under the supervision of Bill Freeman at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT, where I worked on ways to visualize tiny changes in images and videos by amplifying them.
Generative AI Publications
RealFill: Reference-Driven Generation for Authentic Image Completion
Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein
arXiv, 2023. [Paper] [Website] [Two Minute Papers]
Diffusion inpainting models hallucinate a plausible result when uncropping an image. In contrast, RealFill fills the image with what should have been there.

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman
arXiv, 2023. [Paper] [Website]
HyperDreamBooth personalizes a text-to-image diffusion model 25x faster than DreamBooth by combining a hypernetwork with fast finetuning.
Mobile Photography Publications
Defocus Map Estimation and Deblurring from a Single Dual-Pixel Image
Shumian Xin, Neal Wadhwa, Tianfan Xue, Jonathan T. Barron, Pratul Srinivasan, Jiawen Chen, Ioannis Gkioulekas, Rahul Garg
International Conference on Computer Vision (ICCV), 2021. [Paper] [Website]
Multiplane images can be used to simultaneously deblur dual-pixel images, despite variable defocus due to depth variation in the scene.

Du2Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels
Yinda Zhang, Neal Wadhwa, Sergio Orts-Escolano, Christian Häne, Sean Fanello, Rahul Garg
European Conference on Computer Vision (ECCV), 2020. [Paper]
Combining dual-camera and dual-pixel cues for depth estimation yields better depth than using either cue alone.

Learning to Autofocus
Charles Herrmann, Richard Strong Bowen, Neal Wadhwa, Rahul Garg, Qiurui He, Jonathan T. Barron, Ramin Zabih
Computer Vision and Pattern Recognition (CVPR), 2020. [Paper] [Website]
A learning-based approach to autofocus outperforms several learned and non-learned baselines.

Learning Single Camera Depth Estimation using Dual-Pixels
Rahul Garg, Neal Wadhwa, Sameer Ansari, Jonathan T. Barron
International Conference on Computer Vision (ICCV), 2019. [Paper] [GitHub]
Taking into account the optics of dual-pixel cameras can improve monocular depth estimation.

Wireless Software Synchronization of Multiple Distributed Cameras
Sameer Ansari, Neal Wadhwa, Rahul Garg, Jiawen Chen
International Conference on Computational Photography (ICCP), 2019. [Paper]
Smartphone cameras can be wirelessly synchronized to capture images within 250 microseconds of each other without extra hardware.
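
The paper's protocol is not reproduced on this page, so the sketch below only illustrates a generic building block for software clock alignment: an NTP-style round-trip exchange that estimates the offset between two device clocks from local send/receive times and a remote timestamp. The function names and the request/response interface are hypothetical.

```python
# Minimal sketch of NTP-style clock-offset estimation, a common ingredient in
# software camera synchronization. This is NOT the protocol from the ICCP 2019
# paper; `send_request` is a hypothetical callable that pings the remote phone
# and returns the remote timestamp taken while handling the ping.
import time


def estimate_offset(send_request, clock=time.monotonic):
    t0 = clock()               # local time when the request is sent
    t_remote = send_request()  # remote time when the request is handled
    t1 = clock()               # local time when the response arrives
    # Assuming a symmetric network delay, the remote timestamp corresponds to
    # roughly the midpoint of the round trip.
    offset = t_remote - (t0 + t1) / 2.0
    round_trip = t1 - t0
    return offset, round_trip


def best_offset(send_request, n=50):
    # Keeping the exchange with the smallest round trip reduces the error
    # introduced by asymmetric or variable network delays.
    samples = [estimate_offset(send_request) for _ in range(n)]
    offset, _ = min(samples, key=lambda s: s[1])
    return offset
```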

Synthetic Depth-of-Field with a Single-Camera Mobile Phone
Neal Wadhwa, Rahul Garg, David E. Jacobs, Bryan E. Feldman, Nori Kanazawa, Robert Carroll, Yair Movshovitz-Attias, Jonathan T. Barron, Yael Pritch, Marc Levoy
ACM Transactions on Graphics, Volume 37, Number 4 (Proc. SIGGRAPH), 2018. [Paper] [Supplemental] [Blog Post]
Depth from dual-pixels and semantic segmentation can be used to produce shallow depth-of-field images. Basis for Portrait Mode on the Google Pixel 2.
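
As a rough illustration of how a depth map can drive a shallow depth-of-field effect, here is a toy layered-blur renderer: pixels are blurred by an amount proportional to their distance from the chosen focus depth. It stands in for, and greatly simplifies, the paper's actual rendering; the inputs and parameter names are assumptions.

```python
# Toy synthetic depth-of-field: blur each pixel in proportion to how far its
# depth lies from the in-focus plane, using a small stack of Gaussian-blurred
# layers. This is an illustrative simplification, not the paper's renderer.
import numpy as np
from scipy.ndimage import gaussian_filter


def synthetic_bokeh(image, depth, focus_depth, max_sigma=8.0, n_layers=8):
    """image: HxWx3 floats in [0, 1]; depth: HxW floats (same units as focus_depth)."""
    defocus = np.abs(depth - focus_depth)
    sigma = max_sigma * defocus / (defocus.max() + 1e-8)  # per-pixel blur size

    out = np.zeros_like(image)
    weight = np.zeros(depth.shape)
    edges = np.linspace(0.0, sigma.max() + 1e-8, n_layers + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = ((sigma >= lo) & (sigma < hi)).astype(float)
        if mask.sum() == 0:
            continue
        s = 0.5 * (lo + hi)  # one blur size per layer
        blurred = np.stack(
            [gaussian_filter(image[..., c] * mask, s) for c in range(3)], axis=-1)
        out += blurred
        weight += gaussian_filter(mask, s)
    return out / np.maximum(weight[..., None], 1e-8)
```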

Aperture Supervision for Monocular Depth Estimation
Pratul P. Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan T. Barron
Computer Vision and Pattern Recognition (CVPR), 2018. [Paper]
Varying a camera's aperture provides a supervisory signal that can teach a neural network to do monocular depth estimation.

Bilateral Guided Upsampling
Jiawen Chen, Andrew Adams, Neal Wadhwa, Samuel W. Hasinoff
ACM Transactions on Graphics, Volume 35, Number 6 (Proc. SIGGRAPH Asia), 2016. [Paper] [Code] [Video]
Black-box image processing operators can be accelerated by fitting a fast bilateral-space model.
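
The core idea, fitting a cheap locally affine model to an expensive operator at low resolution and then applying that model at full resolution, can be sketched in a few lines. The version below is a deliberate simplification: it is grayscale, uses a spatial-only tile grid, and drops the bilateral (intensity) dimension of the actual method.

```python
# Simplified "guided upsampling" of a black-box operator: fit a per-tile affine
# model (out ~= a*in + b) between a low-res input and the operator's low-res
# output, smoothly upsample the coefficients, and apply them to the full-res
# input. Spatial-only and grayscale; the real method also bins by intensity.
import numpy as np
from scipy.ndimage import zoom


def guided_upsample(full_in, low_in, low_out, grid=16, eps=1e-4):
    """full_in: HxW full-res input; low_in/low_out: hxw low-res input/output,
    with h and w at least `grid` pixels."""
    h, w = low_in.shape
    a = np.zeros((grid, grid))
    b = np.zeros((grid, grid))
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    for i in range(grid):
        for j in range(grid):
            x = low_in[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            y = low_out[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].ravel()
            # Least-squares affine fit within the tile (eps regularizes flat tiles).
            a[i, j] = ((x - x.mean()) * (y - y.mean())).mean() / (x.var() + eps)
            b[i, j] = y.mean() - a[i, j] * x.mean()
    H, W = full_in.shape
    # Bilinearly interpolate the coefficient grids up to full resolution.
    a_full = zoom(a, (H / grid, W / grid), order=1)
    b_full = zoom(b, (H / grid, W / grid), order=1)
    return a_full * full_in + b_full
```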
Motion Magnification and Analysis Publications
Prospective Validation of Smartphone-based Heart Rate and Respiratory Rate Measurement Algorithms
Sean Bae, Silviu Borac, Yunus Emre, Jonathan Wang, Jiang Wu, Mehr Kashyap, Si-Hyuck Kang, Liwen Chen, Melissa Moran, John Cannon, Eric S. Teasley, Allen Chai, Yun Liu, Neal Wadhwa, Mike Krainin, Michael Rubinstein, Alejandra Maciel, Michael V. McConnell, Shwetak Patel, Greg S. Corrado, James A. Taylor, Jiening Zhan, Ming Jack Po
Nature Communications Medicine, 2022. [Paper]
Smartphone camera-based techniques can accurately measure heart rate and respiration rate across groups with different skin tones and respiratory health status.
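
The validated algorithms themselves are described in the paper rather than here, but the underlying measurement principle, camera-based photoplethysmography, can be sketched with a textbook baseline: average the green channel of a fingertip or skin video over time, band-pass the signal to plausible pulse frequencies, and read off the spectral peak. Everything below is that generic baseline, not the paper's method.

```python
# Generic camera-based heart-rate baseline (photoplethysmography), for
# illustration only: per-frame mean green intensity -> band-pass filter ->
# spectral peak in the physiological range (~42-180 beats per minute).
import numpy as np
from scipy.signal import butter, filtfilt


def heart_rate_bpm(frames, fps, low_hz=0.7, high_hz=3.0):
    """frames: sequence of HxWx3 video frames; fps: frames per second."""
    # One sample per frame: mean green-channel intensity over the frame.
    signal = np.array([np.asarray(f)[..., 1].mean() for f in frames], dtype=float)
    signal -= signal.mean()
    b, a = butter(3, [low_hz, high_hz], btype="bandpass", fs=fps)
    filtered = filtfilt(b, a, signal)
    # Dominant frequency within the pass band, converted to beats per minute.
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return freqs[band][np.argmax(spectrum[band])] * 60.0
```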

Motion Microscopy for Visualizing and Quantifying Small Motions
Neal Wadhwa, Justin G. Chen, Jonathan B. Sellon, Donglai Wei, Michael Rubinstein, Roozbeh Ghaffari, Dennis M. Freeman, Oral Buyukozturk, Pai Wang, Sijie Sun, Sung Hoon Kang, Katia Bertoldi, Frédo Durand, William T. Freeman
Proceedings of the National Academy of Sciences, October 2017. [Paper] [Webpage] [Code]
Motion magnification and analysis reveal phenomena in three different scientific domains.

Eulerian Video Magnification and Analysis
Neal Wadhwa, Hao-Yu Wu, Abe Davis, Michael Rubinstein, Eugene Shih, Gautham J. Mysore, Justin G. Chen, Oral Buyukozturk, John V. Guttag, William T. Freeman, Frédo Durand
Communications of the ACM, Volume 60, Number 1, January 2017. [Paper]
An expository paper summarizing the various motion magnification and subtle motion analysis techniques we have developed.
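
The linear (intensity-based) variant summarized in this article is simple enough to sketch directly: temporally band-pass every pixel and add an amplified copy of that band back into the video. A real implementation operates on a spatial pyramid and chooses the amplification per level; the per-pixel toy below, with assumed grayscale frames in [0, 1], only illustrates the "amplify temporal variations" idea.

```python
# Toy linear Eulerian video magnification: band-pass each pixel's intensity
# over time and add back an amplified copy of the filtered band.
import numpy as np
from scipy.signal import butter, filtfilt


def magnify(video, fps, low_hz, high_hz, alpha=20.0):
    """video: (T, H, W) float array of grayscale frames in [0, 1]."""
    b, a = butter(2, [low_hz, high_hz], btype="bandpass", fs=fps)
    band = filtfilt(b, a, video, axis=0)  # temporal filtering, per pixel
    return np.clip(video + alpha * band, 0.0, 1.0)
```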

Modal Identification of Simple Structures with High-Speed Video using Motion Magnification
Justin G. Chen, Neal Wadhwa, Young-Jin Cha, Frédo Durand, William T. Freeman, Oral Buyukozturk
Journal of Sound and Vibration, Volume 345, 2015. [Paper]
Motion magnification can be used to identify the mode shapes of simple structures such as pipes and beams.

Deviation Magnification: Revealing Departures from Ideal Geometries
Neal Wadhwa, Tali Dekel, Donglai Wei, Frédo Durand, William T. Freeman
ACM Transactions on Graphics, Volume 34, Number 6 (Proc. SIGGRAPH Asia), 2015. [Paper] [Webpage]
A method to reveal subtle deviations of objects from ideal geometries, such as lines or circles, in a single image by magnifying them.
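
A one-dimensional analogue conveys the idea: fit the ideal geometry (here a straight line) to sampled points and exaggerate each point's residual from that fit. The single-image, sub-pixel edge-localization machinery of the actual paper is not modeled in this sketch.

```python
# Toy deviation magnification in 1D: fit a line to points and amplify each
# point's departure from the fitted line.
import numpy as np


def magnify_deviation(x, y, alpha=10.0):
    """x, y: 1D arrays of points that lie approximately on a line."""
    m, c = np.polyfit(x, y, deg=1)      # least-squares line fit y ~= m*x + c
    ideal = m * x + c
    return ideal + alpha * (y - ideal)  # exaggerated deviations from the ideal
```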

Refraction Wiggles for Measuring Fluid Depth and Velocity from Video
Tianfan Xue, Michael Rubinstein, Neal Wadhwa, Anat Levin, Frédo Durand, William T. Freeman
European Conference on Computer Vision (ECCV), 2014. [Paper] [Webpage]
A method to recover the velocity and depth of hot air from video.

The Visual Microphone: Passive Recovery of Sound from Video
Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham Mysore, Frédo Durand, William T. Freeman
ACM Transactions on Graphics, Volume 33, Number 4 (Proc. SIGGRAPH), 2014. [Paper] [Webpage]
A technique to recover sound from videos of objects subtly vibrating in response to sound.
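
A heavily simplified version of the idea can be sketched as follows: track the tiny global displacement of the vibrating object from frame to frame and treat that displacement, sampled at the camera's frame rate, as an audio signal. Sub-pixel phase correlation stands in here for the paper's local-phase analysis, and a high-speed camera is assumed so the frame rate covers audible frequencies.

```python
# Rough "visual microphone" sketch: per-frame sub-pixel displacement relative
# to a reference frame, used as a 1D signal sampled at the camera frame rate.
# Phase correlation is a stand-in for the paper's local-phase measurements.
import numpy as np
from skimage.registration import phase_cross_correlation


def sound_from_video(frames, fps):
    """frames: (T, H, W) grayscale array from a high-frame-rate camera."""
    reference = frames[0]
    displacement = []
    for frame in frames:
        shift, _, _ = phase_cross_correlation(reference, frame, upsample_factor=100)
        displacement.append(shift[0])  # e.g. the vertical component
    signal = np.asarray(displacement)
    signal -= signal.mean()
    return signal, fps  # audio samples and their sample rate (one per frame)
```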

Structural Modal Identification through High Speed Camera Video: Motion Magnification
Justin G. Chen, Neal Wadhwa, Young-Jin Cha, Frédo Durand, William T. Freeman, Oral Buyukozturk
Proceedings of the 32nd International Modal Analysis Conference, 2014. [Paper]
A validation that the motions revealed by motion magnification are real, and a method for computing the mode shapes of a cantilevered beam from video.

Riesz Pyramids for Fast Phase-Based Video Magnification
Neal Wadhwa, Michael Rubinstein, Frédo Durand, William T. Freeman
IEEE International Conference on Computational Photography (ICCP), 2014. [Paper] [Webpage] [Pseudocode]
Matches the quality of our earlier phase-based technique while running in real time, like the original linear technique.

Phase-based Video Motion Processing
Neal Wadhwa, Michael Rubinstein, Frédo Durand, William T. Freeman
ACM Transactions on Graphics, Volume 32, Number 4 (Proc. SIGGRAPH), 2013. [Paper] [Webpage] [BibTeX]
A technique that amplifies small motions by manipulating the phase in sub-bands of a video, avoiding the noise amplification and intensity clipping artifacts of the earlier linear method.
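
A one-dimensional sketch shows the core mechanism, with the analytic signal (via the Hilbert transform) standing in for a complex steerable pyramid sub-band: for a roughly narrowband signal, a small translation appears as a phase shift, so amplifying the frame-to-frame phase change magnifies the motion without directly amplifying additive intensity noise.

```python
# 1D phase-based motion magnification sketch: amplify the phase change of each
# frame's analytic signal relative to the first frame, then resynthesize.
# The analytic signal replaces the complex steerable pyramid of the paper.
import numpy as np
from scipy.signal import hilbert


def magnify_1d(frames, alpha=10.0):
    """frames: (T, N) array; each row is one frame of a 1D narrowband signal."""
    analytic = hilbert(frames, axis=1)
    amplitude = np.abs(analytic)
    phase = np.angle(analytic)
    # Phase change relative to the first frame, wrapped to [-pi, pi].
    dphase = np.angle(np.exp(1j * (phase - phase[:1])))
    # Amplify the change and reconstruct the (real) magnified signal.
    magnified = amplitude * np.exp(1j * (phase[:1] + (1.0 + alpha) * dphase))
    return magnified.real
```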

Revealing Invisible Changes In The World
Michael Rubinstein, Neal Wadhwa, Frédo Durand, William T. Freeman
Science, Vol. 339, No. 6119, February 1, 2013. [Article in Science] [Video] [NSF SciVis 2012] [BibTeX]
An expository video showcasing our results and explaining our technique.