Vincent Casser – Machine Learning Researcher, Software Engineer

I’m a Staff Research Scientist at Waymo, the autonomous driving company formerly known as the Google Self-Driving Car Project. At Waymo, I’m developing new technologies for autonomous vehicles in areas such as reconstructive sensor simulation (NeRF/3DGS), generative sensor simulation, sensor fusion, multi-task learning and foundation models. I have deployed numerous safety-critical models to Waymo’s fully autonomous vehicle fleet, which has served millions of trips to customers across various markets. A subset of my research is published at CVPR, ICCV, CoRL, IROS and ICRA, and I hold numerous international patents in the autonomous driving domain. I have organized the AV industry’s primary academic workshop at CVPR in 2022, 2023, 2024 and 2025.

I enjoy interdisciplinary work, and have broad experience in machine learning, deep learning and computer vision. Before joining Waymo, I pursued research in domains such as computational perception, aerial robotics and biomedical imaging. Some of my previous projects were related to the study of human memory (at MIT), machine learning applications in healthcare (with Massachusetts General Hospital), astronomy (with the Harvard-Smithsonian Center) and electron microscopy (with the Harvard Lichtman Lab).

News

02/26/2025	New paper at CVPR’25: “SceneCrafter: Controllable Multi-View Driving Scene Editing”
01/01/2025	I’m organizing the Workshop on Autonomous Driving at CVPR’25 in Nashville, TN
01/29/2024	New paper at ICRA’24: “LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection”
01/01/2024	I’m organizing the Workshop on Autonomous Driving at CVPR’24 in Seattle, WA
06/29/2023	Recordings of the CVPR WAD 2023 workshop are available now.
01/01/2023	I’m organizing the Workshop on Autonomous Driving at CVPR’23 in Vancouver, Canada
06/20/2022	New paper at IROS’22: “Instance Segmentation with Cross-Modal Consistency”
06/20/2022	Organized the Workshop on Autonomous Driving at CVPR’22
06/14/2022	Our Block-NeRF dataset is now available.
03/01/2022	New paper at CVPR’22: “Block-NeRF: Scalable Large Scene Neural View Synthesis” (oral presentation)
01/16/2022	New preprint: “GradTail: Learning Long-Tailed Data Using Gradient-based Sample Weighting”
07/22/2021	New paper at ICCV’21: “4D-Net for Learned Multi-Modal Alignment”
03/01/2021	New paper at CVPR’21: “Taskology: Utilizing Task Relations at Scale” (oral presentation)
10/14/2020	New paper at CoRL’20: “Unsupervised Monocular Depth Learning in Dynamic Scenes”
07/02/2020	New paper at ECCV’20: “Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability”
07/01/2020	New paper at UIST’20: “Predicting Visual Importance Across Graphic Design Types”
04/10/2020	New paper at Medical Imaging with Deep Learning (MIDL’20): “Fast Mitochondria Segmentation For Connectomics”
02/10/2020	I am co-organizing the 4D-VISION workshop at ECCV’20
01/22/2020	I co-organized the ComputeFest Transfer Learning workshop at Harvard
10/02/2019	New paper at SVRHM, NeurIPS’19: “To Decay or not to Decay: Modeling Video Memorability Over Time”
08/19/2019	I’m joining Waymo as a Research Scientist
05/30/2019	Graduated from Harvard University with a Master’s degree in Computational Science and Engineering
04/30/2019	New paper at Robotics: Science and Systems (RSS’19): “OIL: Observational Imitation Learning”
04/16/2019	New paper at VOCVALC, CVPR’19: “Unsupervised Monocular Depth and Ego-motion Learning with Structure and Semantics”
04/06/2019	New paper at UAVISION, CVPR’19: “Learning a Controller Fusion Network by Online Trajectory Filtering”
01/23/2019	Gave a workshop on “Convolutional Autoencoders for Image Manipulation” at ComputeFest 2019
11/28/2018	New project released: OIL: Observational Imitation Learning
11/27/2018	New blog post on our struct2depth work on Google’s AI blog
11/19/2018	The code for our struct2depth paper is now part of the TensorFlow models repository
11/01/2018	New paper at AAAI’19: “Depth Prediction Without The Sensors: Leveraging Structure For Unsupervised Learning From Monocular Videos”
10/06/2018	Joined the MIT Computational Perception & Cognition Lab, led by Aude Oliva
09/08/2018	We won the best paper award at UAVISION 2018
08/03/2018	We are presenting our work on autonomous drone racing on Sept 8 at UAVISION, ECCV’18
05/29/2018	Started internship in the Google Brain Robotics group
05/22/2018	Our new datasets for connectomics research are now publicly available: Kasthuri++ and Lucchi++
05/21/2018	Release of new tutorial for Bayesian GAN
02/08/2018	Started new project on Connectomics with the Visual Computing Group (VCG)
01/22/2018	Started new collaboration with the Center for Clinical Data Science (CCDS)
11/23/2017	New paper in IJCV: “Sim4CV: A Photo-Realistic Simulator for Computer Vision Applications” (full text)
10/21/2017	Official release of Sim4CV, our simulation environment for Computer Vision
09/01/2017	Started Master’s program in Computational Science and Engineering
08/19/2017	New paper at UAVISION, ECCV’18: “Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation”
05/24/2017	Recipient of German Academic Scholarship Foundation US-Scholarship (Studienstiftung)
03/28/2017	Recipient of DAAD Graduate Scholarship

Publications

SceneCrafter: Controllable Multi-View Driving Scene Editing

Zehao Zhu, Yuliang Zou, Chiyu “Max” Jiang, Bo Sun, Vincent Casser, Xiukun Huang, Jiahao Wang, Zhenpei Yang, Ruiqi Gao, Leonidas Guibas, Mingxing Tan, Dragomir Anguelov
Conference on Computer Vision and Pattern Recognition (CVPR’25).

LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection

Wayne Hung, Vincent Casser, Henrik Kretzschmar, Jyh-Jing Hwang, Dragomir Anguelov
IEEE International Conference on Robotics and Automation (ICRA’24). Full link

Block-NeRF: Scalable Neural Rendering

Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P. Srinivasan, Jon T. Barron, Henrik Kretzschmar
Conference on Computer Vision and Pattern Recognition (CVPR’22). Oral presentation. Full link

Instance Segmentation with Cross-Modal Consistency

Alex Zhu, Vincent Casser, Reza Mahjourian, Henrik Kretzschmar and Soeren Pirk
International Conference on Intelligent Robots and Systems (IROS’22) Full link

4D-Net for Learned Multi-Modal Alignment

AJ Piergiovanni, Vincent Casser, Michael Ryoo and Anelia Angelova
International Conference on Computer Vision (ICCV’21) Full link

Taskology: Utilizing Task Relations at Scale

Yao Lu, Soeren Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova and Ariel Gordon
Conference on Computer Vision and Pattern Recognition (CVPR’21). Oral presentation. Full link

Unsupervised Monocular Depth Learning in Dynamic Scenes

Hanhan Li, Ariel Gordon, Hang Zhao, Vincent Casser, Anelia Angelova
Conference on Robot Learning (CoRL’20) Full link

Multimodal Memorability: Modeling Effects of Semantics and Decay on Video Memorability

Camilo Fosco, Anelise Newman, Vincent Casser, Allen Lee, Barry McNamara and Aude Oliva
European Conference on Computer Vision (ECCV’20) Full link

Predicting Visual Importance Across Graphic Design Types

Camilo Fosco, Vincent Casser, Amish K. Bedi, Peter O’Donovan, Aaron Hertzmann and Zoya Bylinskii
ACM User Interface Software and Technology Symposium (UIST’20) Full link

Fast Mitochondria Segmentation for Connectomics

Vincent Casser, Kai Kang, Hanspeter Pfister and Daniel Haehn
Medical Imaging with Deep Learning (MIDL’20) Full link

Depth Prediction Without the Sensors

Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
Thirty-Third AAAI Conference on Artificial Intelligence (AAAI’19) Full link

OIL: Observational Imitation Learning

Guohao Li and Matthias Mueller, Vincent Casser, Neil Smith, Dominik Michels, Bernard Ghanem
Robotics: Science and Systems (RSS’19) Full link

Sim4CV: A Photo-Realistic Simulator for Computer Vision

Matthias Mueller, Vincent Casser, Jean Lahoud, Neil Smith, Bernard Ghanem
International Journal of Computer Vision (IJCV) Full link