Yi Wei

I am a research engineer at Huawei, working on 3D vision and computer graphics. I obtained my Ph.D. degree at the Intelligent Vision Group (IVG), Department of Automation, Tsinghua University, advised by Prof. Jiwen Lu. Prior to that, I received my Bachelor's degree from the Department of Electronic Engineering, Tsinghua University in 2019 (Ranking 6/245). My research interests lie in 3D vision, with a focus on 3D scene understanding and 3D reconstruction. I hope my research can benefit industrial applications. I have also spent some time at DeePhi Tech (Xilinx), SenseTime, Microsoft Research Asia, XPeng, ByteDance, PhiGent Robotics, Gaussian Robotics and Apple.

We are currently recruiting doctoral and master's degree students who will graduate in 2025. If you are interested in 3D vision or computer graphics, please feel free to contact me.

Email / Google Scholar / Github / Twitter / Curriculum Vitae

News

Selected Publications

* indicates equal contribution

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation Chubin Zhang, Hongliang Song, Yi Wei, Yu Chen, Jiwen Lu, Yansong Tang Conference on Neural Information Processing Systems (NeurIPS), 2024 [Project page] [arXiv] [Code] We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments Chubin Zhang*, Juncheng Yan*, Yi Wei*, Jiaxin Li, Li Liu, Yansong Tang, Yueqi Duan, Jiwen Lu arXiv, 2023 [Project page] [arXiv] [Code] We propose an OccNeRF method for self-supervised multi-camera occupancy prediction, which adopts the parameterized occupancy fields, multi-frame photometric loss and open-vocabulary 2D segmentation.
Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior Fangfu Liu, Diankun Wu, Yi Wei, Yongming Rao, Yueqi Duan IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 [Project page] [arXiv] [Code] We propose Sherpa3D, a new text-to-3D framework that achieves high-fidelity, generalizability, and geometric consistency simultaneously.
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving Yi Wei*, Linqing Zhao*, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu IEEE International Conference on Computer Vision (ICCV), 2023 [Project page] [arXiv] [Code] We propose a SurroundOcc method to predict the volumetric occupancy with multi-camera images and generate dense occupancy ground truth with sparse LiDAR points.
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception Xiaofeng Wang*, Zheng Zhu*, Wenbo Xu*, Yunpeng Zhang, Yi Wei, Xu Chi, Yun Ye, Dalong Du, Jiwen Lu, Xingang Wang IEEE International Conference on Computer Vision (ICCV), 2023 [arXiv] [Code] Towards a comprehensive benchmarking of surrounding perception algorithms, we propose OpenOccupancy, which is the first surrounding semantic occupancy perception benchmark.
3D Point-Voxel Correlation Fields for Scene Flow Estimation Ziyi Wang*, Yi Wei*, Yongming Rao, Jie Zhou, Jiwen Lu IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023 [Paper] [Code] We propose Deformable PV-RAFT, where the Spatial Deformation deforms the voxelized neighborhood, and the Temporal Deformation controls the iterative update process.
Depth-Guided Optimization of Neural Radiance Fields for Indoor Multi-View Stereo Yi Wei, Shaohui Liu, Jie Zhou, Jiwen Lu IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI, IF: 24.31), 2023 [Paper] [Code] Beyond NerfingMVS, we further present NerfingMVS++, where a coarse-to-fine depth priors training strategy is proposed to directly utilize sparse SfM points and the uniform sampling is replaced by Gaussian sampling to boost the performance.
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jiwen Lu, Jie Zhou European Conference on Computer Vision (ECCV), 2022 [arXiv] [Code] [中文解读] We propose the LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.
SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation Yi Wei*, Linqing Zhao*, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie Zhou Conference on Robot Learning (CoRL), 2022 [Project page] [arXiv] [Code] [中文解读] We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict scale-aware depth maps across cameras.
NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui Liu, Yongming Rao, Wang Zhao, Jiwen Lu, Jie Zhou IEEE International Conference on Computer Vision (ICCV), 2021, Oral Presentation [Project page] [arXiv] [Code] [Video] [中文解读] We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors over the recently proposed neural radiance fields (NeRF).
A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo Wang Zhao*, Shaohui Liu*, Yi Wei, Hengkai Guo, Yong-jin Liu IEEE International Conference on Computer Vision (ICCV), 2021 [Project page] [arXiv] [Code] We propose a novel solver that iteratively solves for per-view depth map and normal map by optimizing an energy potential based on the locally planar assumption.
PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds Yi Wei*, Ziyi Wang*, Yongming Rao*, Jiwen Lu, Jie Zhou IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021 [arXiv] [Code] [Video] We present point-voxel correlation fields for 3D scene flow estimation, which migrate the high performance of RAFT and provide a solution to build structured all-pairs correlation fields for unstructured point clouds.
FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection Yi Wei, Shang Su, Jiwen Lu, Jie Zhou IEEE International Conference on Robotics and Automation (ICRA), 2021 [arXiv] [Code] [Video] We propose a weakly supervised 3D detection method that uses no 3D labels and consists of two stages: coarse 3D segmentation and 3D bounding box estimation.
Conditional Single-view Shape Generation for Multi-view Stereo Reconstruction Yi Wei*, Shaohui Liu*, Wang Zhao*, Jiwen Lu, Jie Zhou IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019 [Project] [arXiv] [Code] We present a new perspective on image-based shape generation. Single-view methods are sometimes insufficient to determine a single ground-truth shape because the back part is occluded; in contrast, our method leverages multi-view consistency for 3D reconstruction.
Quantization Mimic: Towards Very Tiny CNN for Object Detection Yi Wei, Xinyu Pan, Hongwei Qin, Junjie Yan European Conference on Computer Vision (ECCV), 2018 [arXiv] We propose a simple and general framework for training very tiny CNNs for object detection. Our method leverages the fact that mimic and quantization can facilitate each other.
Two-Stream Binocular Network: Accurate Near Field Finger Detection Based on Binocular Images Yi Wei, Guijin Wang, Cairong Zhang, Hengkai Guo, Xinghao Chen, Huazhong Yang IEEE Visual Communications and Image Processing (VCIP), 2017 (Best Student Paper Award) [arXiv] We propose the Two-Stream Binocular Network (TSBnet) to detect fingertips from binocular images. Unlike previous depth-based methods, we directly regress the 3D positions of fingertips from left and right images.

Experience

Apple, AI/ML Group, Research Intern. Topic: 3D AIGC
Gaussian Robotics, Gaussian-Tsinghua joint laboratory, Project leader. Topic: Sensor calibration, Drivable space detection, LiDAR-based 3D object detection, Depth estimation, 3D reconstruction
ByteDance, SLAM & 3D Vision Group, Engineer & Research Intern. Topic: Sky AR, Advertisement AR, Self-supervised depth estimation, Plane-assisted multi-view stereo, Multiple plane detection
XPeng, LiDAR Group, Engineer Intern. Topic: LiDAR-based 3D object detection, LiDAR-based model quantization
MSRA, Intelligent Multimedia Group, Research Intern. Topic: Multi-view hand pose estimation
SenseTime, Video Intelligence Group, Engineer & Research Intern. Topic: Model compression
DeePhi, Engineer Intern. Topic: Real-time object detection

Honors and Awards

Academic Services