PMVSNet (original) (raw)

Point-based Multi-view Stereo Network

Rui Chen1,3* Songfang Han2,3* Jing Xu1 Hao Su3

(*: indicates joint first authors)

1Tsinghua University 2The Hong Kong University of Science and Technology
3University of California, San Diego

Accepted by ICCV 2019.(Oral)

NEW [Aug, 2019] Paper gets accepted. Code and Data is released.

We introduce Point-MVSNet, a novel point-based deep framework for multi-view stereo (MVS). Distinct from existing cost volume approaches, our method directly processes the target scene as point clouds. More specifically, our method predicts the depth in a coarse-to-fine manner. We first generate a coarse depth map, convert it into a point cloud and refine the point cloud iteratively by estimating the residual between the depth of the current iteration and that of the ground truth. Our network leverages 3D geometry priors and 2D texture information jointly and effectively by fusing them into a feature-augmented point cloud, and processes the point cloud to estimate the 3D flow for each point. This point-based architecture allows higher accuracy, more computational efficiency and more flexibility than cost-volume-based counterparts. Experimental results show that our approach achieves a significant improvement in reconstruction quality compared with state-of-the-art methods on the DTU and the Tanks and Temples dataset.

We gratefully acknowledge the support of an NSF grant IIS-1764078, gifts from Qualcomm, Adobe and support from DMAI corporations.