GitHub - wangzy22/TAP: [ICCV 2023] Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

Created by Ziyi Wang*, Xumin Yu*, Yongming Rao, Jie Zhou, Jiwen Lu.

This repository is a PyTorch implementation of our ICCV 2023 paper TAP (short for Take-A-Photo).

TAP is a generative pre-training method for any point cloud model. Given point cloud features extracted by a backbone model, we generate view images from different instructed poses and compute a pixel-wise loss on the generated images as the pre-training objective. On ScanObjectNN classification and ShapeNetPart segmentation, TAP outperforms other generative pre-training methods based on mask modeling.
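To make the idea of a pixel-wise loss concrete, here is a minimal pure-Python sketch. It assumes a mean squared error over all pixels and channels, which this README does not specify; the loss used in the repository may differ and would operate on PyTorch tensors. The function name `pixelwise_mse` and the toy images are hypothetical.

```python
def pixelwise_mse(pred, target):
    """Mean squared error over all pixels and channels of two H x W x C images.

    Assumption: MSE is only one possible pixel-wise loss; it is used here for
    illustration and is not taken from the TAP code. Both images are assumed
    to have the same shape.
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for pred_px, target_px in zip(pred_row, target_row):
            for p, t in zip(pred_px, target_px):
                total += (p - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


# Toy 1 x 2 RGB images standing in for a generated view and its reference image.
generated = [[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]
reference = [[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]]
loss = pixelwise_mse(generated, reference)
```

In pre-training, a loss of this kind would be averaged over the views generated for each instructed pose.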

[arXiv][Project Page]

Preparation

Installation Prerequisites

conda create -n tap python=3.7 numpy=1.20 numba
conda activate tap
conda install -y pytorch=1.10.1 torchvision cudatoolkit=11.3 -c pytorch -c nvidia
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.1+cu113.html
pip install -r requirements_openpoints.txt
pip install open3d einops opencv-python
pip install timm==0.5.4

cd openpoints/cpp/pointnet2_batch
python setup.py install
cd ../pointops
python setup.py install
cd ../subsampling
python setup.py install
cd ../../..
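After installation, a quick import check can catch a missing package before a long training run. The sketch below only looks up whether each module can be found; it does not verify versions or CUDA support. The import names (for example `torch_scatter` and `cv2`) are the usual ones for the packages installed above, and the helper `find_missing` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "torch_scatter",
    "open3d", "einops", "cv2", "timm",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```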

Data Preparation

ShapeNet Dataset

ScanObjectNN Dataset

ShapeNetPart Dataset

Data File Structure

TAP/
|-- data/
    |-- ShapeNet55/
        |-- poinyclouds/
            |-- train/
                |-- model_000003.ply
                |-- ...
            |-- val/
                |-- model_000009.ply
                |-- ...
            |-- test/
                |-- model_000001.ply
                |-- ...
        |-- shapenet55v1/
            |-- train/
                |-- model_000003_001.jpg
                |-- ...
            |-- val/
                |-- model_000009_001.jpg
                |-- ...
            |-- test/
                |-- model_000001_001.jpg
                |-- ...
    |-- ScanObjectNN/
        |-- main_split/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
            |-- training_objectdataset_augmentedrot_scale75.h5
            |-- test_objectdataset_augmentedrot_scale75.h5
        |-- main_split_nobg/
            |-- training_objectdataset.h5
            |-- test_objectdataset.h5
    |-- ShapeNetPart/
        |-- shapenetcore_partanno_segmentation_benchmark_v0_normal/
            |-- 02691156/
                |-- 1a04e3eab45ca15dd86060f189eb133.txt
                |-- ...
            |-- ...
            |-- train_test_split/
            |-- synsetoffset2category.txt
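The layout above can be checked with a short script before training. This is a sketch under assumptions: the nesting of the folders is read from the listing, `poinyclouds` is spelled exactly as it appears there, and the helper `missing_paths` is hypothetical rather than part of the repository.

```python
from pathlib import Path

# Paths relative to TAP/data/, as read from the listing above.
EXPECTED_PATHS = [
    "ShapeNet55/poinyclouds/train",
    "ShapeNet55/poinyclouds/val",
    "ShapeNet55/poinyclouds/test",
    "ShapeNet55/shapenet55v1/train",
    "ShapeNet55/shapenet55v1/val",
    "ShapeNet55/shapenet55v1/test",
    "ScanObjectNN/main_split/training_objectdataset.h5",
    "ScanObjectNN/main_split/test_objectdataset.h5",
    "ScanObjectNN/main_split_nobg/training_objectdataset.h5",
    "ScanObjectNN/main_split_nobg/test_objectdataset.h5",
    "ShapeNetPart/shapenetcore_partanno_segmentation_benchmark_v0_normal",
]


def missing_paths(data_root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths("data"):
        print("missing:", rel)
```

Run it from the repository root; no output means every listed path was found.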

Usage

Pre-train on ShapeNet Dataset

python examples/classification/main.py --cfg cfgs/shapenet/BASEMODEL_pretrain.yaml

For example, to pre-train the PointMLP model, replace BASEMODEL with pointmlp:

python examples/classification/main.py --cfg cfgs/shapenet/pointmlp_pretrain.yaml
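The BASEMODEL substitution can also be scripted, for instance when launching runs from another tool. The sketch below only builds the argument list; the helper `pretrain_command` is hypothetical, and only `pointmlp` is confirmed as a config name by this README.

```python
PRETRAIN_TEMPLATE = "cfgs/shapenet/BASEMODEL_pretrain.yaml"


def pretrain_command(base_model):
    """Build the pre-training command for a backbone by filling in BASEMODEL."""
    cfg = PRETRAIN_TEMPLATE.replace("BASEMODEL", base_model)
    return ["python", "examples/classification/main.py", "--cfg", cfg]


command = pretrain_command("pointmlp")
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root.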

Finetune on Downstream Tasks

First, modify pretrained_path in the finetune config. Then run the following command:

python examples/classification/main.py --cfg cfgs/DATASET/BASEMODEL_finetune.yaml

For example, to finetune the PointMLP model on the hardest ScanObjectNN split, replace DATASET with scanobjectnn and BASEMODEL with pointmlp:

python examples/classification/main.py --cfg cfgs/scanobjectnn/pointmlp_finetune.yaml
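The pretrained_path edit mentioned above can be done by hand or scripted. The sketch below rewrites the first `pretrained_path:` line in a config's text. It assumes the key sits on its own line in the YAML file, which this README does not confirm; the sample config content, the checkpoint path, and the helper `set_pretrained_path` are all hypothetical.

```python
import re

# Matches a line of the form "<indent>pretrained_path: <value>".
_PATTERN = re.compile(r"^([ \t]*pretrained_path[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_pretrained_path(config_text, checkpoint):
    """Replace the value of the first pretrained_path entry, keeping its indentation."""
    new_text, count = _PATTERN.subn(
        lambda match: match.group(1) + checkpoint, config_text, count=1
    )
    if count == 0:
        raise KeyError("no pretrained_path entry found in config text")
    return new_text


# Hypothetical config fragment; real finetune configs will contain more keys.
sample = "model:\n  name: pointmlp\npretrained_path: null\n"
updated = set_pretrained_path(sample, "checkpoints/pointmlp_pretrain.pth")
```

A text-level replacement leaves comments and key order in the file untouched, which a load-and-dump round trip through a YAML library might not.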

Results

Quantitative Results

Classification on ScanObjectNN

| Point Model | TAP Pre-trained | OBJ_BG | OBJ_ONLY | PB_T50_RS |
| --- | --- | --- | --- | --- |
| DGCNN | [Google / Tsinghua Cloud] | -- | -- | 86.6 [Google / Tsinghua Cloud] |
| PointNet++ | [Google / Tsinghua Cloud] | -- | -- | 86.8 [Google / Tsinghua Cloud] |
| PointMLP | [Google / Tsinghua Cloud] | -- | -- | 88.5 [Google / Tsinghua Cloud] |
| Transformer | [Google / Tsinghua Cloud] | 90.4 [Google / Tsinghua Cloud] | 89.5 [Google / Tsinghua Cloud] | 85.7 [Google / Tsinghua Cloud] |

Part Segmentation on ShapeNetPart

| Point Model | TAP Pre-trained | mIoU_C / mIoU_I |
| --- | --- | --- |
| PointMLP | [Google / Tsinghua Cloud] | 85.2 / 86.9 [Google / Tsinghua Cloud] |

Citation

If you find our work useful in your research, please consider citing:

@article{wang2023tap,
title={Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models},
author={Wang, Ziyi and Yu, Xumin and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
journal={arXiv preprint arXiv:2307.14971},
year={2023}
}

Acknowledgements

Our code is inspired by PointNeXt.