GitHub - salesforce/PB-OVD: A PyTorch implementation of Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

Introduction

This is the official PyTorch implementation of Open Vocabulary Object Detection with Pseudo Bounding-Box Labels.

Environment

UBUNTU="18.04"
CUDA="11.0"
CUDNN="8"
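
Before installing, it can help to compare the local machine against the versions listed above. The following is a minimal sketch, not part of the repository: the `report` helper and the `EXPECTED_*` variables are hypothetical names, and it assumes `/etc/os-release` and `nvcc` are the right places to read the versions from.

```shell
#!/usr/bin/env sh
# Sketch: compare the local machine against the versions listed above.
EXPECTED_UBUNTU="18.04"
EXPECTED_CUDA="11.0"

# report NAME EXPECTED ACTUAL: hypothetical helper that prints one status line.
report() {
  if [ "$3" = "$2" ]; then
    echo "$1: $3 (matches)"
  else
    echo "$1: found '${3:-unknown}', README lists $2"
  fi
}

# Ubuntu release from /etc/os-release, if the file is readable.
ubuntu_version=""
if [ -r /etc/os-release ]; then
  ubuntu_version="$(. /etc/os-release && echo "${VERSION_ID:-}")"
fi

# CUDA toolkit release from nvcc, if it is on PATH.
cuda_version=""
if command -v nvcc >/dev/null 2>&1; then
  cuda_version="$(nvcc --version | sed -n 's/.*release \([0-9.]*\).*/\1/p')"
fi

report "Ubuntu" "$EXPECTED_UBUNTU" "$ubuntu_version"
report "CUDA" "$EXPECTED_CUDA" "$cuda_version"
```

A mismatch is only reported, not treated as an error, since other versions may also work.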

Installation

conda create --name ovd

conda activate ovd

cd $INSTALL_DIR

bash ovd_install.sh

git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

cd ../
cuda_dir="maskrcnn_benchmark/csrc/cuda"
perl -i -pe 's/AT_CHECK/TORCH_CHECK/' $cuda_dir/deform_pool_cuda.cu $cuda_dir/deform_conv_cuda.cu
python setup.py build develop

Data Preparation

Inference

python -m torch.distributed.launch --nproc_per_node=8 tools/test_net.py \
--config-file configs/eval.yaml \
MODEL.WEIGHT $PATH_TO_FINAL_MODEL \
OUTPUT_DIR $OUTPUT_DIR
python evaluate_lvis_official.py --coco_anno_path datasets/lvis_v0.5_val_all_clipemb.json \
--result_dir $OUTPUT_DIR/inference/lvis_v0.5_val_all_cocostyle/
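
The two commands above rely on `$PATH_TO_FINAL_MODEL` and `$OUTPUT_DIR` being set, and the second one reads from a directory under the first one's `OUTPUT_DIR`. A small guard like the following can catch an unset placeholder before any GPUs are launched. This is a sketch: `require_set` is a hypothetical helper and the two paths are example values only.

```shell
#!/usr/bin/env sh
# require_set NAME VALUE: hypothetical helper that fails when a placeholder is empty.
require_set() {
  if [ -z "$2" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
  echo "$1=$2"
}

# Example values only; substitute your own checkpoint and output directory.
PATH_TO_FINAL_MODEL="/path/to/final_model.pth"
OUTPUT_DIR="./output/eval"

require_set PATH_TO_FINAL_MODEL "$PATH_TO_FINAL_MODEL"
require_set OUTPUT_DIR "$OUTPUT_DIR"

# Directory that the evaluation command above reads results from.
result_dir="$OUTPUT_DIR/inference/lvis_v0.5_val_all_cocostyle/"
echo "evaluation will read: $result_dir"
```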

Pretrain with Pseudo Labels

python -m torch.distributed.launch --nproc_per_node=16 tools/train_net.py  --distributed \
--config-file configs/pretrain_1m.yaml \
OUTPUT_DIR $OUTPUT_DIR

Finetune

python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py  --distributed \
--config-file configs/finetune.yaml \
MODEL.WEIGHT $PATH_TO_PRETRAIN_MODEL \
OUTPUT_DIR $OUTPUT_DIR
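
Finetuning consumes the checkpoint produced by pretraining through `MODEL.WEIGHT`. The dry run below prints both commands with that checkpoint chained between them, without launching anything. It is a sketch under assumptions: the directory names are hypothetical, `train_cmd` is a helper introduced here, and the checkpoint file name `model_final.pth` is an assumption about what the pretraining run writes.

```shell
#!/usr/bin/env sh
# Dry run: print the pretrain and finetune commands with the checkpoint chained.
PRETRAIN_DIR="./output/pretrain"     # hypothetical directory
FINETUNE_DIR="./output/finetune"     # hypothetical directory
# Assumption: the pretrain run leaves a checkpoint with this name in its OUTPUT_DIR.
PATH_TO_PRETRAIN_MODEL="$PRETRAIN_DIR/model_final.pth"

# train_cmd NUM_PROCS CONFIG [KEY VALUE ...]: hypothetical helper that
# formats one training command line instead of executing it.
train_cmd() {
  num_procs="$1"; config="$2"; shift 2
  echo "python -m torch.distributed.launch --nproc_per_node=$num_procs tools/train_net.py --distributed --config-file $config $*"
}

pretrain_line="$(train_cmd 16 configs/pretrain_1m.yaml OUTPUT_DIR "$PRETRAIN_DIR")"
finetune_line="$(train_cmd 8 configs/finetune.yaml MODEL.WEIGHT "$PATH_TO_PRETRAIN_MODEL" OUTPUT_DIR "$FINETUNE_DIR")"
echo "$pretrain_line"
echo "$finetune_line"
```

Check the actual checkpoint name in the pretraining output directory before copying the printed finetune command.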

Generate Your Own Pseudo Box Labels

Installation

conda create --name gen_plabels

conda activate gen_plabels

bash gen_plabel_install.sh

Preparation

Generate Pseudo Labels

python pseudo_bbox_generation.py
python prepare_coco_dataset.py
# pip install git+https://github.com/openai/CLIP.git

python prepare_clip_embedding_for_open_vocab.py
python visualize_coco_style_dataset.py
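
The commented `pip install` line above suggests that the CLIP package must be installed before the embedding step runs. The following sketch checks whether the `clip` module is importable and prints the install command otherwise. `have_module` and the `PYTHON` variable are hypothetical names introduced here, and the sketch assumes `python3` is the interpreter to probe.

```shell
#!/usr/bin/env sh
# Interpreter to probe; defaults to python3, override with PYTHON=python inside a conda env.
PYTHON="${PYTHON:-python3}"

# have_module NAME: hypothetical helper, succeeds when NAME can be imported.
have_module() {
  "$PYTHON" -c "import $1" >/dev/null 2>&1
}

if have_module clip; then
  echo "clip is importable"
else
  echo "clip not found; install with: pip install git+https://github.com/openai/CLIP.git"
fi
```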

Citation

@article{gao2021towards,
  title={Open Vocabulary Object Detection with Pseudo Bounding-Box Labels},
  author={Gao, Mingfei and Xing, Chen and Niebles, Juan Carlos and Li, Junnan and Xu, Ran and Liu, Wenhao and Xiong, Caiming},
  journal={arXiv preprint arXiv:2111.09452},
  year={2021}
}

Contact

Notes