Frequently Asked Questions · Issue #109 · IDEA-Research/detrex
We keep this issue open to collect frequently asked questions and their solutions from users.
Feel free to leave a comment here if you run into a recurring issue and know how to help others solve it.
Notes
- If you run into convergence problems when training with fewer GPUs, it is better to use a larger batch size (8 or 16) by setting dataloader.train.total_batch_size, as mentioned in this issue: Convergence problem on coco with less gpus. #219
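As a rough illustration of why the GPU count matters here: the total batch size is shared across all GPUs, so the same dataloader.train.total_batch_size gives each GPU more images when fewer GPUs are used. The sketch below only shows that arithmetic. per_gpu_batch_size is a hypothetical helper, not part of detrex, and the requirement that the total be divisible by the GPU count is an assumption about how the data loader splits the batch.

```python
def per_gpu_batch_size(total_batch_size: int, num_gpus: int) -> int:
    """Return how many images each GPU would process per iteration.

    Hypothetical helper for illustration only; assumes the total batch
    is split evenly across GPUs.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch_size <= 0 or total_batch_size % num_gpus != 0:
        raise ValueError("total_batch_size must be a positive multiple of num_gpus")
    return total_batch_size // num_gpus


if __name__ == "__main__":
    # Same total batch size, different GPU counts.
    for gpus in (8, 4, 2):
        print(f"{gpus} GPUs -> {per_gpu_batch_size(16, gpus)} images per GPU")
```

The value itself can be set in your config file (dataloader.train.total_batch_size = 16) or appended as a key=value override to the training command.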
FAQs
1. ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available.
detrex needs the CUDA runtime to build the MultiScaleDeformableAttention operator. In most cases you do not need to set the CUDA_HOME environment variable yourself if CUDA is installed correctly; the default path of the CUDA runtime is /usr/local/cuda. If you find that your CUDA_HOME is None, you can solve it as follows:
- If you've already installed the CUDA runtime in your environment, specify the environment variable (here we take cuda-11.3 as an example):
    export CUDA_HOME=/path/to/cuda-11.3/
- If you cannot find the CUDA runtime in your environment, install it by following the CUDA Toolkit Installation guide, then specify the environment variable CUDA_HOME.
- After setting CUDA_HOME, rebuild detrex by running:
    pip install -e .
You can also refer to these issues for more details: #98, #85
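Before rebuilding, it can help to check where a CUDA toolkit is actually visible on the machine. The following is a minimal diagnostic sketch using only the Python standard library. find_cuda_home is a hypothetical helper, and its lookup order (the environment variable, then nvcc on PATH, then /usr/local/cuda) is an assumption for illustration rather than the exact logic used by the build.

```python
import os
import shutil
from typing import Optional


def find_cuda_home(default: str = "/usr/local/cuda") -> Optional[str]:
    """Best-effort lookup of the CUDA toolkit root; returns None if not found."""
    # 1. An explicitly exported CUDA_HOME wins.
    value = os.environ.get("CUDA_HOME")
    if value:
        return value
    # 2. Otherwise infer the root from nvcc on PATH: <root>/bin/nvcc -> <root>.
    nvcc = shutil.which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
    # 3. Fall back to the conventional install location, if it exists.
    if os.path.isdir(default):
        return default
    return None


if __name__ == "__main__":
    home = find_cuda_home()
    if home is None:
        print("CUDA toolkit not found; install it or export CUDA_HOME.")
    else:
        print(f"CUDA toolkit root: {home}")
```

If this prints a path, exporting that path as CUDA_HOME before running the rebuild is a reasonable first thing to try.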
2. How to not filter empty annotations during training.
There are three ways to avoid filtering out empty annotations during training:
- modify the config in configs/common/data/coco_detr.py as follows:
    dataloader.train = L(build_detection_train_loader)(
        dataset=L(get_detection_dataset_dicts)(names="coco_2017_train", filter_empty=False),
        ...,
    )
- modify the dataloader config in your project config, such as dino_r50_4scale_24ep.py:
    # your config.py
    dataloader = get_config("common/data/coco_detr.py").dataloader

    # modify dataloader config
    # do not filter empty annotations during training
    dataloader.train.dataset.filter_empty = False
- override the config on the command line when launching training:
    cd detrex
    python tools/train_net.py --config-file projects/dino/configs/path/to/config.py --num-gpus 8 dataloader.train.dataset.filter_empty=False
You can also refer to these issues for more details: #78 (comment)
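For intuition about what this flag changes, the sketch below mimics the effect on a tiny, made-up list of dataset records: with filtering enabled, images without annotations are dropped, and with filter_empty=False they stay in the training set. This is a simplified stand-in, not the actual detectron2 implementation. select_training_records and the record contents are hypothetical, and treating "empty" as "no entries in the annotations list" is an assumption.

```python
from typing import Any, Dict, List


def select_training_records(
    records: List[Dict[str, Any]], filter_empty: bool = True
) -> List[Dict[str, Any]]:
    """Return the records that would be kept for training.

    Simplified illustration: a record counts as empty when its
    "annotations" list is missing or has no entries.
    """
    if not filter_empty:
        # Keep everything, including images without annotations.
        return list(records)
    return [r for r in records if len(r.get("annotations", [])) > 0]


if __name__ == "__main__":
    # Made-up records for illustration.
    records = [
        {"file_name": "a.jpg", "annotations": [{"bbox": [0, 0, 10, 10], "category_id": 0}]},
        {"file_name": "b.jpg", "annotations": []},
    ]
    print(len(select_training_records(records, filter_empty=True)))
    print(len(select_training_records(records, filter_empty=False)))
```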
3. RuntimeError: The server socket has failed to listen on any local network address. The server socket has failed to bind to [::]:54980 (errno: 98 - Address already in use).
This means that a process you started earlier did not exit cleanly and is still holding the port. There are two solutions:
- kill the earlier process completely
- change the port by setting --dist-url:
    python tools/train_net.py \
        --config-file path/to/config.py \
        --num-gpus 8 \
        --dist-url tcp://127.0.0.1:12345
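Rather than guessing a port number, you can ask the operating system for one that is currently free. The snippet below is a small standard-library sketch of that idea. find_free_port and make_dist_url are hypothetical helper names, not part of detrex. Note that the port is released before training starts, so another process could in principle grab it in between.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


def make_dist_url(host: str = "127.0.0.1") -> str:
    """Build a value suitable for passing to --dist-url."""
    return f"tcp://{host}:{find_free_port(host)}"


if __name__ == "__main__":
    print(make_dist_url())
```

The printed value can be passed directly as the --dist-url argument in the command above.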
4. DINO CPU inference
Please refer to this PR #157 for more details.
5. Training coco-like custom dataset
Please refer to this PR #186 for more details.