Learnable Depth-Sensitive Attention for Deep RGB-D Saliency Detection with Multi-modal Fusion Architecture Search
References
- Achanta, R., Hemami, S., Estrada, F., & Susstrunk, S. (2009). Frequency-tuned salient region detection. In IEEE conference on computer vision and pattern recognition (pp. 1597–1604).
- Anandalingam, G., & Friesz, T. (1992). Hierarchical optimization: An introduction. Annals of Operations Research, 34, 1–11.
- Baker, B., Gupta, O., Naik, N., & Raskar, R. (2017). Designing neural network architectures using reinforcement learning. In International conference on learning representations.
- Bender, G., Kindermans, P., Zoph, B., Vasudevan, V., & Le, Q. V. (2018). Understanding and simplifying one-shot architecture search. In International conference on machine learning.
- Borji, A., Cheng, M. M., Jiang, H., & Li, J. (2015). Salient object detection: A benchmark. IEEE Transactions on Image Processing, 24(12), 5706–5722.
- Brock, A., Lim, T., Ritchie, J. M., & Weston, N. (2018). Smash: One-shot model architecture search through hypernetworks. In International conference on learning representations. arXiv:1708.05344.
- Cai, H., Chen, T., Zhang, W., Yu, Y., & Wang, J. (2018). Efficient architecture search by network transformation. In AAAI (Vol. 32).
- Chen, H., Deng, Y., Li, Y., Hung, T. Y., & Lin, G. (2020). Rgbd salient object detection via disentangled cross-modal fusion. IEEE Transactions on Image Processing, 29, 8407–8416.
- Chen, H., & Li, Y. (2018). Progressively complementarity-aware fusion network for RGB-D salient object detection. In IEEE conference on computer vision and pattern recognition (pp. 3051–3060).
- Chen, H., & Li, Y. (2019). Three-stream attention-aware network for RGB-D salient object detection. IEEE Transactions on Image Processing, 28, 2825–2835.
- Chen, H., Li, Y., & Su, D. (2019). Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection. Pattern Recognition, 86, 376–385.
- Chen, H., Li, Y., & Su, D. (2020). Discriminative cross-modal transfer learning and densely cross-level feedback fusion for RGB-D salient object detection. IEEE Transactions on Cybernetics, 50, 4808–4820.
- Chen, Q., Liu, Z., Zhang, Y., Fu, K., Zhao, Q., & Du, H. (2021). RGB-D salient object detection via 3d convolutional neural networks. In AAAI.
- Chen, S., & Fu, Y. (2020). Progressively guided alternate refinement network for RGB-D salient object detection. In European conference on computer vision.
- Cheng, Y., Fu, H., Wei, X., Xiao, J., & Cao, X. (2014). Depth enhanced saliency detection method. In ICIMCS (pp. 23–27).
- Chen, Y., Meng, G., Zhang, Q., Xiang, S., Huang, C., Mu, L., & Wang, X. (2018). Reinforced evolutionary neural architecture search. arXiv preprint arXiv:1808.00193.
- Chen, Z., Cong, R., Xu, Q., & Huang, Q. (2020). Dpanet: Depth potentiality-aware gated attention network for RGB-D salient object detection. IEEE Transactions on Image Processing, 30, 7012–7014.
- Ciptadi, A., Hermans, T., & Rehg, J. M. (2013). An in depth view of saliency. In British machine vision conference.
- Colson, B., Marcotte, P., & Savard, G. (2007). An overview of bilevel optimization. Annals of Operations Research, 153, 235–256.
- Desingh, K., Krishna, K. M., Rajan, D., & Jawahar, C. (2013). Depth really matters: Improving visual salient region detection with depth. In British machine vision conference (pp. 1–11).
- Fan, D. P., Cheng, M. M., Liu, Y., Li, T., & Borji, A. (2017). Structure-measure: A new way to evaluate foreground maps. In International conference on computer vision (pp. 4548–4557).
- Fan, D. P., Gong, C., Cao, Y., Ren, B., Cheng, M. M., & Borji, A. (2018). Enhanced-alignment measure for binary foreground map evaluation. In IJCAI.
- Fan, D. P., Lin, Z., Zhang, Z., Zhu, M., & Cheng, M. M. (2020). Rethinking RGB-D salient object detection: Models, data sets, and large-scale benchmarks. IEEE Transactions on Neural Networks and Learning Systems, 32, 2075–2089.
- Fan, D. P., Lin, Z., Zhao, J., Liu, Y., Zhang, Z., Hou, Q., et al. (2020). Rethinking RGB-D salient object detection: Models, datasets, and large-scale benchmarks. IEEE Transactions on Neural Networks and Learning Systems, 32, 2075–2089.
- Fan, D. P., Wang, W., Cheng, M. M., & Shen, J. (2019). Shifting more attention to video salient object detection. In IEEE conference on computer vision and pattern recognition (pp. 8554–8564).
- Fan, D. P., Zhai, Y., Borji, A., Yang, J., & Shao, L. (2020c). Bbs-net: RGB-D salient object detection with a bifurcated backbone strategy network. In European conference on computer vision.
- Fan, X., Liu, Z., & Sun, G. (2014). Salient region detection for stereoscopic images. In DSP (pp. 454–458).
- Feng, D., Barnes, N., You, S., & McCarthy, C. (2016). Local background enclosure for RGB-D salient object detection. In IEEE conference on computer vision and pattern recognition (pp. 2343–2350).
- Fu, K., Fan, D. P., Ji, G. P., & Zhao, Q. (2020). JL-DCF: Joint learning and densely-cooperative fusion framework for RGB-D salient object detection. In IEEE conference on computer vision and pattern recognition (pp. 3052–3062).
- Fu, K., Fan, D. P., Ji, G. P., Zhao, Q., Shen, J., & Zhu, C. (2021). Siamese network for RGB-D salient object detection and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Gao, S., Cheng, M. M., Zhao, K., Zhang, X. Y., Yang, M. H., & Torr, P. H. (2019). Res2net: A new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Gao, Y., Wang, M., Tao, D., Ji, R., & Dai, Q. (2012). 3-d object retrieval and recognition with hypergraph analysis. IEEE Transactions on Image Processing, 21, 4290–4303.
- Ghiasi, G., Lin, T. Y., Pang, R., & Le, Q. V. (2019). Nas-fpn: Learning scalable feature pyramid architecture for object detection. In IEEE conference on computer vision and pattern recognition (pp. 7029–7038).
- Guo, J., Ren, T., & Bei, J. (2016). Salient object detection for RGB-D image via saliency evolution. In IEEE international conference on multimedia and expo (pp. 1–6).
- He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
- Hong, S., You, T., Kwak, S., & Han, B. (2015). Online tracking by learning discriminative saliency map with convolutional neural network. In International conference on machine learning.
- Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In IEEE conference on computer vision and pattern recognition (pp. 7132–7141).
- Jang, E., Gu, S., & Poole, B. (2017). Categorical reparameterization with Gumbel-Softmax. In International conference on learning representations.
- Ji, W., Li, J., Zhang, M., Piao, Y., & Lu, H. (2020). Accurate RGB-D salient object detection via collaborative learning. In European conference on computer vision.
- Jin, W. D., Xu, J., Han, Q., Zhang, Y., & Cheng, M. M. (2021). Cdnet: Complementary depth network for RGB-D salient object detection. IEEE Transactions on Image Processing, 30, 3376–3390.
- Ju, R., Ge, L., Geng, W., Ren, T., & Wu, G. (2014). Depth saliency based on anisotropic center-surround difference. In IEEE international conference on image processing (pp. 1115–1119).
- Lang, C., Nguyen, T. V., Katti, H., Yadati, K., Kankanhalli, M., & Yan, S. (2012). Depth matters: Influence of depth cues on visual saliency. In European conference on computer vision.
- Li, C., Cong, R., Piao, Y., Xu, Q., & Loy, C. C. (2020a). RGB-D salient object detection with cross-modality modulation and selection. In European conference on computer vision.
- Li, G., Liu, Z., Chen, M., Bai, Z., Lin, W., & Ling, H. (2021). Hierarchical alternate interaction network for RGB-D salient object detection. IEEE Transactions on Image Processing, 30, 3528–3542.
- Li, G., Liu, Z., Ye, L., Wang, Y., & Ling, H. (2020b). Cross-modal weighting network for RGB-D salient object detection. In European conference on computer vision.
- Li, N., Ye, J., Ji, Y., Ling, H., & Yu, J. (2014). Saliency detection on light field. In IEEE conference on computer vision and pattern recognition (pp. 2806–2813).
- Lin, P. W., Sun, P., Cheng, G., Xie, S., Li, X., & Shi, J. (2020). Graph-guided architecture search for real-time semantic segmentation. In IEEE conference on computer vision and pattern recognition (pp. 4202–4211).
- Liu, C., Chen, L. C., Schroff, F., Adam, H., Hua, W., Yuille, A., & Fei-Fei, L. (2019a). Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation. In IEEE conference on computer vision and pattern recognition.
- Liu, C., Zoph, B., Neumann, M., Shlens, J., Hua, W., Li, L. J., Fei-Fei, L., Yuille, A., Huang, J., & Murphy, K. (2017). Progressive neural architecture search. In European conference on computer vision.
- Liu, G., & Fan, D. P. (2013). A model of visual attention for natural image retrieval. In 2013 international conference on information science and cloud computing companion (pp. 728–733).
- Liu, H., Simonyan, K., & Yang, Y. (2019b). Darts: Differentiable architecture search. In International conference on learning representations.
- Liu, N., Zhang, N., & Han, J. (2020a). Learning selective self-mutual attention for RGB-D saliency detection. In IEEE conference on computer vision and pattern recognition (pp. 13753–13762).
- Liu, N., Zhang, N., Shao, L., & Han, J. (2020b). Learning selective mutual attention and contrast for RGB-D saliency detection. arXiv preprint arXiv:2010.05537.
- Liu, Z., Shi, S., Duan, Q., Zhang, W., & Zhao, P. (2019). Salient object detection for RGB-D image by single stream recurrent convolution neural network. Neurocomputing, 363, 46–57.
- Mahadevan, V., & Vasconcelos, N. (2009). Saliency-based discriminant tracking. In IEEE conference on computer vision and pattern recognition (pp. 1007–1013).
- Nguyen, T. V., Zhao, Q., & Yan, S. (2018). Attentive systems: A survey. International Journal of Computer Vision, 126(1), 86–110.
- Nian, L., Ni, Z., Kaiyuan, W., Junwei, H., & Ling, S. (2021). Visual saliency transformer. arXiv preprint arXiv:2101.10241.
- Niu, Y., Geng, Y., Li, X., & Liu, F. (2012). Leveraging stereopsis for saliency analysis. In IEEE conference on computer vision and pattern recognition (pp. 454–461).
- Pang, Y., Zhang, L., Zhao, X., & Lu, H. (2020). Hierarchical dynamic filtering network for RGB-D salient object detection. In European conference on computer vision.
- Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., & Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems.
- Peng, H., Li, B., Xiong, W., Hu, W., & Ji, R. (2014). RGBD salient object detection: A benchmark and algorithms. In European conference on computer vision (pp. 92–109). Springer.
- Pérez-Rúa, J. M., Vielzeuf, V., Pateux, S., Baccouche, M., & Jurie, F. (2019). Mfas: Multimodal fusion architecture search. In IEEE Conference on computer vision and pattern recognition (pp. 6959–6968).
- Piao, Y., Ji, W., Li, J., Zhang, M., & Lu, H. (2019). Depth-induced multi-scale recurrent attention network for saliency detection. In International conference on computer vision (pp. 7254–7263).
- Qu, L., He, S., Zhang, J., Tian, J., Tang, Y., & Yang, Q. (2017). RGBD salient object detection via deep fusion. IEEE Transactions on Image Processing, 26, 2274–2285.
- Quan, R., Dong, X., Wu, Y., Zhu, L., & Yang, Y. (2019). Auto-reid: Searching for a part-aware convnet for person re-identification. In International conference on computer vision (pp. 3750–3759).
- Real, E., Aggarwal, A., Huang, Y., & Le, Q. V. (2019). Regularized evolution for image classifier architecture search. arXiv preprint arXiv:1802.01548.
- Ren, J., Gong, X., Yu, L., Zhou, W., & Ying Yang, M. (2015). Exploiting global priors for RGB-D saliency detection. In IEEE conference on computer vision and pattern recognition. Workshops.
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115, 211–252.
- Shigematsu, R., Feng, D., You, S., & Barnes, N. (2017). Learning RGB-D salient object detection using background enclosure, depth contrast, and top-down features. In IEEE conference on computer vision. Workshop (pp. 2749–2757).
- Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations. arXiv:1409.1556.
- Song, H., Liu, Z., Du, H., Sun, G., Meur, O. L., & Ren, T. (2017). Depth-aware salient object detection and segmentation via multiscale discriminative saliency fusion and bootstrap learning. IEEE Transactions on Image Processing, 26, 4204–4216.
- Sun, P., Zhang, W., Wang, H., Li, S., & Li, X. (2021). Deep RGB-D saliency detection with depth-sensitive attention and automatic multi-modal fusion. In IEEE conference on computer vision and pattern recognition.
- Wang, W., Shen, J., & Porikli, F. (2015). Saliency-aware geodesic video object segmentation. In IEEE conference on computer vision and pattern recognition (pp. 3395–3402).
- Xu, H., Yao, L., Li, Z., Liang, X., & Zhang, W. (2019). Auto-fpn: Automatic network architecture adaptation for object detection beyond classification. In IEEE conference on computer vision (pp. 6648–6657).
- Yu, Z., Cui, Y., Yu, J., Wang, M., Tao, D., & Tian, Q. (2020). Deep multimodal neural architecture search. In ACM international conference on multimedia.
- Zhang, J., Fan, D. P., Dai, Y., Yu, X., Zhong, Y., Barnes, N., & Shao, L. (2021). RGB-D saliency detection via cascaded mutual information minimization. In IEEE conference on computer vision (pp. 4338–4347).
- Zhang, M., Fei, S. X., Liu, J., Xu, S., Piao, Y., & Lu, H. (2020a). Asymmetric two-stream architecture for accurate RGB-D saliency detection. In European conference on computer vision.
- Zhang, M., Ren, W., Piao, Y., Rong, Z., & Lu, H. (2020b). Select, supplement and focus for RGB-D saliency detection. In IEEE conference on computer vision and pattern recognition (pp. 3469–3478).
- Zhao, J. X., Cao, Y., Fan, D. P., Cheng, M. M., Li, X. Y., & Zhang, L. (2019). Contrast prior and fluid pyramid integration for RGBD salient object detection. In IEEE conference on computer vision and pattern recognition.
- Zhao, R., Ouyang, W., & Wang, X. (2013). Unsupervised salience learning for person re-identification. In IEEE conference on computer vision and pattern recognition (pp. 3586–3593).
- Zhao, X., Zhang, L., Pang, Y., Lu, H., & Zhang, L. (2020). A single stream network for robust and real-time RGB-D salient object detection. In European conference on computer vision.
- Zhou, B., Khosla, A., Lapedriza, À., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In IEEE conference on computer vision and pattern recognition (pp. 2921–2929).
- Zhou, T., Fan, D. P., Cheng, M. M., Shen, J., & Shao, L. (2021). RGB-D salient object detection: A survey. Computational Visual Media, 7(1), 37–69.
- Zhu, C., Cai, X., Huang, K., Li, T. H., & Li, G. (2019). Pdnet: Prior-model guided depth-enhanced network for salient object detection. In International conference on multimedia and expo (pp. 199–204).
- Zhu, C., & Li, G. (2017). A three-pathway psychobiological framework of salient object detection using stereoscopic technology. In IEEE conference on computer vision and pattern recognition. Workshop (pp. 3008–3014).
- Zhu, C., Li, G., Wang, W., & Wang, R. (2017). An innovative salient object detection using center-dark channel prior. In IEEE conference on computer vision and pattern recognition (pp. 1509–1515).
- Zoph, B., & Le, Q. V. (2017). Neural architecture search with reinforcement learning. In International conference on learning representations.