Transformed KL divergence and dual-branch sampler for UAV image object detection
References
Sun, Y., Chen, Y., Peng, W., Wang, X., Wang, Q.: Drl: dynamic rebalance learning for adversarial robustness of uav with long-tailed distribution. Comput. Commun. 205, 14–23 (2023)
Liu, C., Gao, G., Huang, Z., Hu, Z., Liu, Q., Wang, Y.: Yolc: You only look clusters for tiny object detection in aerial images. IEEE Transactions on Intelligent Transportation Systems, (2024)
Umirzakova, S., Muksimova, S., Mardieva, S., Baxtiyarovich, M.S., Cho, Y.-I.: Mira-cap: Memory-integrated retrieval-augmented captioning for state-of-the-art image and video captioning. Sensors 24(24), 8013 (2024)
Li, C., Yang, T., Zhu, S., Chen, C., Guan, S.: Density map guided object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 190–191, (2020)
Koyun, O.C., Keser, R.K., Akkaya, I.B., Töreyin, B.U.: Focus-and-detect: A small object detection framework for aerial images. Signal Processing: Image Communication 104, 116675 (2022)
Ma, C., Fu, Y., Wang, D., Guo, R., Zhao, X., Fang, J.: Yolo-uav: Object detection method of unmanned aerial vehicle imagery based on efficient multi-scale feature fusion. IEEE Access, (2023)
Chalavadi, V., Jeripothula, P., Datla, R., Sobhan Babu, Ch., et al.: msodanet: A network for multi-scale object detection in aerial images using hierarchical dilated convolutions. Pattern Recognition 126, 108548 (2022)
Wang, J., Xu, C., Yang, W., Yu, L.: A normalized gaussian wasserstein distance for tiny object detection. arXiv preprint arXiv:2110.13389, (2021)
Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., Tang, J., Yang, J.: Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection. Adv. Neural. Inf. Process. Syst. 33, 21002–21012 (2020)
Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2016)
Cai, Z., Vasconcelos, N.: Cascade r-cnn: Delving into high quality object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6154–6162, (2018)
Zhang, H., Chang, H., Ma, B., Wang, N., Chen, X.: Dynamic r-cnn: Towards high quality object detection via dynamic training. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16, pages 260–275. Springer, (2020)
Jocher, G.: Ultralytics yolov5, (2020)
Wang, C-Y., Bochkovskiy, A., Liao, H-Y.M.: Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7464–7475, (2023)
Reis, D., Kupec, J., Hong, J., Daoudi, A.: Real-time flying object detection with yolov8. arXiv preprint arXiv:2305.09972, (2023)
Wang, C-Y., Yeh, I-H., Liao, H-Y.M.: Yolov9: Learning what you want to learn using programmable gradient information. In: European Conference on Computer Vision, pages 1–21. Springer, (2025)
Khanam, R., Hussain, M.: Yolov11: An overview of the key architectural enhancements. arXiv preprint arXiv:2410.17725, (2024)
Wang, K., Liew, J.H., Zou, Y., Zhou, D., Feng, J.: Panet: Few-shot image semantic segmentation with prototype alignment. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9197–9206, (2019)
Li, Y., Wang, T., Kang, B., Tang, S., Wang, C., Li, J., Feng, J.: Overcoming classifier imbalance for long-tail object detection with balanced group softmax. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10991–11000, (2020)
Wang, T., Li, Y., Kang, B., Li, J., Liew, J., Tang, S., Hoi, S., Feng, J.: The devil is in classification: A simple framework for long-tail instance segmentation. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV 16, pages 728–744. Springer, (2020)
Zhou, B., Cui, Q., Wei, X-S., Chen, Z-M.: Bbn: Bilateral-branch network with cumulative learning for long-tailed visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9719–9728, (2020)
Cui, Y., Jia, M., Lin, T-Y., Song, Y., Belongie, S.: Class-balanced loss based on effective number of samples. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9268–9277, (2019)
Lin, T-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pages 2980–2988, (2017)
Li, X., Lv, C., Wang, W., Li, G., Yang, L., Yang, J.: Generalized focal loss: towards efficient representation learning for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 45(3), 3139–3153 (2022)
Tan, J., Wang, C., Li, B., Li, Q., Ouyang, W., Yin, C., Yan, J.: Equalization loss for long-tailed object recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11662–11671, (2020)
Tan, J., Lu, X., Zhang, G., Yin, C., Li, Q.: Equalization loss v2: A new gradient balance approach for long-tailed object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1685–1694, (2021)
Wang, J., Zhang, W., Zang, Y., Cao, Y., Pang, J., Gong, T., Chen, K., Liu, Z., Loy, C.C., Lin, D.: Seesaw loss for long-tailed instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9695–9704, (2021)
Wang, T., Zhu, Y., Zhao, C., Zeng, W., Wang, J., Tang, M.: Adaptive class suppression loss for long-tail object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3103–3112, (2021)
Ren, J., Yu, C., Ma, X., Zhao, H., Yi, S., et al.: Balanced meta-softmax for long-tailed visual recognition. Adv. Neural. Inf. Process. Syst. 33, 4175–4186 (2020)
Feng, C., Zhong, Y., Huang, W.: Exploring classification equilibrium in long-tailed object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3417–3426, (2021)
Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: A metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 658–666, (2019)
Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., Ren, D.: Distance-iou loss: faster and better learning for bounding box regression. In: Proceedings of the AAAI Conference on Artificial Intelligence 34, 12993–13000 (2020)
Zhang, Y.-F., Ren, W., Zhang, Z., Jia, Z., Wang, L., Tan, T.: Focal and efficient iou loss for accurate bounding box regression. Neurocomputing 506, 146–157 (2022)
Gevorgyan, Z.: Siou loss: More powerful learning for bounding box regression. arXiv preprint arXiv:2205.12740, (2022)
Zhang, H., Zhang, S.: Shape-iou: More accurate metric considering bounding box shape and scale. arXiv preprint arXiv:2312.17663, (2023)
Tong, Z., Chen, Y., Xu, Z., Yu, R.: Wise-iou: bounding box regression loss with dynamic focusing mechanism. arXiv preprint arXiv:2301.10051, (2023)
Yang, X., Yan, J., Ming, Q., Wang, W., Zhang, X., Tian, Q.: Rethinking rotated object detection with gaussian wasserstein distance loss. In: International Conference on Machine Learning, pages 11830–11841. PMLR, (2021)
Du, D., Qi, Y., Yu, H., Yang, Y., Duan, K., Li, G., Zhang, W., Huang, Q., Tian, Q.: The unmanned aerial vehicle benchmark: Object detection and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pages 370–386, (2018)
Yang, C., Huang, Z., Wang, N.: Querydet: Cascaded sparse query for accelerating high-resolution small object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13668–13677, (2022)
Jiang, L., Yuan, B., Du, J., Chen, B., Xie, H., Tian, J., Yuan, Z.: Mffsodnet: Multi-scale feature fusion small object detection network for uav aerial images. IEEE Transactions on Instrumentation and Measurement, (2024)
Fan, Q., Li, Y., Deveci, M., Zhong, K., Kadry, S.: Lud-yolo: a novel lightweight object detection network for unmanned aerial vehicle. Inf. Sci. 686, 121366 (2025)
Zhang, Y., Kang, B., Hooi, B., Yan, S., Feng, J.: Deep long-tailed learning: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 45(9), 10795–10816 (2023)
Jocher, G., Qiu, J.: Ultralytics yolo11, (2024)
Mao, G.T., Deng, T.M., Yu, N.J.: Object detection in uav images based on multi-scale split attention. Acta Aeronaut. Astronaut. Sin. 43(12), 326738 (2022)
Ge, Z.: Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430, (2021)
Tan, S., Duan, Z., Pu, L.: Multi-scale object detection in uav images based on adaptive feature fusion. PLoS One 19(3), e0300120 (2024)
Anggraini, N., Ramadhani, S.H., Wardhani, L.K., Hakiem, N., Shofi, I.M., Tabah Rosyadi, M.: Development of face mask detection using ssdlite mobilenetv3 small on raspberry pi 4. In: 2022 5th International Conference of Computer and Informatics Engineering (IC2IE), pages 209–214. IEEE, (2022)
Dai, Y., Zhao, P., Wang, Y.: Maturity discrimination of tobacco leaves for tobacco harvesting robots based on a multi-scale branch attention neural network. Comput. Electron. Agric. 224, 109133 (2024)
Wu, Y., Tang, Y., Yang, T.: An improved nighttime people and vehicle detection algorithm based on yolo v7. In: 2023 3rd International Conference on Neural Networks, Information and Communication Engineering (NNICE), pages 266–270. IEEE, (2023)
Xuan, H., Yang, B., Li, X.: Exploring the impact of temperature scaling in softmax for classification and adversarial robustness. arXiv preprint arXiv:2502.20604, (2025)