Transformer Tracker Based on Multi-level Residual Perception Structure

References

  1. Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.: Fully-Convolutional Siamese Networks for Object Tracking, pp. 850–865. Springer (2016)
  2. Li, B., Yan, J., Wu, W., Zhu, Z., Hu, X.: High Performance Visual Tracking with Siamese Region Proposal Network, pp. 8971–8980 (2018)
  3. Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., Yan, J.: SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks, pp. 4282–4291 (2019)
  4. Xu, Y., Wang, Z., Li, Z., Yuan, Y., Yu, G.: SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines, pp. 12549–12556 (2020)
  5. Chen, X., Yan, B., Zhu, J., Wang, D., Yang, X., Lu, H.: Transformer Tracking, pp. 8126–8135 (2021)
  6. Yan, B., Peng, H., Fu, J., Wang, D., Lu, H.: Learning Spatio-Temporal Transformer for Visual Tracking, pp. 10448–10457 (2021)
  7. Chen, B., et al.: Backbone is All You Need: A Simplified Architecture for Visual Object Tracking, pp. 375–392. Springer (2022)
  8. Ye, B., Chang, H., Ma, B., Shan, S., Chen, X.: Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework, pp. 341–357. Springer (2022)
  9. Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
  10. Yuan, L., et al.: Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, pp. 558–567 (2021)
  11. Yue, X., et al.: Vision Transformer with Progressive Sampling, pp. 387–396 (2021)
  12. Hatamizadeh, A., et al.: FasterViT: Fast Vision Transformers with Hierarchical Attention (2023)
  13. Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DETR: deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159 (2020)
  14. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-End Object Detection with Transformers, pp. 213–229. Springer (2020)
  15. Lv, W., et al.: DETRs Beat YOLOs on Real-Time Object Detection (2023)
  16. Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., Luo, P.: SegFormer: simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 34, 12077–12090 (2021)
  17. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
  18. Lin, L., Fan, H., Zhang, Z., Xu, Y., Ling, H.: SwinTrack: a simple and strong baseline for transformer tracking. Adv. Neural Inf. Process. Syst. 35, 16743–16754 (2022)
  19. Liu, Z., et al.: Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows, pp. 10012–10022 (2021)
  20. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
  21. Guo, D., Wang, J., Cui, Y., Wang, Z., Chen, S.: SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking, pp. 6269–6277 (2020)
  22. Chen, Z., et al.: SiamBAN: target-aware tracking with Siamese box adaptive network. IEEE Trans. Pattern Anal. Mach. Intell. (2022)
  23. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition, pp. 770–778 (2016)
  24. Xing, D., Evangeliou, N., Tsoukalas, A., Tzes, A.: Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, pp. 2139–2148 (2022)
  25. Law, H., Deng, J.: CornerNet: Detecting Objects as Paired Keypoints, pp. 734–750 (2018)
  26. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal Loss for Dense Object Detection, pp. 2980–2988 (2017)
  27. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression, pp. 658–666 (2019)
  28. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: A Large-Scale Hierarchical Image Database, pp. 248–255. IEEE (2009)
  29. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)
  30. Huang, L., Zhao, X., Huang, K.: GOT-10k: a large high-diversity benchmark for generic object tracking in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 43(5), 1562–1577 (2019)
  31. Muller, M., Bibi, A., Giancola, S., Alsubaihi, S., Ghanem, B.: TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild, pp. 300–317 (2018)
  32. Fan, H., et al.: LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking, pp. 5374–5383 (2019)
  33. Lin, T.-Y., et al.: Microsoft COCO: Common Objects in Context, pp. 740–755. Springer (2014)
  34. Mueller, M., Smith, N., Ghanem, B.: A Benchmark and Simulator for UAV Tracking, pp. 445–461. Springer (2016)
  35. Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ATOM: Accurate Tracking by Overlap Maximization, pp. 4660–4669 (2019)
  36. Bhat, G., Danelljan, M., Gool, L.V., Timofte, R.: Learning Discriminative Model Prediction for Tracking, pp. 6182–6191 (2019)
  37. Wang, N., Zhou, W., Wang, J., Li, H.: Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking, pp. 1571–1580 (2021)
  38. Cao, Z., Fu, C., Ye, J., Li, B., Li, Y.: SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking, pp. 3086–3092. IEEE (2021)
  39. Zheng, G., Fu, C., Ye, J., Li, B., Lu, G., Pan, J.: Scale-aware Siamese object tracking for vision-based UAM approaching. IEEE Trans. Ind. Inform. (2022)
  40. Wang, Q., Zhang, L., Bertinetto, L., Hu, W., Torr, P.H.: Fast Online Object Tracking and Segmentation: A Unifying Approach, pp. 1328–1338 (2019)
