Time-constrained adversarial attacks for video recognition models: temporally sparse but effective perturbations

References

Kumar, R., Ranjan, P., Jung, K.-H., et al.: Leveraging rans for synchronized high capacity reversible data hiding in encrypted image. Expert Syst. Appl. 267, 126181 (2025)
Meena, P., Singla, P., Ranjan, P.: Enhanced phishing url detection through stacked machine learning model, in: 2024 International Conference on Intelligent Systems for Cybersecurity (ISCS), IEEE, pp. 1–6 (2024)
Ankur, Kumar, R., Sharma, A.K., Ranjan, P.: High-capacity reversible data hiding in encrypted images based on difference image transfiguration. Signal, Image Video Process. 19(5), 365 (2025)
Ranjan, P., Kaushal, A., Girdhar, A., Kumar, R.: Revolutionizing hyperspectral image classification for limited labeled data: unifying autoencoder-enhanced gans with convolutional neural networks and zero-shot learning. Earth Sci. Inf. 18(2), 216 (2025)
Ranjan, P., Kumar, R., Girdhar, A.: Recent cnn advancements for stratification of hyperspectral images, in: 2023 6th International conference on information systems and computer networks (ISCON), IEEE, pp. 1–5 (2023)
Ranjan, P., Girdhar, A.: A comparison of deep learning algorithms dealing with limited samples in hyperspectral image classification, in: 2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development (OTCON), IEEE, pp. 1–6 (2023)
Ranjan, P., Kumar, R., Girdhar, A.: Unlocking the potential of unlabeled data: semi-supervised learning for stratification of hyperspectral images, in: 2023 OITS International Conference on Information Technology (OCIT), IEEE, pp. 938–943 (2023)
Ranjan, P., Kumar, R., Jung, K.-H.: Exploring cutting edge of ai in hyperspectral image classification, 대한전자공학회 학술대회 2525–2530 (2024)
Ranjan, P., Girdhar, A.: Deep siamese network with handcrafted feature extraction for hyperspectral image classification. Multimed. Tools Appl. 83(1), 2501–2526 (2024)
Tang, Y., Bi, J., Xu, S., Song, L., Liang, S., Wang, T., Zhang, D., An, J., Lin, J., Zhu, R., et al.: Video understanding with large language models: A survey, IEEE Transactions on Circuits and Systems for Video Technology (2025)
Fontes, C., Hohma, E., Corrigan, C.C., Lütge, C.: Ai-powered public surveillance systems: why we (might) need them and how we want them. Technol. Soc. 71, 102137 (2022)
Ghosh, I., Ramasamy Ramamurthy, S., Chakma, A., Roy, N.: Sports analytics review: Artificial intelligence applications, emerging technologies, and algorithmic perspective. Wiley Interdiscip. Rev.: Data Min. Knowl. Discov. 13(5), e1496 (2023)
Kosch, T., Karolus, J., Zagermann, J., Reiterer, H., Schmidt, A., Woźniak, P.W.: A survey on measuring cognitive workload in human-computer interaction. ACM Comput. Surv. 55(13s), 1–39 (2023)
Chada, S.K., Görges, D., Ebert, A., Teutsch, R., Subramanya, S.P.: Evaluation of the driving performance and user acceptance of a predictive eco-driving assistance system for electric vehicles. Transp. Res. Part C: Emerg. Technol. 153, 104193 (2023)
Padmakala, S., Al-Fatlawy, R.R., Madhavan, S., Anuradha, K., et al.: Video captioning using inflated 3d convolution network encoder with decoder for video content, in: 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON), IEEE, pp. 1–4 (2024)
Garcia-Granada, A., Lacarac, V., Smith, D., Pavier, M., Cook, R., Holdway, P.: 3d residual stresses around cold expanded holes in a new creep resistant aluminium alloy, WIT Transactions on Engineering Sciences 25 (2025)
Yang, Z., Wang, J., Ye, X., Tang, Y., Chen, K., Zhao, H., Torr, P.H.: Language-aware vision transformer for referring segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 47, 5238–5255 (2025)
Macas, M., Wu, C., Fuertes, W.: Adversarial examples: A survey of attacks and defenses in deep learning-enabled cybersecurity systems. Expert Syst. Appl. 238, 122223 (2024)
Qi, X., Huang, K., Panda, A., Henderson, P., Wang, M., Mittal, P.: Visual adversarial examples jailbreak aligned large language models. In: Proceedings of the AAAI conference on artificial intelligence 38, 21527–21536 (2024)
Roshan, K., Zafar, A., Haque, S.B.U.: Untargeted white-box adversarial attack with heuristic defence methods in real-time deep learning based network intrusion detection system. Comput. Commun. 218, 97–113 (2024)
Wang, Z., Yang, H., Feng, Y., Sun, P., Guo, H., Zhang, Z., Ren, K.: Towards transferable targeted adversarial examples, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 20534–20543 (2023)
Pan, T., Song, Y., Yang, T., Jiang, W., Liu, W.: Videomoco: Contrastive video representation learning with temporally adversarial examples, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11205–11214 (2021)
Wei, Z., Chen, J., Wu, Z., Jiang, Y.-G.: Boosting the transferability of video adversarial examples via temporal translation. In: Proceedings of the AAAI conference on artificial intelligence 36, 2659–2667 (2022)
Li, P., Zhang, Y., Yuan, L., Zhao, J., Xu, X., Zhang, X.: Adversarial attacks on video object segmentation with hard region discovery. IEEE Trans. Circuits Syst. Video Technol. 34(6), 5049–5062 (2023)
Ke, Z., Sun, C., Zhu, L., Xu, K., Lau, R.W.: Harmonizer: Learning to perform white-box image and video harmonization, in: European conference on computer vision, Springer, pp. 690–706 (2022)
Li, S., Aich, A., Zhu, S., Asif, S., Song, C., Roy-Chowdhury, A., Krishnamurthy, S.: Adversarial attacks on black box video classifiers: Leveraging the power of geometric transformations. Adv. Neural. Inf. Process. Syst. 34, 2085–2096 (2021)
Wei, Z., Chen, J., Wu, Z., Jiang, Y.-G.: Adaptive cross-modal transferable adversarial attacks from images to videos. IEEE Trans. Pattern Anal. Mach. Intell. 46(5), 3772–3783 (2023)
Zach, C., Pock, T., Bischof, H.: A duality based approach for realtime tv-l1 optical flow, in: DAGM Symposium on Pattern Recognition, Vol. 4713 of LNCS, Springer, pp. 214–223 (2007)
Patil, R., Mane, S.B.: User-centric video summarization using i3d and attention mechanisms, in: 2025 Global Conference in Emerging Technology (GINOTECH), IEEE, pp. 1–5 (2025)
Kataoka, H., Wakamiya, T., Hara, K., Satoh, Y.: Would mega-scale datasets further enhance spatiotemporal 3d cnns?, arXiv:2004.04968 (2020)
Carreira, J., Zisserman, A.: Quo vadis, action recognition? a new model and the kinetics dataset, in: CVPR, pp. 6299–6308 (2017)
Kay, W., Carreira, J., Simonyan, K., Zhang, B., Hillier, C., Vijayanarasimhan, S., Viola, F., Green, T., Back, T., Natsev, P., Suleyman, M., Zisserman, A.: The kinetics human action video dataset, arXiv:1705.06950 (2017)
Shah, A.A., Malik, H.A.M., Muhammad, A., Alourani, A., Butt, Z.A.: Deep learning ensemble 2d cnn approach towards the detection of lung cancer. Sci. Rep. 13(1), 2987 (2023)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks, arXiv preprint arXiv:1312.6199 (2013)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples, in: International Conference on Learning Representations (ICLR) (2015)
Kurakin, A., Goodfellow, I., Bengio, S.: Adversarial examples in the physical world, arXiv:1607.02533 (2017)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks, in: International Conference on Learning Representations (ICLR) (2018)
Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks, in: 2017 IEEE Symposium on Security and Privacy (SP), IEEE, pp. 39–57 (2017)
Moosavi-Dezfooli, S.-M., Fawzi, A., Frossard, P.: Deepfool: A simple and accurate method to fool deep neural networks, in: CVPR, pp. 2574–2582 (2016)
Ilyas, A., Engstrom, L., Athalye, A., Lin, J.: Black-box adversarial attacks with limited queries and information, in: Proceedings of the 35th International Conference on Machine Learning (ICML), PMLR, pp. 2137–2146 (2018)
Chen, P.-Y., Zhang, H., Sharma, Y., Yi, J., Hsieh, C.-J.: Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models, in: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec), ACM (2017). https://doi.org/10.1145/3128572.3140448
Brendel, W., Rauber, J., Bethge, M.: Decision-based adversarial attacks: Reliable attacks against black-box machine learning models, in: International Conference on Learning Representations (ICLR) (2018)
Doshi, K., et al.: Semantic video transformer for robust action recognition, in: DSC (2023)
Lin, Q., Xie, W., et al.: Boosting adversarial transferability across model genus by decorrelation of weight activation, in: AAAI (2024)
Li, P., et al.: Temporal consistency constrained transferable adversarial attacks for video recognition, in: IJCAI (2025)
Gao, Z., et al.: Retome-va: Recursive token merging for video diffusion-based unrestricted adversarial attack, in: OpenReview (2024)
Dai, X., et al.: Diffusion models as strong adversaries. IEEE Trans. Image Process. 33, 6734–6747 (2024)
Zeng, Y., et al.: Advi2i: Adversarial image attack on image-to-image diffusion models, arXiv:2410.21471 (2024)
Yang, Y., et al.: Mma-diffusion: Multimodal attack on diffusion models, in: CVPR (2024)
Kumar, X., et al.: Adversarial training for multimodal large language models, arXiv:2503.04833 (2025)
Wu, C.H., Koh, J.Y., Salakhutdinov, R., Fried, D., Raghunathan, A.: Adversarial attacks on multimodal agents, arXiv e-prints, arXiv–2406 (2024)
Kapoor, S., et al.: Adversarial attacks in multimodal systems: A practitioner’s survey, in: OpenReview (2025)
Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., Schmidhuber, J.: Natural evolution strategies. J. Mach. Learn. Res. 15(1), 949–980 (2014)
Imambi, S., Prakash, K.B., Kanagachidambaresan, G.: Pytorch, in: Programming with TensorFlow: solution for edge computing applications, Springer, pp. 87–104 (2021)
Setiadi, D.R.I.M.: Psnr vs ssim: imperceptibility quality assessment for image steganography. Multimed. Tools Appl. 80(6), 8423–8444 (2021)
Bakurov, I., Buzzelli, M., Schettini, R., Castelli, M., Vanneschi, L.: Structural similarity index (ssim) revisited: A data-driven approach. Expert Syst. Appl. 189, 116087 (2022)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric, in: CVPR, pp. 586–595 (2018)
Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., Paluri, M.: A closer look at spatiotemporal convolutions for action recognition, in: CVPR, pp. 6450–6459 (2018)
Guo, C., Frank, J., Jung, S., Weinberger, K.Q.: Simple black-box adversarial attacks, in: Proceedings of the 36th International Conference on Machine Learning (ICML) (2019)
Andriushchenko, M., Croce, F., Flammarion, N., Hein, M.: Square attack: A query-efficient black-box adversarial attack via random search, in: ECCV, pp. 484–501 (2020). arXiv:1912.00059
Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild, in: CVPR (2009)
