Adaptive Bag-of-Visual Word Modelling using Stacked-Autoencoder and Particle Swarm Optimisation for the Unsupervised Categorisation of Images

Abstract

The Bag-of-Visual Words model has been recognised as an effective means of representing images for image classification. However, its reliance on a visual codebook developed using hand-crafted image feature extraction algorithms and vector quantisation via k-means clustering often results in significant computational overhead and poor classification accuracy. This paper therefore presents an adaptive Bag-of-Visual Words model in which image feature extraction is achieved using deep feature learning, and the computation required to develop the visual codebook is minimised using a batch implementation of Particle Swarm Optimisation. The proposed method is tested on the Caltech 101 image dataset, and the results confirm that it improves categorisation performance while reducing the computational load.
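
As an illustration of the representation the abstract refers to, the following minimal sketch shows how a set of local descriptors might be quantised against a visual codebook to give a Bag-of-Visual Words histogram. It is not the method proposed in the paper: the function names `nearest_word` and `bovw_histogram`, the two-dimensional toy descriptors, and the three-entry codebook are all assumptions made for the example. In practice the descriptors would come from a feature extractor (hand-crafted or learned) and the codebook from a clustering step such as k-means or a swarm-based optimiser.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bovw_histogram(descriptors, codebook):
    """Quantise local descriptors and return a normalised visual-word histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # An image with no descriptors keeps an all-zero histogram.
    return [c / total for c in counts] if total else counts


# Toy data (hypothetical): three visual words and four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bovw_histogram(descriptors, codebook)
print(histogram)
```

Each image is reduced to a fixed-length vector whose size equals the number of visual words, which is why the cost of building the codebook and the quality of the underlying descriptors both bear directly on categorisation performance.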

