Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition

Deep compression of convolutional neural networks with low-rank approximation

ETRI Journal

The application of deep neural networks (DNNs) to connect the world with cyber-physical systems (CPSs) has attracted much attention. However, DNNs require a large amount of memory and computation, which hinders their use in the relatively low-end smart devices that are widely used in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated on low-end smart devices. To do this, we develop a method that reduces the memory requirement of DNNs and increases the inference speed while keeping performance (for example, accuracy) close to the original level. The parameters of DNNs are decomposed using a hybrid of canonical polyadic and singular value decomposition, approximated using a tensor power method, and fine-tuned with an iterative one-shot hybrid fine-tuning scheme to recover the accuracy lost during decomposition. We evaluate our method on frequently used networks and present extensive experiments on the effects of several fine-tuning methods, the importance of iterative fine-tuning, and the choice of decomposition technique. We demonstrate the effectiveness of the proposed method by deploying the compressed networks on smartphones.
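The core mechanism described above is low-rank factorization of layer parameters followed by fine-tuning. As a minimal illustrative sketch (not the paper's hybrid CP/SVD tensor-power method), the snippet below, assuming PyTorch, replaces a dense layer with two thinner layers obtained from a truncated SVD:

```python
import torch
import torch.nn as nn

def lowrank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate a dense layer W (out x in) with two thin layers of rank r.

    Parameter count drops from out*in to r*(out + in); the same idea extends
    to convolutions by reshaping the 4-D kernel into a matrix first.
    """
    W = linear.weight.data                  # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]            # absorb singular values into U
    V_r = Vh[:rank, :]

    first = nn.Linear(linear.in_features, rank, bias=False)
    second = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    first.weight.data.copy_(V_r)
    second.weight.data.copy_(U_r)
    if linear.bias is not None:
        second.bias.data.copy_(linear.bias.data)
    return nn.Sequential(first, second)

# Example: compress a 1024x512 layer to rank 64, then fine-tune to recover accuracy.
layer = nn.Linear(512, 1024)
compressed = lowrank_factorize(layer, rank=64)
```

After such a replacement, the fine-tuning stage mentioned in the abstract retrains the factorized layers to close the accuracy gap.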

Deep Neural Network Compression for Image Classification and Object Detection

2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)

Neural networks are notorious for being computationally expensive, mainly because they are often over-parameterized and likely contain redundant nodes or layers as they grow deeper and wider. Their demand for hardware resources prohibits their extensive use in embedded devices and puts restrictions on tasks such as real-time image classification or object detection. In this work, we propose a network-agnostic model compression method infused with a novel dynamical clustering approach to reduce the computational cost and memory footprint of deep neural networks. We evaluated the new compression method on five state-of-the-art image classification and object detection networks. In classification networks, we pruned about 95% of the network parameters. In advanced detection networks such as YOLOv3, the proposed compression method reduced the model parameters by up to 59.70%, which yielded 110× less memory without sacrificing much accuracy.
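As a rough illustration of clustering-based compression in general (not the paper's dynamical clustering algorithm), the hypothetical sketch below, assuming NumPy and scikit-learn, shares weights by snapping them to k-means centroids so that only a small codebook plus low-bit indices need to be stored:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_weights(weights: np.ndarray, n_clusters: int = 16):
    """Replace each weight by its cluster centroid (weight sharing).

    Storage drops to one small codebook plus a low-bit index per weight;
    indices can additionally be pruned/zeroed for further savings.
    """
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
    codebook = km.cluster_centers_.ravel()          # (n_clusters,)
    indices = km.labels_.astype(np.uint8)           # low-bit index per weight
    shared = codebook[indices].reshape(weights.shape)
    return shared, codebook, indices

# Example: a 4-bit codebook for one convolutional kernel.
kernel = np.random.randn(64, 32, 3, 3).astype(np.float32)
shared_kernel, codebook, idx = cluster_weights(kernel, n_clusters=16)
```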

A Framework for Pedestrian Attribute Recognition Using Deep Learning

Applied Sciences

The pedestrian attribute recognition task is becoming more popular by the day because of its significant role in surveillance scenarios. With rapid technological advances, deep learning has come to the forefront of computer vision, and previous works have applied it in different ways to recognize pedestrian attributes. The results are satisfactory, but there is still scope for improvement. Transfer learning is becoming increasingly popular because of its effectiveness in reducing computation cost and coping with data scarcity. This paper proposes a framework that can work in surveillance scenarios to recognize pedestrian attributes. A Mask R-CNN object detector extracts the pedestrians. Additionally, we apply transfer learning to different CNN architectures, i.e., Inception ResNet v2, Xception, ResNet 101 v2, and ResNet 152 v2. The main contribution of this paper is fine-tuning the ResNet 152 v2 architecture, which is performed by...
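The transfer-learning recipe sketched above (ImageNet-pretrained backbone, new attribute head, fine-tuning) can be illustrated as follows. This is a hypothetical PyTorch analogue using torchvision's ResNet-152 rather than the Keras ResNet 152 v2 named in the abstract, with a sigmoid multi-label head for pedestrian attributes; the attribute count is dataset-dependent:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_ATTRIBUTES = 35  # e.g., gender, clothing, accessories (dataset-dependent)

# Start from an ImageNet-pretrained backbone (torchvision >= 0.13 weights API).
backbone = models.resnet152(weights="IMAGENET1K_V1")

# Freeze the pretrained layers and replace the classifier with an attribute head.
for param in backbone.parameters():
    param.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_ATTRIBUTES)  # trainable by default

criterion = nn.BCEWithLogitsLoss()           # one sigmoid per attribute
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)

def training_step(images, attribute_targets):
    logits = backbone(images)                # (batch, NUM_ATTRIBUTES)
    loss = criterion(logits, attribute_targets.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Unfreezing deeper blocks at a lower learning rate is the usual next step when fine-tuning the whole architecture rather than only the head.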

A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions

ArXiv, 2020

Deep Neural Networks (DNNs) have gained unprecedented performance due to their automated feature extraction capability. This high performance has led to significant incorporation of DNN models in different Internet of Things (IoT) applications over the past decade. However, the colossal computation, energy, and storage requirements of DNN models make their deployment prohibitive on resource-constrained IoT devices. Therefore, several compression techniques have been proposed in recent years to reduce the storage and computation requirements of DNN models. Each of these techniques approaches DNN compression from a different perspective while aiming for minimal accuracy compromise, which encourages us to provide a comprehensive overview of DNN compression techniques. In this paper, we present a comprehensive review of the existing literature on compressing DNN models to reduce both storage and computation requirements. We divide the existing approaches into five broad categories, i.e., ne...

Resource-Restricted Environments Based Memory-Efficient Compressed Convolutional Neural Network Model for Image-Level Object Classification

IEEE Access

In the past decade, Convolutional Neural Networks (CNNs) have achieved tremendous success in solving complex classification problems, but CNN architectures require an excessive number of computations to achieve high accuracy. These models also incur heavy storage and energy costs, which prohibits the application of CNNs to resource-constrained edge devices. Hence, developing aggressive optimization schemes for the efficient deployment of CNNs on edge devices has become an important requirement. To this end, we present a resource-limited-environment-based, memory-efficient network compression model for image-level object classification. The main aim is to compress the CNN architecture to achieve low computational cost and memory requirements without degrading the system's accuracy. To achieve this goal, we propose a network compression strategy that works in a collaborative manner: Soft Filter Pruning is first applied to reduce the computational cost of the model. In the next step, the model is divided into No-Pruning Layers (NP-Layers) and Pruning Layers (P-Layers). Incremental Quantization is applied to P-Layers due to their irregular weight distribution, while for NP-Layers we propose a novel Optimized Quantization algorithm that quantizes weights to the optimal levels obtained from the Optimizer. This scheme is designed to achieve the best trade-off between the compression ratio and the accuracy of the model. The proposed system is validated for image-level object classification on the LeNet-5, CIFAR-quick, and VGG-16 networks using the MNIST, CIFAR-10, and ImageNet ILSVRC2012 datasets, respectively. We achieve a high compression ratio with negligible accuracy drop, outperforming state-of-the-art methods.
Index Terms: Memory-efficient network compression, pruning, quantization, image-level object classification, resource-restricted edge devices.
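To make the first stage of this two-step pipeline more concrete, here is a hedged sketch of Soft Filter Pruning only (the incremental/optimized quantization stage is omitted), assuming PyTorch: after each training epoch, the filters with the smallest L2 norms are zeroed but remain trainable, so they can recover in later epochs.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def soft_filter_prune(conv: nn.Conv2d, prune_ratio: float = 0.3) -> None:
    """Zero the filters with smallest L2 norm, but keep them trainable.

    Unlike hard pruning, zeroed filters continue to receive gradient updates,
    so the pruned set can change between epochs (the "soft" in Soft Filter Pruning).
    """
    weight = conv.weight                         # (out_channels, in, kH, kW)
    norms = weight.flatten(1).norm(p=2, dim=1)   # one L2 norm per filter
    n_prune = int(prune_ratio * weight.size(0))
    if n_prune == 0:
        return
    _, idx = torch.topk(norms, n_prune, largest=False)
    weight[idx] = 0.0

# Example: apply to every conv layer once per epoch during training.
model = nn.Sequential(nn.Conv2d(3, 32, 3), nn.ReLU(), nn.Conv2d(32, 64, 3))
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        soft_filter_prune(module, prune_ratio=0.3)
```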

Automated Pruning for Deep Neural Network Compression

2018 24th International Conference on Pattern Recognition (ICPR)

In this work we present a method to improve the pruning step of the current state-of-the-art methodology for compressing neural networks. The novelty of the proposed pruning technique lies in its differentiability, which allows pruning to be performed during the backpropagation phase of network training. This enables end-to-end learning and strongly reduces the training time. The technique is based on a family of differentiable pruning functions and a new regularizer specifically designed to enforce pruning. The experimental results show that jointly optimizing the thresholds and the network weights makes it possible to reach a higher compression rate, reducing the number of weights of the pruned network by a further 14% to 33% compared to the current state of the art. Furthermore, we believe this is the first study to analyse the generalization capabilities, in transfer learning tasks, of the features extracted by a pruned network. To this end, we show that the representations learned using the proposed pruning methodology maintain the same effectiveness and generality as those learned by the corresponding non-compressed network on a set of different recognition tasks.
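A minimal sketch of the general idea of differentiable pruning (learnable thresholds trained jointly with the weights, plus a sparsity regularizer), assuming PyTorch; the paper's specific pruning functions and regularizer are not reproduced here, and the gate below is only one plausible choice:

```python
import torch
import torch.nn as nn

class SoftPrunedLinear(nn.Module):
    """Linear layer whose weights are gated by a differentiable, learnable threshold."""

    def __init__(self, in_features, out_features, temperature=100.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.threshold = nn.Parameter(torch.tensor(0.01))  # learned jointly with weights
        self.temperature = temperature

    def gate(self):
        # Smooth approximation of the step function |w| > threshold,
        # so gradients flow to both the weights and the threshold.
        return torch.sigmoid(self.temperature * (self.weight.abs() - self.threshold))

    def forward(self, x):
        return nn.functional.linear(x, self.weight * self.gate(), self.bias)

    def sparsity_regularizer(self):
        # Penalize the expected fraction of surviving weights to enforce pruning.
        return self.gate().mean()

# Training loss would combine the task loss with the sparsity term, e.g.:
# loss = task_loss + lambda_sparsity * sum(m.sparsity_regularizer() for m in pruned_modules)
```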

Deep Model Compression and Architecture Optimization for Embedded Systems: A Survey

Journal of Signal Processing Systems, 2020

Over the past years, deep neural networks have proved to be an essential element for developing intelligent solutions. They have achieved remarkable performance at the cost of deeper layers and millions of parameters. Therefore, utilising these networks on resource-limited platforms such as smart cameras is a challenging task. In this context, models need to be (i) accelerated and (ii) memory efficient without significantly compromising performance. Numerous works have sought to obtain smaller, faster, and accurate models. This paper presents a survey of methods suitable for porting deep neural networks onto resource-limited devices, especially smart cameras. These methods can be roughly divided into two main parts. In the first part, we present compression techniques, categorized into knowledge distillation, pruning, quantization, hashing, reduction of numerical precision, and binarization. In the second part, we focus on architecture optimization: we introduce methods to enhance network structures as well as neural architecture search techniques. In each part, we describe and analyse the different methods. Finally, we conclude the paper with a discussion of these methods.
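As a concrete example of one of the compression categories named above, knowledge distillation, here is a minimal, hedged sketch assuming PyTorch: temperature-scaled soft targets from a large teacher guide a smaller student alongside the usual hard-label loss.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Hinton-style knowledge distillation loss.

    Combines cross-entropy on hard labels with a KL term that matches the
    student's softened predictions to the teacher's softened predictions.
    """
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1.0 - alpha) * ce
```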

EDP: An Efficient Decomposition and Pruning Scheme for Convolutional Neural Network Compression

IEEE Transactions on Neural Networks and Learning Systems, 2021

Model compression methods have become popular in recent years, aiming to alleviate the heavy load of deep neural networks (DNNs) in real-world applications. However, most existing compression methods have two limitations: 1) they usually adopt a cumbersome process, including pretraining, training with a sparsity constraint, pruning/decomposition, and fine-tuning, with the last three stages typically iterated multiple times; 2) the models are pretrained under explicit sparsity or low-rank assumptions, whose wide appropriateness is difficult to guarantee. In this article, we propose an efficient decomposition and pruning (EDP) scheme via constructing a compressed-aware block that can automatically minimize...
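As a rough, hypothetical illustration of how decomposition and pruning can be coupled (not the compressed-aware block proposed in the paper), the sketch below, assuming NumPy, chooses the rank of each layer's weight matrix by an energy threshold on its singular values, which in turn determines how many rank-one components can be discarded:

```python
import numpy as np

def select_rank(weight: np.ndarray, energy: float = 0.95) -> int:
    """Smallest rank whose singular values retain `energy` of the total spectrum."""
    singular_values = np.linalg.svd(weight, compute_uv=False)
    cumulative = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    return int(np.searchsorted(cumulative, energy) + 1)

# Example: a conv kernel reshaped to (out_channels, in_channels * kH * kW).
kernel = np.random.randn(128, 64 * 3 * 3)
rank = select_rank(kernel, energy=0.95)
print(f"keep {rank} of {min(kernel.shape)} rank-one components")
```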

Pedestrian Attribute Recognition with Part-based CNN and Combined Feature Representations

Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2018

In video surveillance, pedestrian attributes such as gender, clothing, or hair type are useful cues to identify people. The main challenge in pedestrian attribute recognition is the large variation in visual appearance and attribute location caused by different poses and camera views. In this paper, we propose a neural network combining high-level learnt Convolutional Neural Network (CNN) features and low-level handcrafted features to address the problem of highly varying viewpoints. We first extract robust low-level Local Maximal Occurrence (LOMO) features and learn a body-part-specific CNN to model attribute patterns related to different body parts. For small datasets with little data, we propose a new learning strategy in which the CNN is pre-trained in a triplet structure on a person re-identification task and then fine-tuned on attribute recognition. Finally, we fuse the two feature representations to recognise pedestrian attributes. Our approach achieves state-of-the-art results on three public pedestrian attribute datasets.
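A hedged sketch of the fusion idea described above (concatenating learnt CNN features with handcrafted LOMO-style descriptors before an attribute classifier), assuming PyTorch; the dimensions, attribute count, and the handcrafted extractor itself are placeholders, not values from the paper:

```python
import torch
import torch.nn as nn

class FusedAttributeClassifier(nn.Module):
    """Concatenate CNN features with a handcrafted descriptor for attribute prediction."""

    def __init__(self, cnn_dim=2048, handcrafted_dim=26960, num_attributes=35):
        super().__init__()
        # Project the high-dimensional handcrafted descriptor (e.g., LOMO) first.
        self.handcrafted_proj = nn.Sequential(nn.Linear(handcrafted_dim, 512), nn.ReLU())
        self.classifier = nn.Linear(cnn_dim + 512, num_attributes)

    def forward(self, cnn_features, handcrafted_features):
        fused = torch.cat([cnn_features, self.handcrafted_proj(handcrafted_features)], dim=1)
        return self.classifier(fused)   # one logit per attribute (sigmoid at inference)
```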

Data-Driven Compression of Convolutional Neural Networks

2019

Deploying trained convolutional neural networks (CNNs) to mobile devices is a challenging task because of the simultaneous requirements of the deployed model to be fast, lightweight and accurate. Designing and training a CNN architecture that does well on all three metrics is highly non-trivial and can be very time-consuming if done by hand. One way to solve this problem is to compress the trained CNN models before deploying to mobile devices. This work asks and answers three questions on compressing CNN models automatically: a) How to control the trade-off between speed, memory and accuracy during model compression? b) In practice, a deployed model may not see all classes and/or may not need to produce all class labels. Can this fact be used to improve the trade-off? c) How to scale the compression algorithm to execute within a reasonable amount of time for many deployments? The paper demonstrates that a model compression algorithm utilizing reinforcement learning with architecture...
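The first question above, how to control the speed/memory/accuracy trade-off during automated compression, is often encoded as a scalar reward for the search agent. The following is a hypothetical sketch of such a reward (not the paper's formulation), assuming the accuracy and resource costs of a candidate compressed model have already been measured:

```python
import math

def compression_reward(accuracy: float, flops: float, size_bytes: float,
                       flops_budget: float, size_budget: float,
                       speed_weight: float = 0.5, memory_weight: float = 0.5) -> float:
    """Scalar reward trading off accuracy against speed and memory.

    The weights let a deployment choose which resource matters more; candidates
    over budget receive negative resource terms, pushing the search back
    inside the constraints.
    """
    speed_term = speed_weight * math.log(flops_budget / max(flops, 1.0))
    memory_term = memory_weight * math.log(size_budget / max(size_bytes, 1.0))
    return accuracy + speed_term + memory_term

# Example: a candidate at 60% of the FLOPs budget and 40% of the size budget.
reward = compression_reward(accuracy=0.91, flops=0.6e9, size_bytes=4e6,
                            flops_budget=1.0e9, size_budget=10e6)
```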