GhostNets on Heterogeneous Devices via Cheap Operations
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., & Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org. https://www.tensorflow.org/
Cai, H., Zhu, L., & Han, S. (2019). Proxylessnas: Direct neural architecture search on target task and hardware. In ICLR.
Chen, H., Wang, Y., Xu, C., Shi, B., Xu, C., Tian, Q., & Xu, C. (2020a). Addernet: Do we really need multiplications in deep learning? In CVPR (pp. 1468–1477).
Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., Sun, S., Feng, W., Liu, Z., Xu, J., Zhang, Z., Cheng, D., Zhu, C., Cheng, T., Zhao, Q., Li, B., Lu, X., Zhu, R., Wu, Y., Dai, J., Wang, J., Shi, J., Ouyang, W., Loy, C. C., & Lin, D. (2019b). MMDetection: Open mmlab detection toolbox and benchmark. ArXiv preprint arXiv:1906.07155.
Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2016). Semantic image segmentation with deep convolutional nets and fully connected CRFs. In ICLR.
Chen, W., Xie, D., Zhang, Y., & Pu, S. (2019c). All you need is a few shifts: Designing efficient convolutional neural networks for image classification. In CVPR (pp. 7241–7250).
Chen, W., Gong, X., Liu, X., Zhang, Q., Li, Y., & Wang, Z. (2020b). Fasterseg: Searching for faster real-time semantic segmentation. In ICLR.
Chin, T. W., Ding, R., Zhang, C., & Marculescu, D. (2020). Towards efficient model compression via learned global ranking. In CVPR (pp. 1518–1528).
Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In CVPR (pp. 1251–1258).
Cubuk, E. D., Zoph, B., Shlens, J., & Le, Q. V. (2020). Randaugment: Practical automated data augmentation with a reduced search space. In CVPR Workshops (pp. 702–703).
Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In CVPR (pp. 248–255). IEEE.
Denton, E. L., Zaremba, W., Bruna, J., LeCun, Y., & Fergus, R. (2014). Exploiting linear structure within convolutional networks for efficient evaluation. In NeurIPS (pp. 1269–1277).
Forrest, N. I., Song, H., Matthew, W., Khalid, A., & Dally, J. W. (2017). Squeezenet: Alexnet-level accuracy with 50× fewer parameters and 0.5 MB model size. In ICLR.
Gholami, A., Kwon, K., Wu, B., Tai, Z., Yue, X., Jin, P., Zhao, S., & Keutzer, K. (2018). Squeezenext: Hardware-aware neural network design. In CVPR Workshops (pp. 1638–1647).
Gong, X., Chang, S., Jiang, Y., & Wang, Z. (2019). Autogan: Neural architecture search for generative adversarial networks. In ICCV (pp. 3224–3234).
Gui, S., Wang, H. N., Yang, H., Yu, C., Wang, Z., & Liu, J. (2019). Model compression with adversarial robustness: A unified optimization framework. In NeurIPS (Vol. 32, pp. 1285–1296).
Guo, J., Han, K., Wang, Y., Wu, H., Chen, X., Xu, C., & Xu, C. (2021). Distilling object detectors via decoupled features. In CVPR (pp. 2154–2164).
Han, B., Yao, Q., Yu, X., Niu, G., Xu, M., Hu, W., Tsang, I., & Sugiyama, M. (2018a). Co-teaching: Robust training of deep neural networks with extremely noisy labels. In NeurIPS (pp. 8535–8545).
Han, K., Guo, J., Zhang, C., & Zhu, M. (2018b). Attribute-aware attention model for fine-grained representation learning. In Proceedings of the 26th ACM international conference on Multimedia (pp. 2040–2048).
Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., & Xu, C. (2020a). Ghostnet: More features from cheap operations. In CVPR (pp. 1580–1589).
Han, K., Wang, Y., Xu, Y., Xu, C., Wu, E., & Xu, C. (2020b). Training binary neural networks through learning with noisy supervision. In ICML (pp. 4017–4026).
Han, K., Wang, Y., Xu, C., Xu, C., Wu, E., & Tao, D. (2021). Learning versatile convolution filters for efficient visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2021.3114368
Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. In NeurIPS (pp. 1135–1143).
Han, S., Mao, H., & Dally, W. J. (2016). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In ICLR.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In CVPR (pp. 770–778).
He, Y., Zhang, X., & Sun, J. (2017). Channel pruning for accelerating very deep neural networks. In ICCV (pp. 1389–1397).
He, Y., Kang, G., Dong, X., Fu, Y., & Yang, Y. (2018a). Soft filter pruning for accelerating deep convolutional neural networks. In IJCAI (pp. 2234–2240).
He, Y., Lin, J., Liu, Z., Wang, H., Li, L. J., & Han, S. (2018b). AMC: Automl for model compression and acceleration on mobile devices. In ECCV (pp. 784–800).
He, Y., Liu, P., Wang, Z., Hu, Z., & Yang, Y. (2019). Filter pruning via geometric median for deep convolutional neural networks acceleration. In CVPR (pp. 4340–4349).
He, Y., Ding, Y., Liu, P., Zhu, L., Zhang, H., & Yang, Y. (2020). Learning filter pruning criteria for deep convolutional neural networks acceleration. In CVPR (pp. 2009–2018).
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. ArXiv preprint arXiv:1503.02531.
Howard, A., Sandler, M., Chu, G., Chen, L. C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., et al. (2019). Searching for mobilenetv3. In ICCV (pp. 1314–1324).
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. ArXiv preprint arXiv:1704.04861.
Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In CVPR (pp. 7132–7141).
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In CVPR (pp. 4700–4708).
Huang, Z., & Wang, N. (2018). Data-driven sparse structure selection for deep neural networks. In ECCV (pp. 304–320).
Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., & Bengio, Y. (2016). Binarized neural networks. In NeurIPS (pp. 4107–4115).
Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML (pp. 448–456).
Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., & Kalenichenko, D. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. In CVPR (pp. 2704–2713).
Jeon, Y., & Kim, J. (2018). Constructing fast network through deconstruction of convolution. In NeurIPS (pp. 5951–5961).
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Tech. rep., Citeseer.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In NeurIPS (pp. 1097–1105).
Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2017). Pruning filters for efficient convnets. In ICLR.
Liebenwein, L., Baykal, C., Lang, H., Feldman, D., & Rus, D. (2020). Provable filter pruning for efficient neural networks. In ICLR.
Lin, M., Ji, R., Wang, Y., Zhang, Y., Zhang, B., Tian, Y., & Shao, L. (2020a). Hrank: Filter pruning using high-rank feature map. In CVPR (pp. 1529–1538).
Lin, M., Ji, R., Zhang, Y., Zhang, B., Wu, Y., & Tian, Y. (2020b). Channel pruning via automatic structure search. In IJCAI (pp. 673–679).
Lin, S., Ji, R., Yan, C., Zhang, B., Cao, L., Ye, Q., Huang, F., & Doermann, D. (2019). Towards optimal structured CNN pruning via generative adversarial learning. In CVPR (pp. 2790–2799).
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In ECCV (pp. 740–755). Springer.
Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017a). Feature pyramid networks for object detection. In CVPR (pp. 2117–2125).
Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017b). Focal loss for dense object detection. In ICCV (pp. 2980–2988).
Liu, C., Wang, Y., Han, K., Xu, C., & Xu, C. (2019a). Learning instance-wise sparsity for accelerating deep models. In IJCAI (pp. 3001–3007).
Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., & Cheng, K. T. (2018). Bi-real net: Enhancing the performance of 1-bit cnns with improved representational capability and advanced training algorithm. In ECCV (pp. 722–737).
Liu, Z., Mu, H., Zhang, X., Guo, Z., Yang, X., Cheng, T. K. T., & Sun, J. (2019b). Metapruning: Meta learning for automatic neural network channel pruning. In ICCV (pp. 3296–3305).
Liu, Z., Sun, M., Zhou, T., Huang, G., & Darrell, T. (2019c). Rethinking the value of network pruning. In ICLR.
Luo, J. H., Wu, J., & Lin, W. (2017). Thinet: A filter level pruning method for deep neural network compression. In ICCV (pp. 5058–5066).
Ma, N., Zhang, X., Zheng, H. T., & Sun, J. (2018). Shufflenet v2: Practical guidelines for efficient CNN architecture design. In ECCV (pp. 116–131).
Molchanov, P., Mallya, A., Tyree, S., Frosio, I., & Kautz, J. (2019). Importance estimation for neural network pruning. In CVPR (pp. 11264–11272).
Ning, X., Zhao, T., Li, W., Lei, P., Wang, Y., & Yang, H. (2020). DSA: More efficient budgeted pruning via differentiable sparsity allocation. In ECCV (pp. 592–607).
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., & Desmaison, A. (2019). Pytorch: An imperative style, high-performance deep learning library. In NeurIPS (Vol. 32, pp. 8026–8037).
Radosavovic, I., Kosaraju, R. P., Girshick, R., He, K., & Dollár, P. (2020). Designing network design spaces. In CVPR (pp. 10428–10436).
Rastegari, M., Ordonez, V., Redmon, J., & Farhadi, A. (2016). Xnor-net: Imagenet classification using binary convolutional neural networks. In ECCV (pp. 525–542). Springer.
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In NeurIPS (Vol. 28, pp. 91–99).
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). Mobilenetv2: Inverted residuals and linear bottlenecks. In CVPR (pp. 4510–4520).
Shen, M., Han, K., Xu, C., & Wang, Y. (2019). Searching for accurate binary neural architectures. In ICCV Workshops.
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In CVPR (pp. 1–9).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In CVPR (pp. 2818–2826).
Tan, M., & Le, Q. (2019). Efficientnet: Rethinking model scaling for convolutional neural networks. In ICML (pp. 6105–6114).
Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., & Le, Q. V. (2019). Mnasnet: Platform-aware neural architecture search for mobile. In CVPR (pp. 2820–2828).
Wan, A., Dai, X., Zhang, P., He, Z., Tian, Y., Xie, S., Wu, B., Yu, M., Xu, T., Chen, K., et al. (2020). Fbnetv2: Differentiable neural architecture search for spatial and channel dimensions. In CVPR (pp. 12965–12974).
Wang, Y., Xu, C., You, S., Tao, D., & Xu, C. (2016). CNNpack: packing convolutional neural networks in the frequency domain. In NeurIPS (pp. 253–261).
Wang, Y., Xu, C., Xu, C., Xu, C., & Tao, D. (2018). Learning versatile filters for efficient convolutional neural networks. In NeurIPS (Vol. 31, pp. 1608–1618).
Wang, Y., Jiang, Z., Chen, X., Xu, P., Zhao, Y., Lin, Y., & Wang, Z. (2019). E2-train: Training state-of-the-art CNNs with over 80% energy savings. In NeurIPS (Vol. 32, pp. 5138–5150).
Wen, W., Wu, C., Wang, Y., Chen, Y., & Li, H. (2016). Learning structured sparsity in deep neural networks. In NeurIPS (pp. 2074–2082).
Williams, S., Waterman, A., & Patterson, D. (2009). Roofline: An insightful visual performance model for multicore architectures. Communications of the ACM, 52(4), 65–76.
Wilson, R. C., Hancock, E. R., & Smith, W. A. P. (2016). Wide residual networks. In BMVC.
Wu, B., Wan, A., Yue, X., Jin, P., Zhao, S., Golmant, N., Gholaminejad, A., Gonzalez, J., & Keutzer, K. (2018). Shift: A zero flop, zero parameter alternative to spatial convolutions. In CVPR (pp. 9127–9135).
Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In CVPR (pp. 1492–1500).
Xu, Y., Wang, Y., Chen, H., Han, K., Xu, C., Tao, D., & Xu, C. (2019). Positive-unlabeled compression on the cloud. In NeurIPS (Vol. 32, pp. 2565–2574).
Yang, L., Jiang, H., Cai, R., Wang, Y., Song, S., Huang, G., & Tian, Q. (2021). Condensenet v2: Sparse feature reactivation for deep networks. In CVPR (pp. 3569–3578).
Yang, Z., Wang, Y., Liu, C., Chen, H., Xu, C., Shi, B., Xu, C., & Xu, C. (2019). Legonet: Efficient convolutional neural networks with lego filters. In ICML (pp. 7005–7014).
Yang, Z., Wang, Y., Chen, X., Shi, B., Xu, C., Xu, C., Tian, Q., & Xu, C. (2020a). Cars: Continuous evolution for efficient neural architecture search. In CVPR (pp. 1829–1838).
Yang, Z., Wang, Y., Han, K., Xu, C., Xu, C., Tao, D., & Xu, C. (2020b). Searching for low-bit weights in quantized neural networks. In NeurIPS (Vol. 33, pp. 4091–4102).
You, S., Xu, C., Xu, C., & Tao, D. (2017). Learning from multiple teacher networks. In SIGKDD (pp. 1285–1294).
Yu, R., Li, A., Chen, C. F., Lai, J. H., Morariu, V. I., Han, X., Gao, M., Lin, C. Y., & Davis, L. S. (2018). Nisp: Pruning networks using neuron importance score propagation. In CVPR (pp. 9194–9203).
Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). Shufflenet: An extremely efficient convolutional neural network for mobile devices. In CVPR (pp. 6848–6856).
Zhou, D., Hou, Q., Chen, Y., Feng, J., & Yan, S. (2020). Rethinking bottleneck structure for efficient mobile network design. In ECCV (pp. 680–697). Springer.
Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2018). Learning transferable architectures for scalable image recognition. In CVPR (pp. 8697–8710).