Benchmarking open source deep learning frameworks

Deep learning (DL) is one of the hottest fields in machine learning. To foster the growth of DL, several open source frameworks have appeared, providing implementations of the most common DL algorithms. These frameworks vary in the algorithms they support and in the quality of their implementations. The purpose of this work is to provide a qualitative and quantitative comparison among three such frameworks: TensorFlow, Theano and CNTK. To ensure that our study is as comprehensive as possible, we consider multiple benchmark datasets from different fields (image processing, NLP, etc.) and measure the performance of the frameworks' implementations of different DL algorithms. For most of our experiments, we find that CNTK's implementations are superior to the other ones under consideration.

1. INTRODUCTION

Deep learning (DL) is the hottest trend in machine learning (ML). Although the theoretical concepts behind DL are not new, it has enjoyed a surge of interest over the past decade due to many factors. One is that DL approaches have significantly outperformed state-of-the-art (SOTA) approaches in many tasks across different fields such as image processing, computer vision, speech processing and natural language processing (NLP). Moreover, the scientific community, in both academia and industry, has quickly and massively adopted DL. Open source implementations of successful DL algorithms quickly appeared on code sharing websites and were subsequently used by many researchers in different fields. Several DL frameworks exist, such as TensorFlow, Theano, CNTK, Caffe and PyTorch, each with different features and characteristics. Furthermore, each framework utilizes different techniques to optimize its code. Consequently, even when the same algorithm is implemented in different frameworks, the performance of the resulting implementations can vary greatly.
A researcher or practitioner looking to use such an algorithm thus faces a difficult choice, since the number of different implementations is high while the effort invested by the research community in scientifically comparing them is limited. In this work, we aim to provide qualitative and quantitative comparisons between three popular open source DL frameworks: TensorFlow, Theano and CNTK. These frameworks support multi-core CPUs as well as multiple GPUs. All of them rely on cuDNN, a DL library from NVIDIA that provides highly tuned implementations of standard routines such as forward and backward convolution, normalization, pooling and activation layers. We compare these frameworks by training different neural network (NN) architectures on five standard benchmark datasets covering various tasks in image processing, computer vision and NLP. Despite their importance, comparative studies like ours that focus on performance issues are rare. Limited efforts have been dedicated to conducting comparative studies between SOTA DL frameworks running on different hardware platforms (CPU and GPU) to highlight the advantages and limitations of each framework for different deep NN architectures. These efforts include papers [1-9] as well as online blogs.
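To make the comparison methodology concrete, the sketch below shows one way per-epoch training time could be measured in a framework-agnostic fashion. The harness and its names (`benchmark`, `train_step`, the warm-up count) are our own illustration, not code from any of the frameworks under study; discarding warm-up epochs is a common precaution so that one-time costs such as graph compilation or cuDNN autotuning do not skew the measurement.

```python
import time
from statistics import mean, stdev

def benchmark(train_step, n_epochs=5, warmup=1):
    """Time a framework's train_step callable once per epoch.

    train_step: a zero-argument callable that runs one training epoch
                (e.g. a closure wrapping a TensorFlow, Theano or CNTK
                training loop).
    warmup:     number of initial epochs to discard, so one-time setup
                costs do not distort the statistics.

    Returns (mean, standard deviation) of the measured epoch times.
    """
    timings = []
    for epoch in range(warmup + n_epochs):
        start = time.perf_counter()
        train_step()
        elapsed = time.perf_counter() - start
        if epoch >= warmup:  # keep only post-warm-up measurements
            timings.append(elapsed)
    return mean(timings), stdev(timings)
```

The same harness can then be pointed at each framework's training closure in turn, so that differences in the measured times reflect the implementations rather than the measurement code.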