Benchmarking open source deep learning frameworks

Deep learning (DL) is one of the hottest fields in machine learning. To foster the growth of DL, several open source frameworks have appeared, providing implementations of the most common DL algorithms. These frameworks vary in the algorithms they support and in the quality of their implementations. The purpose of this work is to provide a qualitative and quantitative comparison among three such frameworks: TensorFlow, Theano and CNTK. To ensure that our study is as comprehensive as possible, we consider multiple benchmark datasets from different fields (image processing, NLP, etc.) and measure the performance of the frameworks' implementations of different DL algorithms. For most of our experiments, we find that CNTK's implementations are superior to the other ones under consideration.

1. INTRODUCTION

Deep learning (DL) is the hottest trend in machine learning (ML). Although the theoretical concepts behind DL are not new, it has enjoyed a surge of interest over the past decade due to many factors. One is that DL approaches have significantly outperformed state-of-the-art (SOTA) approaches in many tasks across different fields such as image processing, computer vision, speech processing and natural language processing (NLP). Moreover, the scientific community, in both academia and industry, has quickly and massively adopted DL. Open source implementations of successful DL algorithms quickly appeared on code sharing websites and were subsequently used by many researchers in different fields. Several DL frameworks exist, such as TensorFlow, Theano, CNTK, Caffe and PyTorch, each with different features and characteristics. Furthermore, each framework utilizes different techniques to optimize its code. Consequently, even when the same algorithm is implemented in different frameworks, the performance of the resulting implementations can vary greatly.
A researcher or practitioner looking to use such an algorithm thus faces a difficult choice, since the number of different implementations is high while the effort invested by the research community in scientifically comparing them is limited. In this work, we aim to provide qualitative and quantitative comparisons between three popular open source DL frameworks: TensorFlow, Theano and CNTK. These frameworks support multi-core CPUs as well as multiple GPUs. All of them rely on cuDNN, a DL library from NVIDIA that provides highly tuned implementations of standard routines such as forward and backward convolution, normalization, pooling and activation layers. We compare these frameworks by training different neural network (NN) architectures on five standard benchmark datasets covering various tasks in image processing, computer vision and NLP. Despite their importance, comparative studies like ours that focus on performance issues are rare. Limited efforts have been dedicated to conducting comparative studies between SOTA DL frameworks running on different hardware platforms (CPU and GPU) to highlight the advantages and limitations of each framework for different deep NN architectures. These efforts include papers [1-9] as well as online blogs.
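To make the comparison methodology concrete, the sketch below shows one way per-epoch training time could be measured in a framework-agnostic fashion. The harness and its names (`benchmark`, `train_step`, the warm-up count) are our own illustration, not code from any of the frameworks under study; discarding warm-up epochs is a common precaution so that one-time costs such as graph compilation or cuDNN autotuning do not skew the measurement.

```python
import time
from statistics import mean, stdev

def benchmark(train_step, n_epochs=5, warmup=1):
    """Time a framework's train_step callable once per epoch.

    train_step: a zero-argument callable that runs one training epoch
                (e.g. a closure wrapping a TensorFlow, Theano or CNTK
                training loop).
    warmup:     number of initial epochs to discard, so one-time setup
                costs do not distort the statistics.

    Returns (mean, standard deviation) of the measured epoch times.
    """
    timings = []
    for epoch in range(warmup + n_epochs):
        start = time.perf_counter()
        train_step()
        elapsed = time.perf_counter() - start
        if epoch >= warmup:  # keep only post-warm-up measurements
            timings.append(elapsed)
    return mean(timings), stdev(timings)
```

The same harness can then be pointed at each framework's training closure in turn, so that differences in the measured times reflect the implementations rather than the measurement code.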