Flydeling: Streamlined Performance Models for Hardware Acceleration of CNNs through System Identification (original) (raw)

Flydeling: Streamlined Performance Models for Hardware Acceleration of CNNs through System Identification

The introduction of deep learning algorithms, such as Convolutional Neural Networks (CNNs) in many near-sensor embedded systems, opens new challenges in terms of energy efficiency and hardware performance. An emerging solution to address these challenges is to use tailored heterogeneous hardware accelerators combining processing elements of different architectural natures such as Central Processing Unit (CPU), Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), or Application Specific Integrated Circuit (ASIC). To progress towards heterogeneity, a great asset would be an automated design space exploration tool that chooses, for each accelerated partition of a CNN, the most appropriate architecture considering available resources. To feed such a design space exploration process, models are required that provide very fast yet precise evaluations of alternative architectures or alternative forms of CNNs. Quick configuration estimation could be achieved with few parame...