GPU Code Generation for Deep Learning Networks Using MATLAB Function Block - MATLAB & Simulink
With GPU Coder™, you can generate optimized code for Simulink® models containing a variety of trained deep learning networks. You can implement the deep learning functionality in Simulink by using MATLAB Function blocks or by using blocks from the Deep Neural Networks library. When implementing with MATLAB Function blocks, use the coder.loadDeepLearningNetwork function to load a trained deep learning network and use the object functions of the network object to obtain the desired responses. You can configure the code generator to take advantage of the NVIDIA® CUDA® deep neural network library (cuDNN) and TensorRT™ high performance inference libraries for NVIDIA GPUs. The generated code implements the deep convolutional neural network (CNN) by using the architecture, the layers, and the parameters that you specify in the network object.
Example: Classify Images by Using GoogLeNet
GoogLeNet has been trained on over a million images and can classify images into 1000 object categories (such as keyboard, coffee mug, pencil, and animals). The network takes an image as input, and then outputs a label for the object in the image together with the probabilities for each of the object categories. This example shows you how to perform simulation and generate CUDA code for the pretrained googlenet
deep convolutional neural network and classify an image.
- Load the pretrained GoogLeNet network. You can choose to load a different pretrained network for image classification. If you do not have the required support packages installed, install the software according to the instructions provided. A minimal loading sketch follows the class-name listing at the end of this list.
- The object net contains the DAGNetwork object. Use the analyzeNetwork function to display an interactive visualization of the network architecture, to detect errors and issues in the network, and to display detailed information about the network layers. The layer information includes the sizes of layer activations and learnable parameters, the total number of learnable parameters, and the sizes of state parameters of recurrent layers.
- The image that you want to classify must have the same size as the input size of the network. For GoogLeNet, the size of the imageInputLayer is 224-by-224-by-3. The Classes property of the output classificationLayer contains the names of the classes learned by the network. View 10 random class names out of the total of 1000.
classNames = net.Layers(end).Classes;
numClasses = numel(classNames);
disp(classNames(randperm(numClasses,10)))
'speedboat'
'window screen'
'isopod'
'wooden spoon'
'lipstick'
'drake'
'hyena'
'dumbbell'
'strawberry'
'custard apple'
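Here is a minimal sketch of the loading and inspection steps described in the list above. It assumes that Deep Learning Toolbox and the Deep Learning Toolbox Model for GoogLeNet Network support package are installed.
% Load the pretrained GoogLeNet network (requires the GoogLeNet support package).
net = googlenet;
% Display an interactive visualization of the network architecture.
analyzeNetwork(net)
% Input size of the network, 224-by-224-by-3 for GoogLeNet.
inputSize = net.Layers(1).InputSize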
Create GoogLeNet Model
- Create a Simulink model and insert a MATLAB Function block from the User-Defined Functions library.
- Add an Image From File block from the Computer Vision Toolbox™ library and set the File name parameter to peppers.png.
- Add a Resize block from the Computer Vision Toolbox library to the model. Set the Specify parameter of the Resize block to Number of output rows and columns and enter [224 224] as the value for Number of output rows and columns. This block resizes the input image to the input size of the network.
- Double-click the MATLAB Function block. A default function signature appears in the MATLAB Function Block Editor.
- Define a function called googlenet_predict, which implements the prediction entry-point function. The function header declares in as an argument to the googlenet_predict function, with scores and indxTop as the return values.
function [scores,indxTop] = googlenet_predict(in) %#codegen
persistent mynet;
if isempty(mynet)
mynet = coder.loadDeepLearningNetwork('googlenet');
end
% pass in input
predict_scores = predict(mynet,in);
[scores,indx] = sort(predict_scores, 'descend');
indxTop = indx(1:5);
A persistent object mynet loads the DAGNetwork object. At the first call to the entry-point function, the persistent object is constructed and set up. On subsequent calls to the function, the same object is reused to call predict on inputs, avoiding reconstructing and reloading the network object.
You can also use the activations (Deep Learning Toolbox) method to return the network activations for a specific layer. For example, the following line of code returns the network activations for the layer specified in layerIdx.
out = activations(mynet,in,layerIdx,'OutputAs','Channels');
You can also use the classify (Deep Learning Toolbox) method to predict class labels for the image data in in using the trained network mynet.
[out,scores] = classify(mynet,in);
For LSTM networks, you can use the predictAndUpdateState (Deep Learning Toolbox) and resetState (Deep Learning Toolbox) methods. For usage notes and limitations of these methods, see Supported Functions. A sketch of a stateful LSTM entry-point function follows this list.
- Open the block parameters of the MATLAB Function block. On the Code Generation tab, select Reusable function for Function packaging.
- Connect these blocks as shown in the diagram. Save the model as googlenetModel.
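For the LSTM workflow mentioned above, the entry-point function typically holds the network in a persistent variable and updates its state on every call. The sketch below is illustrative only: the MAT-file name lstmNet.mat and the function name lstm_predict_and_update are hypothetical placeholders for a trained LSTM network of your own.
function out = lstm_predict_and_update(in) %#codegen
persistent lstmnet;
if isempty(lstmnet)
    % 'lstmNet.mat' is a hypothetical MAT-file containing a trained LSTM network.
    lstmnet = coder.loadDeepLearningNetwork('lstmNet.mat');
end
% Predict on the current input and keep the updated network state for the next call.
[lstmnet,out] = predictAndUpdateState(lstmnet,in);
% To restart prediction from the initial state, call lstmnet = resetState(lstmnet).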
Configure Model for GPU Acceleration
Model configuration parameters determine the acceleration method used during simulation.
- Open the Configuration Parameters dialog box. Open the Solver pane. To compile your model for acceleration and generate CUDA code, configure the model to use a fixed-step solver. This table shows the solver configuration for this example.
Parameter         Setting                           Effect on Generated Code
Type              Fixed-step                        Maintains a constant (fixed) step size, which is required for code generation
Solver            discrete (no continuous states)   Applies a fixed-step integration technique for computing the state derivative of the model
Fixed-step size   auto                              Simulink chooses the step size
- Select the Simulation Target pane. Set the Language to C++.
- Select GPU acceleration. Options specific to GPU Coder are now visible in the Simulation Target > GPU Acceleration pane. For this example, you can use the default values for these GPU-specific parameters.
- On the Simulation Target pane, set the Target Library parameter in the Deep learning group to cuDNN. You can also select TensorRT to target TensorRT high performance inference libraries for NVIDIA GPUs.
- Click OK to save and close the Configuration Parameters dialog box.
You can use set_param to configure the model parameters programmatically in the MATLAB® Command Window.
set_param('googlenetModel','GPUAcceleration','on');
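If you prefer to work entirely from the command line, the remaining simulation settings from this section can be applied with set_param as well. This is a sketch: the solver and language names are standard Simulink configuration parameters, while SimDLTargetLibrary is assumed to be the name of the simulation deep learning library parameter and should be verified for your release.
% Solver and simulation target settings used in this example.
set_param('googlenetModel','SolverType','Fixed-step');
set_param('googlenetModel','Solver','FixedStepDiscrete');
set_param('googlenetModel','SimTargetLang','C++');
% Assumed parameter name for the simulation deep learning target library.
set_param('googlenetModel','SimDLTargetLibrary','cudnn');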
Build GPU Accelerated Model
- To build and simulate the GPU accelerated model, select Run on the Simulation tab or use the MATLAB command:
out = sim('googlenetModel');
The software first checks to see if CUDA/C++ code was previously compiled for your model. If code was created previously, the software runs the model. If code was not previously built, the software first generates and compiles the CUDA/C++ code, and then runs the model. The code generation tool places the generated code in a subfolder of the working folder called slprj/_slprj/googlenetModel.
- Display the top five predicted labels and their associated probabilities as a histogram. Because the network classifies images into so many object categories, and many categories are similar, it is common to consider the top-five accuracy when evaluating networks. The network classifies the image as a bell pepper with a high probability.
im = imread('peppers.png');
classNamesTop = classNames(out.yout{2}.Values.Data(:,:,1))
h = figure;
h.Position(3) = 2*h.Position(3);
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);
image(ax1,im);
barh(ax2,out.yout{1}.Values.Data(1,5:-1:1,1))
xlabel(ax2,'Probability')
yticklabels(ax2,classNamesTop(5:-1:1))
ax2.YAxisLocation = 'right';
sgtitle('Top 5 predictions using GoogLeNet')
Configure the Model for Code Generation
The model configuration parameters provide many options for the code generation and build process.
- Select the Code Generation pane. Set the System target file to grt.tlc. You can also use the Embedded Coder® target file ert.tlc or a custom system target file. For GPU code generation, the custom target file must be based on grt.tlc or ert.tlc. For information on developing a custom target file, see Customize System Target Files (Simulink Coder).
- Set the Language to C++.
- Select Generate GPU code.
- Select Generate code only.
- Select the Toolchain. For Linux® platforms, select NVIDIA CUDA | gmake (64-bit Linux). For Windows® systems, select NVIDIA CUDA (w/Microsoft Visual C++ 20XX) | nmake (64-bit Windows). When using a custom system target file, you must set the build controls for the toolchain approach. To learn more about the toolchain approach for custom targets, see Support Toolchain Approach with Custom Target (Simulink Coder).
- On the Code Generation > Report pane, select Create code generation report and Open report automatically.
- On the Code Generation > Interface pane, set the Target Library in the Deep learning group to cuDNN. You can also select TensorRT to target TensorRT high performance inference libraries for NVIDIA GPUs.
- When the Generate GPU code parameter is enabled, options specific to GPU Coder are visible in the Code Generation > GPU Code pane. For this example, you can use the default values of the GPU-specific parameters in the Code Generation > GPU Code pane.
- Click OK to save and close the Configuration Parameters dialog box.
You can also use the set_param function to configure the model parameters programmatically in the MATLAB Command Window.
set_param('googlenetModel','GenerateGPUCode','CUDA');
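The other code generation settings from this section can be applied programmatically in the same way. This is a sketch: SystemTargetFile, TargetLang, GenCodeOnly, GenerateReport, and LaunchReport are standard configuration parameters, while DLTargetLibrary is assumed to be the name of the code generation deep learning library parameter and should be verified for your release.
% Code generation settings used in this example.
set_param('googlenetModel','SystemTargetFile','grt.tlc');
set_param('googlenetModel','TargetLang','C++');
set_param('googlenetModel','GenCodeOnly','on');
set_param('googlenetModel','GenerateReport','on');
set_param('googlenetModel','LaunchReport','on');
% Assumed parameter name for the code generation deep learning target library.
set_param('googlenetModel','DLTargetLibrary','cudnn');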
Generate CUDA Code for the Model
- In the Simulink Editor, open the Simulink Coder app.
- Generate code.
Messages appear in the Diagnostics Viewer. The code generator produces CUDA source and header files, and an HTML code generation report. The code generator places the files in a build folder, a subfolder named googlenetModel_grt_rtw
under your current working folder.
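You can also start code generation from the MATLAB Command Window with the slbuild function (listed under See Also). Because Generate code only is selected, this sketch generates the CUDA code and the report without invoking the toolchain build.
% Generate code for the model programmatically.
slbuild('googlenetModel');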
Limitations
- Code generation for the averagePooling2dLayer (Deep Learning Toolbox) does not support a non-zero padding value when targeting third-party libraries. For Simulink models that implement deep learning functionality using a MATLAB Function block, simulation errors out if the network contains an average pooling layer with a non-zero padding value. In such cases, use blocks from the Deep Neural Networks library instead of a MATLAB Function block to implement the deep learning functionality.
- GPU code generation for MATLAB Function blocks in Stateflow® charts is not supported.
- When GPU acceleration is enabled, the code generator does not support Import custom code for importing custom authored CUDA source files (*.cu). Instead, use coder.ceval inside the MATLAB Function block.
- The MATLAB Function block does not support all the data types from the MATLAB language. For supported data types, refer to the block documentation.
- For GPU code generation, the custom target file must be based on grt.tlc or ert.tlc.
- For deploying the generated code, it is recommended to use the Generate an example main program option to generate the ert_main.cu module. This option requires the Embedded Coder license. You can also use the rt_cppclass_main.cpp static main module provided by MathWorks®. However, the static main file must be modified so that the model's class constructor points to the deep learning object. For example:
static googlenetModelModelClass::DeepLearning_googlenetModel_T googlenetModel_DeepLearning;
static googlenetModelModelClass googlenetModel_Obj{ &googlenetModel_DeepLearning };
See Also
Functions
- open_system (Simulink) | load_system (Simulink) | save_system (Simulink) | close_system (Simulink) | bdclose (Simulink) | get_param (Simulink) | set_param (Simulink) | sim (Simulink) | slbuild (Simulink)