Automatic Parallelization of for-Loops in the Generated Code - MATLAB & Simulink
MATLAB® Coder™ automatically parallelizes for-loops in generated C/C++ code by default, using the Open Multiprocessing (OpenMP) library. Automatic parallelization supports explicit and implicit for-loops, as well as for-loops that perform reduction operations. For more information, see Reduction Operations Supported for Automatic Parallelization of for-loops.
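To make the idea of a reduction loop concrete, the following C sketch shows one common way that OpenMP expresses a sum reduction. It is an illustration only, not MATLAB Coder output: the function name sum_elements is hypothetical, and the pragma that the code generator emits for a given reduction may differ.

```c
/* Illustrative sketch only. sum_elements is a hypothetical name,
 * not a function produced by MATLAB Coder. */
double sum_elements(const double *x, int n)
{
  double total = 0.0;
  int i;

  /* Each thread accumulates a private partial sum, and OpenMP adds
   * the partial sums into total when the loop ends. If the compiler
   * runs without OpenMP support, the pragma is ignored and the loop
   * executes serially. */
#pragma omp parallel for reduction(+ : total)
  for (i = 0; i < n; i++) {
    total += x[i];
  }
  return total;
}
```

Because floating-point addition is not associative, a parallel sum can differ from the serial result in the last bits when the inputs are not exactly representable.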
To generate parallel C/C++ code, your compiler must support the OpenMP library. The Enable automatic parallelization option supports all build types (MEX, DLL, LIB, and EXE) of the coder.config function.
MATLAB Coder uses internal heuristics to determine whether a for-loop should be parallelized.
Parallelization of Explicit and Implicit for-loops
Automatic parallelization of for-loops supports both explicit and implicit for-loops. You can specify the Maximum number of CPU threads to run parallel for-loops in the generated C/C++ code.
For more information, see Specify Maximum Number of Threads to Run Parallel for-Loops in the Generated Code.
Parallelization of explicit for-loops
Explicit for-loops are for-loops that are present in your MATLAB code. Elementwise operations on an array benefit from automatic parallelization. The following example shows a MATLAB function with an explicit for-loop and the C code generated using automatic parallelization. To generate code, save the MATLAB function as explicitLoop.m in the current working directory and run the codegen command.

MATLAB Code

    % MATLAB code
    function out = explicitLoop(a, b)
    out = zeros(size(a));
    for i = 1:numel(a)
        if a(i) > 1000
            out(i) = a(i) - b(i);
        else
            out(i) = a(i) + b(i);
        end
    end
    end

    % C code generation command
    >> codegen explicitLoop -args {1:10000, 1:10000} -config:lib -report

Generated C Code

    #pragma omp parallel for num_threads(omp_get_max_threads()) private(d)
    for (i = 0; i < 10000; i++) {
      d = a[i];
      if (d > 1000.0) {
        out[i] = d - b[i];
      } else {
        out[i] = d + b[i];
      }
    }

Parallelization of implicit for-loops
Implicit for-loops do not appear as loops in your MATLAB code. They arise when the code generator translates a MATLAB operation into a for-loop in the generated C/C++ code. The following example shows a MATLAB function with an implicit for-loop and the C code generated using automatic parallelization. To generate code, save the MATLAB code as implicitLoop.m in the current working directory and run the codegen command.

MATLAB Code

    % MATLAB code
    function [y] = implicitLoop(in)
    a = ones(10000,1) + in;
    y = [a a];
    end

    % C code generation command
    >> codegen implicitLoop -args {100} -config:lib -report

Generated C Code

    #pragma omp parallel for num_threads(omp_get_max_threads())
    for (i = 0; i < 10000; i++) {
      y[i] = in + 1.0;
      y[i + 10000] = in + 1.0;
    }

Loop Versioning
In the preceding examples, the loop bounds are compile-time constants. When the loop bounds are not known at compile time, the code generator generates both a serial and a parallel version of the for-loop. At run time, the more efficient version is executed, depending on the number of loop iterations.
The following example shows a MATLAB function loopVersion and the generated C code containing both serial and parallel versions of the for-loop.

MATLAB Code

    % MATLAB code
    function y = loopVersion(A, n)
    y = zeros(size(A));
    for i = 1:n
        y(i) = sin(A(i));
    end
    end

    % C code generation command
    >> codegen loopVersion -args {1:10000, 10000} -config:lib -report

Generated C Code

    if ((int)n < 800) {
      for (b_i = 0; b_i < i; b_i++) {
        y[b_i] = sin(A[b_i]);
      }
    } else {
    #pragma omp parallel for num_threads(omp_get_max_threads())
      for (b_i = 0; b_i < i; b_i++) {
        y[b_i] = sin(A[b_i]);
      }
    }

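The versioning pattern can also be reproduced in a small standalone C function. This is a hand-written sketch, not generated code: the name scale_versioned and the macro PARALLEL_THRESHOLD are hypothetical, the loop body is a simple scaling chosen to keep the example free of library calls, and the value 800 is copied from the loopVersion excerpt rather than being a documented constant.

```c
/* Hypothetical cutoff copied from the loopVersion excerpt. The code
 * generator picks its own value using internal heuristics. */
#define PARALLEL_THRESHOLD 800

/* Writes y[i] = 2 * A[i] for i in [0, n). Both branches compute the
 * same result; only the execution strategy differs. */
void scale_versioned(const double *A, double *y, int n)
{
  int i;

  if (n < PARALLEL_THRESHOLD) {
    /* Serial version, selected for small iteration counts. */
    for (i = 0; i < n; i++) {
      y[i] = 2.0 * A[i];
    }
  } else {
    /* Parallel version: the index range is divided among threads.
     * Without OpenMP support the pragma is ignored and this branch
     * also runs serially. */
#pragma omp parallel for
    for (i = 0; i < n; i++) {
      y[i] = 2.0 * A[i];
    }
  }
}
```

Calling scale_versioned with n equal to 10 takes the serial branch, and calling it with n equal to 1000 takes the parallel branch. The output values are the same either way, which is the property that makes versioning safe.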
Code Generation Report and Code Insights
To view the generated C/C++ code for the MATLAB function explicitLoop shown earlier, open the code generation report. In the Code pane of the report, the line numbers highlighted in green next to the for-loop mark the part of the code that is parallelized.
Generated Code
In the generated code, the OpenMP pragma statement before the for-loop indicates the parallelization of the for-loop.

    void explicitLoop(const double a[10000], const double b[10000],
                      double out[10000])
    {
      double d;
      int i;
      if (!isInitialized_explicitLoop) {
        explicitLoop_initialize();
      }
    #pragma omp parallel for num_threads(omp_get_max_threads()) private(d)

      for (i = 0; i < 10000; i++) {
        d = a[i];
        if (d > 1000.0) {
          out[i] = d - b[i];
        } else {
          out[i] = d + b[i];
        }
      }
    }

Code Insights
In the Code Insights tab, under Automatic Parallelization, you can see detailed information about the for-loops that are not parallelized or versioned in the generated code.
For example, regenerate code for the explicitLoop function defined earlier by specifying a smaller size for the input arguments.
codegen explicitLoop -args {1:100, 1:100} -config:lib -report
In this case, the for-loop is not parallelized because parallelizing it would not improve execution time. To view such code insights, open the code generation report and click Code Insights > Automatic Parallelization.
Control Parallelization of for-loops
You can disable automatic parallelization of for-loops if the loop performs better in serial execution.
Disable parallelization of all for-loops
You cannot disable the parallelization of parfor-loops or of loops that are immediately preceded by coder.loop.parallelize("loopID").
To disable automatic parallelization of all for-loops:
- In the MATLAB Coder app, in the More Settings > Speed pane, clear the Enable automatic parallelization setting.
- In the MATLAB Command Window, set the code configuration option EnableAutoParallelization to false.
Disable parallelization of specific for-loops
To prevent parallelization of a specific for-loop, place coder.loop.parallelize("never") immediately before the loop in the MATLAB code. This overrides the EnableAutoParallelization setting.
For example, the code generator does not parallelize this loop:

    coder.loop.parallelize("never");
    for i = 1:n
        y(i) = y(i)*sin(i);
    end

Enable parallelization of specific for-loops
To parallelize specific for-loops, place coder.loop.parallelize("loopID") immediately before the for-loop in the MATLAB code. This overrides the EnableAutoParallelization setting.
For example, this for-loop is always parallelized in the generated code.

    coder.loop.parallelize("i");
    for i = 1:100
        out1(i) = out1(i)*i;
    end

For more information, see coder.loop.parallelize.
Usage Notes and Limitations
- In case of nested for-loops, MATLAB Coder parallelizes the outermost for-loop and vectorizes the innermost for-loop.
- for-loops that contain parfor-loops are not parallelized.
- Automatic parallelization does not support for-loops whose bodies contain either persistent variables or calls to functions that access persistent variables.
- Automatic parallelization does not support for-loops in your code that contain calls to external functions.
- while-loops are not parallelized.
- Hardware targets with a single core or NumberOfCpuThreads set to 1 are not automatically parallelized.
- If OpenMP is not supported on target hardware or if the coder.CodeConfig object property EnableOpenMP is set to false, then no for-loop is parallelized.
- If a single-level for-loop can be vectorized and parallelized, then it is vectorized.
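The note about nested for-loops can be pictured with a hand-written C sketch. It is not MATLAB Coder output: the function name add_matrices_sketch and the contiguous row-by-row storage are assumptions made for illustration. The OpenMP pragma is placed on the outer loop only, and the inner loop is left as a simple unit-stride loop of the kind that lends itself to vectorization.

```c
/* Hypothetical example: adds two rows-by-cols matrices that are
 * stored contiguously, one row after another (an assumption of this
 * sketch, not a statement about MATLAB Coder array layout). */
void add_matrices_sketch(const double *a, const double *b, double *out,
                         int rows, int cols)
{
  int r;

  /* Outer loop: iterations are independent of each other, so they
   * can be divided among threads. */
#pragma omp parallel for
  for (r = 0; r < rows; r++) {
    int c; /* declared inside the loop body, so each thread has its own copy */

    /* Inner loop: consecutive memory accesses, a natural candidate
     * for SIMD vectorization. */
    for (c = 0; c < cols; c++) {
      out[r * cols + c] = a[r * cols + c] + b[r * cols + c];
    }
  }
}
```

Parallelizing only the outer loop starts the thread team once per call instead of once per row, while each thread still works through a contiguous run of elements.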
See Also
parfor | coder.loop.parallelize | coder.config | coder.MexCodeConfig | coder.CodeConfig | coder.EmbeddedCodeConfig