Automatic Parallelization of for-Loops in the Generated Code

MATLAB® Coder™ automatically parallelizes for-loops in generated C/C++ code by default, using the Open Multiprocessing (OpenMP) library. Automatic parallelization applies to explicit for-loops, implicit for-loops, and for-loops that perform reduction operations. For more information, see Reduction Operations Supported for Automatic Parallelization of for-loops.

To generate parallel C/C++ code, your compiler must support the OpenMP library. The Enable automatic parallelization option supports all build types (MEX, DLL, LIB, and EXE) of the coder.config function.

MATLAB Coder uses internal heuristics to determine whether a for-loop should be parallelized.
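As a rough illustration of what a parallelized reduction looks like at the OpenMP level, the following hand-written C sketch sums the squares of an array by using the OpenMP reduction clause. It is not output from MATLAB Coder. The function name sum_of_squares and the choice of operation are assumptions made for this example, and the code that the code generator actually emits may differ.

```c
/* Hand-written illustration of an OpenMP reduction.
 * This is NOT MATLAB Coder output; sum_of_squares is a hypothetical name. */
double sum_of_squares(const double *x, int n)
{
  double total = 0.0;
  int i;

  /* Each thread accumulates into a private copy of total.
   * OpenMP adds the private copies together when the loop finishes. */
#pragma omp parallel for reduction(+ : total)
  for (i = 0; i < n; i++) {
    total += x[i] * x[i];
  }
  return total;
}
```

If the compiler is invoked without OpenMP support, the pragma is ignored and the loop runs serially. Because floating-point addition is not associative, the parallel result can differ from the serial result in the last bits for general inputs.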

Parallelization of Explicit and Implicit for-loops

Automatic parallelization of for-loops supports both explicit and implicit for-loops. You can specify the Maximum number of CPU threads to run parallel for-loops in the generated C/C++ code.

For more information, see Specify Maximum Number of Threads to Run Parallel for-Loops in the Generated Code.
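One way to picture a thread limit at the OpenMP level is a num_threads clause whose argument is capped by the number of threads the runtime reports. The following hand-written C sketch shows that idea. It is not MATLAB Coder output, and the names pick_thread_count, scale_in_place, and requested_cap are hypothetical. How the code generator encodes the configured limit in its own output is not shown here.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: clamp a requested thread cap to the range
 * [1, number of threads the OpenMP runtime makes available]. */
int pick_thread_count(int requested_cap)
{
  int available = 1; /* serial fallback when OpenMP is not enabled */
#ifdef _OPENMP
  available = omp_get_max_threads();
#endif
  if (requested_cap < 1) {
    return 1;
  }
  return requested_cap < available ? requested_cap : available;
}

/* Hypothetical example loop: multiply every element of y by factor,
 * using at most requested_cap threads. */
void scale_in_place(double *y, int n, double factor, int requested_cap)
{
  int i;
  int nthreads = pick_thread_count(requested_cap);
  (void)nthreads; /* avoids an unused-variable warning without OpenMP */

#pragma omp parallel for num_threads(nthreads)
  for (i = 0; i < n; i++) {
    y[i] *= factor;
  }
}
```

Capping the request rather than passing it through unchanged keeps the loop from asking for more threads than the runtime would provide by default.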

Parallelization of explicit for-loops

Explicit for-loops are for-loops that are present in your MATLAB code. Elementwise operations on an array benefit from automatic parallelization. The table shows a MATLAB function with an explicit for-loop and the C code generated using automatic parallelization. To generate code, save the MATLAB function as explicitLoop.m in the current working directory and run the codegen command.

MATLAB Code

    % MATLAB code
    function out = explicitLoop(a, b)
    out = zeros(size(a));
    for i = 1:numel(a)
        if a(i) > 1000
            out(i) = a(i) - b(i);
        else
            out(i) = a(i) + b(i);
        end
    end
    end

    % C code generation command
    >> codegen explicitLoop -args {1:10000, 1:10000} -config:lib -report

Generated C Code

    #pragma omp parallel for num_threads(omp_get_max_threads()) private(d)
    for (i = 0; i < 10000; i++) {
      d = a[i];
      if (d > 1000.0) {
        out[i] = d - b[i];
      } else {
        out[i] = d + b[i];
      }
    }

Parallelization of implicit for-loops

Implicit for-loops are loops that are not written in the MATLAB code but result from MATLAB operations that the code generator translates to a for-loop in the generated C/C++ code. The table shows a MATLAB function with an implicit for-loop and the C code generated using automatic parallelization. To generate code, save the MATLAB code as implicitLoop.m in the current working directory and run the codegen command.

MATLAB Code

    % MATLAB code
    function [y] = implicitLoop(in)
    a = ones(10000,1) + in;
    y = [a a];
    end

    % C code generation command
    >> codegen implicitLoop -args {100} -config:lib -report

Generated C Code

    #pragma omp parallel for num_threads(omp_get_max_threads())
    for (i = 0; i < 10000; i++) {
      y[i] = in + 1.0;
      y[i + 10000] = in + 1.0;
    }

Loop Versioning

In the preceding examples, the loop bounds are compile-time constants. When the loop bounds are not known at compile time, the code generator produces both a serial and a parallel version of the for-loop. At run time, the generated code executes whichever version is more efficient for the actual number of loop iterations.

The table shows a MATLAB function loopVersion and the generated C code containing both serial and parallel versions of the for-loop.

MATLAB Code

    % MATLAB code
    function y = loopVersion(A, n)
    y = zeros(size(A));
    for i = 1:n
        y(i) = sin(A(i));
    end
    end

    % C code generation command
    >> codegen loopVersion -args {1:10000, 10000} -config:lib -report

Generated C Code

    if ((int)n < 800) {
      for (b_i = 0; b_i < i; b_i++) {
        y[b_i] = sin(A[b_i]);
      }
    } else {
    #pragma omp parallel for num_threads(omp_get_max_threads())
      for (b_i = 0; b_i < i; b_i++) {
        y[b_i] = sin(A[b_i]);
      }
    }

Code Generation Report and Code Insights

To view the generated C/C++ code for the above MATLAB function explicitLoop, open the code generation report. In the Code pane of the report, the line numbers highlighted in green next to the for-loop show the part of the code that is parallelized.

Generated C code for the function explicitLoop. Line numbers appear green next to the parallelized for-loop.

Generated Code

In the generated code, the OpenMP pragma statement before the for-loop indicates the parallelization of the for-loop.

    void explicitLoop(const double a[10000], const double b[10000],
                      double out[10000])
    {
      double d;
      int i;
      if (!isInitialized_explicitLoop) {
        explicitLoop_initialize();
      }
    #pragma omp parallel for num_threads(omp_get_max_threads()) private(d)
      for (i = 0; i < 10000; i++) {
        d = a[i];
        if (d > 1000.0) {
          out[i] = d - b[i];
        } else {
          out[i] = d + b[i];
        }
      }
    }

Code Insights

In the Code Insights tab, under Automatic Parallelization, you can see detailed information about the for-loops that are not parallelized or versioned in the generated code.

For example, regenerate code for the explicitLoop function defined earlier by specifying a smaller size for the input arguments.

codegen explicitLoop -args {1:100, 1:100} -config:lib -report

In this case, the for-loop is not parallelized because parallelization provides no execution-time benefit for so few iterations. To view such code insights, open the code generation report and click Code Insights > Automatic Parallelization.

Code Insights tab showing information about for-loops that are not parallelized.

Control Parallelization of for-loops

You can disable automatic parallelization of for-loops if the loop performs better in serial execution.

Disable parallelization of all for-loops

You cannot disable the parallelization of parfor-loops or of loops that are preceded by coder.loop.parallelize("loopID").

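To disable automatic parallelization of all for-loops, set the EnableAutoParallelization property of the code configuration object to false, or clear the Enable automatic parallelization option.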

Disable parallelization of specific for-loops

To prevent parallelization of a specific for-loop, place coder.loop.parallelize("never") immediately before the loop in the MATLAB code. This overrides the EnableAutoParallelization setting.

For example, the code generator does not parallelize this loop:

    coder.loop.parallelize("never");
    for i = 1:n
        y(i) = y(i)*sin(i);
    end

Enable parallelization of specific for-loops

To parallelize specific for-loops, place coder.loop.parallelize("loopID") immediately before the for-loop in the MATLAB code. This overrides the EnableAutoParallelization setting.

For example, this for-loop is always parallelized in the generated code.

    coder.loop.parallelize("i");
    for i = 1:100
        out1(i) = out1(i)*i;
    end

For more information, see coder.loop.parallelize.

Usage Notes and Limitations

See Also

parfor | coder.loop.parallelize | coder.config | coder.MexCodeConfig | coder.CodeConfig | coder.EmbeddedCodeConfig

Topics