Reduction Operations Supported for Automatic Parallelization of for-loops

The code generator automatically parallelizes for-loops by converting implicit and explicit sequential for-loop code blocks into parallelized code blocks. Parallelization of a section of code might significantly improve the execution speed of the generated code. See How parfor-Loops Improve Execution Speed.

Parallelize for-loops Performing Reduction Operations

You can parallelize for-loops performing reduction operations by using the configuration option Optimize reductions.

To enable automatic parallelization of these for-loops:

  1. Open the MATLAB® Coder™ app.
  2. On the Generate Code page, click More Settings.
  3. On the Speed tab, select the Enable automatic parallelization and Optimize reductions check boxes.

Optimize reductions is also enabled if you set the Leverage target hardware instruction set extensions parameter to an instruction set that your processor supports.

To enable the configuration option OptimizeReductions by using the command-line interface, run these commands.

cfg = coder.config('lib');
cfg.EnableAutoParallelization = true;
cfg.OptimizeReductions = true;

For example, write a MATLAB function arraySum that conditionally accumulates elements of the array in1 in the reduction variable sum, and returns out, which is sum plus the mean of c.

function out = arraySum(in1,a,b)
sum = 0;
c = zeros(numel(in1),1);
for i2 = 1:numel(in1)
    if i2 > in1(i2)
        sum = sum + in1(i2);
        c(i2) = a(i2) + b(i2);
    end
end
out = sum + mean(c);
end

At the MATLAB command line, run this codegen command.

arr = 1:1000;
codegen arraySum -config cfg -args {arr,arr,arr} -report

Code generation successful: View report

Open the code generation report to see the parallelized for-loop that performs the addition operation.

sum = 0.0;
#pragma omp parallel num_threads(omp_get_max_threads()) private(sumPrime, d)
{
  sumPrime = 0.0;
#pragma omp for nowait
  for (i2 = 0; i2 < 1000; i2++) {
    c[i2] = 0.0;
    d = in1[i2];
    if ((double)i2 + 1.0 > d) {
      sumPrime += d;
      c[i2] = a[i2] + b[i2];
    }
  }
  omp_set_nest_lock(&autoparExample_nestLockGlobal);
  {
    sum += sumPrime;
  }
  omp_unset_nest_lock(&autoparExample_nestLockGlobal);
}
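
In the generated code, each thread accumulates into a private partial sum (sumPrime) and then adds that partial sum to the shared variable sum while holding a lock. The following C sketch imitates the same pattern without OpenMP by walking the index range in consecutive chunks, where each chunk stands in for one thread. The function names and the chunking scheme are illustrative assumptions, not code that the code generator emits.

```c
#include <assert.h>

/* Sequential reference: add in1[i] when the 1-based index exceeds it,
   mirroring the condition i2 > in1(i2) in arraySum. */
double conditional_sum_sequential(const double *in1, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        if ((double)i + 1.0 > in1[i]) {
            sum += in1[i];
        }
    }
    return sum;
}

/* Chunked version (hypothetical helper): each chunk plays the role of one
   thread, with its own private partial sum that is combined at the end. */
double conditional_sum_chunked(const double *in1, int n, int num_chunks)
{
    assert(num_chunks > 0);
    double sum = 0.0;
    for (int c = 0; c < num_chunks; c++) {
        int begin = (int)((long)c * n / num_chunks);
        int end = (int)((long)(c + 1) * n / num_chunks);
        double sum_prime = 0.0; /* private partial sum for this chunk */
        for (int i = begin; i < end; i++) {
            if ((double)i + 1.0 > in1[i]) {
                sum_prime += in1[i];
            }
        }
        sum += sum_prime; /* combine step; the generated code does this under a lock */
    }
    return sum;
}
```

For inputs whose partial sums are exactly representable, such as small integer values stored as doubles, both functions return the same result. For general floating-point data, regrouping the additions can change rounding in the last digits.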

MATLAB Functions Supported for Reduction Operations

A reduction operation reduces specific dimensions of an input to a scalar value. A reduction operation must be associative and commutative. This table lists the MATLAB functions that are supported as reduction operations and are parallelized in generated code, where X is the reduction variable and expr is a MATLAB expression. The reduction variable X can appear on both sides of an assignment statement.

MATLAB Function | Usage Notes
plus | For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X + expr
minus | For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X - expr
times | For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X .* expr
max | Example: X = max(X,expr)
min | Example: X = min(X,expr)
sum | For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = sum(X)
prod | For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = prod(X)
or | Example: X = X | expr
and | Example: X = X & expr
bitand | Example: X = bitand(X,expr)
bitor | Example: X = bitor(X,expr)
bitxor | Example: X = bitxor(X,expr)
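
Subtraction is not associative by itself, yet a loop of the form X = X - expr still behaves as a reduction: the final value equals the initial X minus the sum of all subtracted terms, and that sum can be split into partial sums. The C sketch below illustrates this reasoning. The helper names are hypothetical, and it is an assumption, not a documented detail, that generated code combines partial results in this way.

```c
#include <assert.h>

/* Sequential form of the reduction X = X - d(i). */
long minus_reduce_sequential(long x0, const long *d, int n)
{
    long x = x0;
    for (int i = 0; i < n; i++) {
        x = x - d[i];
    }
    return x;
}

/* Chunked form (hypothetical helper): sum the subtracted terms per chunk,
   add the partial sums, and subtract the total from the initial value once. */
long minus_reduce_chunked(long x0, const long *d, int n, int num_chunks)
{
    assert(num_chunks > 0);
    long total = 0;
    for (int c = 0; c < num_chunks; c++) {
        int begin = (int)((long)c * n / num_chunks);
        int end = (int)((long)(c + 1) * n / num_chunks);
        long partial = 0; /* partial sum of the terms in this chunk */
        for (int i = begin; i < end; i++) {
            partial += d[i];
        }
        total += partial;
    }
    return x0 - total;
}
```

As long as no intermediate value overflows, both functions agree for any chunk count, which is why the order in which chunks finish does not matter.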

Note

The Support nonfinite numbers (SupportNonFinite) property supports code generation only for standalone libraries (lib, dll) and executables.

The following example shows a typical usage of a reduction variable X.

X = 0; % Initialize X
for i = 1:n
    X = X + d(i);
end

This loop is equivalent to the following, where you calculate each d(i) in a different iteration.

X = X + d(1) + ... + d(n)

Handling Overflow in Automatic Parallelization of for-loops

When you enable automatic parallelization of for-loops and reduction optimization, the generated parallel C/C++ code might produce results that differ from those of the sequential MATLAB code because of overflow. Therefore, when there is a possibility of such overflow, the code generator does not parallelize the loop.

The table shows the MATLAB functions where significant overflow can occur, along with their corresponding workarounds.

MATLAB Function: Integer overflow

function out = integerOverflow(in)
out = int8(0);
for i = 1:numel(in)
    out = out + in(i);
end
end

integerOverflow(int8(1:100))

ans =
  int8
   127

Description: Automatic parallelization of reduction-based for-loops that perform arithmetic operations on integers is not supported when the SaturateOnIntegerOverflow parameter is enabled. During parallel execution, the reduction operations are distributed among multiple threads. When the partial results are accumulated at the end, the results might be non-deterministic. Therefore, the code generator does not automatically parallelize the for-loop. For example, with int8 saturation:

(126 - 125) + 122 = 1 + 122 = 123
(126 + 122) - 125 = 127 (saturation) - 125 = 2

Workaround: If appropriate for your application, disable the Saturate on integer overflow (SaturateOnIntegerOverflow) property to automatically parallelize for-loops.
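
The saturation example can be checked with a short C sketch that emulates int8 saturating addition. The helper names sat_add_int8 and sat_sum_int8 are hypothetical and exist only for this illustration. The clamping is written to match the behavior described above and is not taken from generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Add two int8 values and clamp the result to [INT8_MIN, INT8_MAX],
   emulating saturating int8 arithmetic. */
int8_t sat_add_int8(int8_t a, int8_t b)
{
    int wide = (int)a + (int)b; /* compute in a wider type, then clamp */
    if (wide > INT8_MAX) {
        return INT8_MAX;
    }
    if (wide < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)wide;
}

/* Sequential saturating sum, following the integerOverflow loop. */
int8_t sat_sum_int8(const int8_t *in, int n)
{
    assert(n >= 0);
    int8_t out = 0;
    for (int i = 0; i < n; i++) {
        out = sat_add_int8(out, in[i]);
    }
    return out;
}
```

Evaluating sat_add_int8(sat_add_int8(126, -125), 122) gives 123, while sat_add_int8(sat_add_int8(126, 122), -125) gives 2. The same three operands produce different results depending on grouping, which is why a saturating reduction cannot safely be split across threads.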

Usage Notes and Limitations