Reduction Operations Supported for Automatic Parallelization of for-loops - MATLAB & Simulink

The code generator automatically parallelizes for-loops by converting implicit and explicit sequential for-loop code blocks into parallelized code blocks. Parallelization of a section of code might significantly improve the execution speed of the generated code. See How parfor-Loops Improve Execution Speed.

Parallelize `for`-loops Performing Reduction Operations

You can parallelize for-loops performing reduction operations by using the configuration option Optimize reductions.

To enable automatic parallelization of these for-loops:

1. Open the MATLAB® Coder™ app.
2. On the Generate Code page, click More Settings.
3. On the Speed tab, select the Enable automatic parallelization and Optimize reductions check boxes.

Optimize reductions is also enabled if you set the Leverage target hardware instruction set extensions parameter to an instruction set that your processor supports.

To enable the configuration option OptimizeReductions by using the command-line interface, run these commands.

cfg = coder.config('lib');
cfg.EnableAutoParallelization = true;
cfg.OptimizeReductions = true;

For example, write a MATLAB function arraySum that adds to the reduction variable sum each element of the array in1 that is smaller than its index, and returns out, which combines sum with the mean of the array c.

function out = arraySum(in1,a,b)
    sum = 0;
    c = zeros(numel(in1),1);
    for i2 = 1:numel(in1)
        if i2 > in1(i2)
            sum = sum + in1(i2);
            c(i2) = a(i2) + b(i2);
        end
    end
    out = sum + mean(c);
end

At the MATLAB command line, run this codegen command.

arr = 1:1000;
codegen arraySum -config cfg -args {arr,arr,arr} -report

Code generation successful: View report

Open the code generation report to see the parallelized for-loop that performs the addition operation.

sum = 0.0;
#pragma omp parallel num_threads(omp_get_max_threads()) private(sumPrime, d)
{
  sumPrime = 0.0;
#pragma omp for nowait
  for (i2 = 0; i2 < 1000; i2++) {
    c[i2] = 0.0;
    d = in1[i2];
    if ((double)i2 + 1.0 > d) {
      sumPrime += d;
      c[i2] = a[i2] + b[i2];
    }
  }
  omp_set_nest_lock(&autoparExample_nestLockGlobal);
  {
    sum += sumPrime;
  }
  omp_unset_nest_lock(&autoparExample_nestLockGlobal);
}
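The generated code above follows a partial-sum pattern: each thread accumulates into a private variable (sumPrime) and adds it to the shared sum once, under a lock. The following plain C sketch mimics that pattern without OpenMP by walking over chunks one after another. The function names, the chunking scheme, and the num_chunks parameter are illustrative assumptions and do not appear in the generated code.

```c
#include <stddef.h>

/* Sequential reference: add in1[i] when the 1-based index exceeds the value. */
double conditional_sum_sequential(const double *in1, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if ((double)i + 1.0 > in1[i]) {
            sum += in1[i];
        }
    }
    return sum;
}

/* Chunked version: each chunk stands in for one thread's share of iterations
 * (hypothetical partitioning; the OpenMP runtime chooses its own schedule). */
double conditional_sum_chunked(const double *in1, size_t n, size_t num_chunks)
{
    double sum = 0.0;
    if (num_chunks == 0) {
        num_chunks = 1;
    }
    for (size_t k = 0; k < num_chunks; k++) {
        size_t begin = k * n / num_chunks;
        size_t end = (k + 1) * n / num_chunks;
        double sum_prime = 0.0; /* private partial result, like sumPrime */
        for (size_t i = begin; i < end; i++) {
            if ((double)i + 1.0 > in1[i]) {
                sum_prime += in1[i];
            }
        }
        sum += sum_prime; /* the generated code guards this step with a lock */
    }
    return sum;
}
```

For small integer-valued inputs both functions return the same value. With general floating-point data the two results can differ in the last bits, because the additions are grouped differently.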

MATLAB Functions Supported for Reduction Operations

A reduction operation reduces specific dimensions of an input to a scalar value. A reduction operation must be associative and commutative. This table lists the MATLAB functions that are supported as reduction operations and are parallelized in generated code, where X is the reduction variable and expr is a MATLAB expression. The reduction variable X can appear on both sides of an assignment statement.

MATLAB Function	Usage Notes
plus	For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X + expr
minus	For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X - expr
times	For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = X .* expr
max	Example: X = max(X,expr)
min	Example: X = min(X,expr)
sum	For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = sum(X)
prod	For integer data types, the Saturate on integer overflow (SaturateOnIntegerOverflow) property must be disabled. Example: X = prod(X)
or	Example: X = X | expr
and	Example: X = X & expr
bitand	Example: X = bitand(X,expr)
bitor	Example: X = bitor(X,expr)
bitxor	Example: X = bitxor(X,expr)

Note

The Support nonfinite numbers (SupportNonFinite) property supports code generation only for standalone libraries (lib, dll) and executables.

The following example shows a typical usage of a reduction variable X.

X = 0; % Initialize X
for i = 1:n
    X = X + d(i);
end

This loop is equivalent to the following, where you calculate each d(i) in a different iteration.

X = X + d(1) + ... + d(n)
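Because a supported reduction operation is associative and commutative, the partial results of such a loop can be combined in any order. The C sketch below illustrates this for max and bitwise XOR on 32-bit integers by reducing two halves of an array separately and combining the second half first. The helper names and the function-pointer interface are hypothetical and serve only this illustration; they say nothing about how the code generator implements reductions.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*reduce_op)(int32_t, int32_t);

int32_t op_max(int32_t a, int32_t b) { return a > b ? a : b; }
int32_t op_xor(int32_t a, int32_t b) { return a ^ b; }

/* Sequential reduction: X = op(X, d[i]) for every i, starting from identity. */
int32_t reduce_sequential(reduce_op op, int32_t identity,
                          const int32_t *d, size_t n)
{
    int32_t x = identity;
    for (size_t i = 0; i < n; i++) {
        x = op(x, d[i]);
    }
    return x;
}

/* Reduce each half on its own, then combine the partial results in the
 * opposite order to the sequential loop. */
int32_t reduce_two_halves(reduce_op op, int32_t identity,
                          const int32_t *d, size_t n)
{
    size_t mid = n / 2;
    int32_t first = reduce_sequential(op, identity, d, mid);
    int32_t second = reduce_sequential(op, identity, d + mid, n - mid);
    return op(second, first);
}
```

Each partial reduction starts from the identity element of the operation (INT32_MIN for max, 0 for XOR), so that combining the partial results does not count the initial value twice.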

Handling Overflow in Automatic Parallelization of `for`-loops

When you enable automatic parallelization of for-loops and reduction optimization, the generated parallel C/C++ code might produce results that differ from those of the sequential MATLAB code because of overflow. Therefore, when such overflow is possible, the code generator does not parallelize the loop.

The following case shows where significant overflow can occur, along with its description and the corresponding workaround.

Integer overflow

Example:

function out = integerOverflow(in)
    out = int8(0);
    for i = 1:numel(in)
        out = out + in(i);
    end
end

integerOverflow(int8(1:100))

ans =
  int8
   127

Description: Automatic parallelization of reduction-based for-loops that perform arithmetic operations on integers is not supported when the SaturateOnIntegerOverflow parameter is enabled. During parallel execution, the reduction operations are distributed among multiple threads. When the partial results are accumulated at the end, the results might be non-deterministic. Therefore, the code generator does not automatically parallelize the for-loop. For example:

(126 - 125) + 122 = 1 + 122 = 123
(126 + 122) - 125 = 127 (saturation) - 125 = 2

Workaround: If appropriate for your application, disable the Saturate on integer overflow (SaturateOnIntegerOverflow) property to automatically parallelize for-loops.
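The order dependence described above can be reproduced with a small C model of saturating int8 addition. This is a minimal sketch that assumes saturation simply clamps the exact result to the int8 range; sat_add_int8 and sat_sum_1_to_n are hypothetical helper names, not functions emitted by the code generator.

```c
#include <stdint.h>

/* Saturating int8 addition: compute exactly in int, then clamp to [-128, 127]. */
int8_t sat_add_int8(int8_t a, int8_t b)
{
    int s = (int)a + (int)b;
    if (s > INT8_MAX) {
        return INT8_MAX;
    }
    if (s < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)s;
}

/* Saturating sum of 1..n accumulated in an int8, modelled on integerOverflow.
 * Assumes 1 <= n <= 127 so that every addend fits in int8. */
int8_t sat_sum_1_to_n(int n)
{
    int8_t out = 0;
    for (int i = 1; i <= n; i++) {
        out = sat_add_int8(out, (int8_t)i);
    }
    return out;
}
```

With these helpers, sat_add_int8(sat_add_int8(126, -125), 122) yields 123, whereas sat_add_int8(sat_add_int8(126, 122), -125) yields 2, matching the two groupings in the description. sat_sum_1_to_n(100) yields 127, the saturated result of the integerOverflow example.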

Usage Notes and Limitations

for-loops containing calls to C/C++ functions using coder.ceval are not automatically parallelized.
Bitwise reduction operations (bitand, bitor, and bitxor) are only supported for integer data types.
Custom reduction operations such as a = foo(a,b) are not supported for automatic parallelization of for-loops.
Reduction operations on floating-point numbers are only approximately associative. To get deterministic behavior of a parallel execution, the reduction operations involved must be associative. To be associative, a function f must satisfy the following for all a, b, and c.
f(a,f(b,c)) = f(f(a,b),c)
When working with floating-point numbers, different parallel executions of a loop might produce results with different round-off errors. If such round-off errors are unacceptable to your application, use the pragma coder.loop.parallelize('never') to instruct the code generator to not automatically parallelize specific for-loops. For more information on potential differences during code generation, see Differences Between Generated Code and MATLAB Code.
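The associativity condition above fails for floating-point addition, as a short C sketch can show. It assumes IEEE 754 double precision and a compiler that does not reassociate floating-point operations (for example, no fast-math option); the function names are illustrative.

```c
/* Grouping used by a left-to-right sequential loop: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Alternative grouping that a parallel reduction might produce: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

Under those assumptions, sum_left(1e20, -1e20, 1.0) returns 1.0, while sum_right(1e20, -1e20, 1.0) returns 0.0, because adding 1.0 to -1e20 is lost to rounding before the large terms cancel.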

