Nested parfor and for-Loops and Other parfor Requirements - MATLAB & Simulink
Nested parfor-Loops
You cannot use a parfor-loop inside another parfor-loop. As an example, the following nesting of parfor-loops is not allowed:
    parfor i = 1:10
        parfor j = 1:5
            ...
        end
    end
Tip
You cannot nest parfor directly within another parfor-loop. A parfor-loop can call a function that contains a parfor-loop, but you do not get any additional parallelism.
Code Analyzer in the MATLAB® Editor flags the use of parfor inside another parfor-loop.
You cannot nest parfor-loops because parallelization can be performed at only one level. Therefore, choose which loop to run in parallel, and convert the other loop to a for-loop.
Consider the following performance issues when dealing with nested loops:
- Parallel processing incurs overhead. Generally, you should run the outer loop in parallel, because overhead only occurs once. If you run the inner loop in parallel, then each of the multiple parfor executions incurs an overhead. See Convert Nested for-Loops to parfor-Loops for an example of how to measure parallel overhead.
- Make sure that the number of iterations exceeds the number of workers. Otherwise, you do not use all available workers.
- Try to balance the parfor-loop iteration times. parfor tries to compensate for some load imbalance.
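When the outer loop has fewer iterations than the pool has workers, one possible workaround is to flatten both loops into a single parfor-loop over a linear index. The following is a minimal sketch, not a method prescribed by this page: the sizes n and m and the expression a + 10*b are hypothetical placeholders for your own loop bounds and per-element work.

```matlab
% Hypothetical sizes: only 3 outer iterations, likely fewer than the workers.
n = 3;
m = 50;
X = zeros(n, m);

% One linear index k covers all n*m (a, b) pairs, so parfor has
% n*m iterations to distribute instead of only n.
parfor k = 1:n*m
    [a, b] = ind2sub([n, m], k);   % recover the row and column subscripts
    X(k) = a + 10*b;               % placeholder for the real work fun(a, b)
end
```

Because X(k) uses linear indexing, element X(k) is the same as X(a,b), and X remains a sliced output variable.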
Tip
Always run the outermost loop in parallel, because you reduce parallel overhead.
You can also use a function that uses parfor and embed it in a parfor-loop. Parallelization occurs only at the outer level. In the following example, call a function MyFun.m inside the outer parfor-loop. The inner parfor-loop embedded in MyFun.m runs sequentially, not in parallel.
    parfor i = 1:10
        MyFun(i)
    end

    function MyFun(i)
        parfor j = 1:5
            ...
        end
    end
Tip
Nested parfor-loops generally give you no computational benefit.
Convert Nested for-Loops to parfor-Loops
A typical use of nested loops is to step through an array using one loop variable to index one dimension and a nested loop variable to index another dimension. The basic form is:
    X = zeros(n,m);
    for a = 1:n
        for b = 1:m
            X(a,b) = fun(a,b);
        end
    end
The following code shows a simple example. Use tic and toc to measure the computing time needed.
    A = 100;
    tic
    for i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    toc
Elapsed time is 49.376732 seconds.
You can parallelize either of the nested loops, but you cannot run both in parallel. The reason is that the workers in a parallel pool cannot start or access further parallel pools.
If the loop counted by i is converted to a parfor-loop, then each worker in the pool executes the nested loops using the j loop counter. The j loops themselves cannot run as a parfor on each worker.
Because parallel processing incurs overhead, you must choose carefully whether you want to convert either the inner or the outer for-loop to a parfor-loop. The following example shows how to measure the parallel overhead.
First convert only the outer for-loop to a parfor-loop. Use tic and toc to measure the computing time needed. Use ticBytes and tocBytes to measure how much data is transferred to and from the workers in the parallel pool.
Run the new code, and run it again. The first run is slower than subsequent runs, because the parallel pool takes some time to start and make the code available to the workers.
    A = 100;
    tic
    ticBytes(gcp);
    parfor i = 1:100
        for j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
    1        32984                 24512
    2        33784                 25312
    3        33784                 25312
    4        34584                 26112
    Total    1.3514e+05            1.0125e+05
Elapsed time is 14.130674 seconds.
Next convert only the inner loop to a parfor-loop. Measure the time needed and data transferred as in the previous case.
    A = 100;
    tic
    ticBytes(gcp);
    for i = 1:100
        parfor j = 1:100
            a(i,j) = max(abs(eig(rand(A))));
        end
    end
    tocBytes(gcp)
    toc

             BytesSentToWorkers    BytesReceivedFromWorkers
             __________________    ________________________
    1        1.3496e+06            5.487e+05
    2        1.3496e+06            5.4858e+05
    3        1.3677e+06            5.6034e+05
    4        1.3476e+06            5.4717e+05
    Total    5.4144e+06            2.2048e+06
Elapsed time is 48.631737 seconds.
If you convert the inner loop to a parfor-loop, both the time and amount of data transferred are much greater than in the parallel outer loop. In this case, the elapsed time is almost the same as in the nested for-loop example. The speedup is smaller than running the outer loop in parallel, because you have more data transfer and thus more parallel overhead. Therefore, if you execute the _inner_ loop in parallel, you get no computational benefit compared to running the serial for-loop.
If you want to reduce parallel overhead and speed up your computation, run the outer loop in parallel.
If you convert the inner loop instead, then each iteration of the outer loop initiates a separate parfor-loop. That is, the inner loop conversion creates 100 parfor-loops. Each of the multiple parfor executions incurs overhead. If you want to reduce parallel overhead, you should run the outer loop in parallel instead, because overhead only occurs once.
Tip
If you want to speed up your code, always run the outer loop in parallel, because you reduce parallel overhead.
Nested for-Loops: Requirements and Limitations
If you want to convert a nested for-loop to a parfor-loop, you must ensure that your loop variables are properly classified; see Troubleshoot Variables in parfor-Loops. If your code does not adhere to the guidelines and restrictions labeled as Required, you get an error. MATLAB catches some of these errors at the time it reads the code. These errors are labeled as Required (static).
Required (static): You must define the range of a for-loop nested in a parfor-loop by constant numbers or broadcast variables. |
---|
In the following example, the code on the left does not work because you define the upper limit of the for-loop by a function call. The code on the right provides a workaround by first defining a broadcast or constant variable outside the parfor-loop:
Invalid | Valid |
---|---|
A = zeros(100, 200); parfor i = 1:size(A, 1) for j = 1:size(A, 2) A(i, j) = i + j; end end | A = zeros(100, 200); n = size(A, 2); parfor i = 1:size(A,1) for j = 1:n A(i, j) = i + j; end end |
Required (static): The index variable for the nested for-loop must never be explicitly assigned other than by its for statement. |
---|
Following this restriction is required. If the nested for-loop variable is changed anywhere in a parfor-loop other than by its for statement, the region indexed by the for-loop variable is not guaranteed to be available at each worker.
The code on the left is not valid because it tries to modify the value of the nested for-loop variable j in the body of the loop. The code on the right provides a workaround by assigning the nested for-loop variable to a temporary variable t, and then updating t.
Invalid | Valid |
---|---|
A = zeros(10); parfor i = 1:10 for j = 1:10 A(i, j) = 1; j = j+1; end end | A = zeros(10); parfor i = 1:10 for j = 1:10 A(i, j) = 1; t = j; t = t + 1; end end |
Required (static): You cannot index or subscript a nested for-loop variable. |
---|
Following this restriction is required. If a nested for-loop variable is indexed, iterations are not guaranteed to be independent.
The example on the left is invalid because it attempts to index the nested for-loop variable j. The example on the right removes this indexing.
Invalid | Valid |
---|---|
A = zeros(10); parfor i = 1:10 for j = 1:10 j(1); end end | A = zeros(10); parfor i = 1:10 for j = 1:10 j; end end |
Required (static): When using the nested for-loop variable for indexing a sliced array, you must use the variable in plain form, not as part of an expression. |
---|
For example, the following code on the left does not work, but the code on the right does:
Invalid | Valid |
---|---|
A = zeros(4, 11); parfor i = 1:4 for j = 1:10 A(i, j + 1) = i + j; end end | A = zeros(4, 11); parfor i = 1:4 for j = 2:11 A(i, j) = i + j - 1; end end |
Required (static): If you use a nested for-loop to index into a sliced array, you cannot use that array elsewhere in the parfor-loop. |
---|
In the following example, the code on the left does not work because A is sliced and indexed inside the nested for-loop. The code on the right works because v is assigned to A outside of the nested loop:
Invalid | Valid |
---|---|
A = zeros(4, 10); parfor i = 1:4 for j = 1:10 A(i, j) = i + j; end disp(A(i, j)) end | A = zeros(4, 10); parfor i = 1:4 v = zeros(1, 10); for j = 1:10 v(j) = i + j; end disp(v(j)) A(i, :) = v; end |
parfor-Loop Limitations
Nested Functions
The body of a parfor-loop cannot reference a nested function. However, it can call a nested function by a function handle. Try the following example. Note that A(idx) = nfcn(idx) in the parfor-loop does not work. You must use feval to invoke the fcn handle in the parfor-loop body.
    function A = pfeg
        function out = nfcn(in)
            out = 1 + in;
        end
        fcn = @nfcn;
        parfor idx = 1:10
            A(idx) = feval(fcn, idx);
        end
    end

    pfeg
    Starting parallel pool (parpool) using the 'Processes' profile ... connected to 4 workers.
    ans =
         2     3     4     5     6     7     8     9    10    11
Tip
If you use function handles that refer to nested functions inside a parfor-loop, then the values of externally scoped variables are not synchronized among the workers.
Nested parfor-Loops
The body of a parfor-loop cannot contain a parfor-loop. For more information, see Nested parfor-Loops.
Nested spmd Statements
The body of a parfor-loop cannot contain an spmd statement, and an spmd statement cannot contain a parfor-loop. The reason is that workers cannot start or access further parallel pools.
break and return Statements
The body of a parfor-loop cannot contain break or return statements. Consider parfeval or parfevalOnAll instead, because you can use cancel on them.
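As a rough illustration of the parfeval alternative, the following sketch stops early once a result of interest arrives. It assumes a parallel pool is available (gcp starts one with the default profile if needed). The names target, candidates, and found are hypothetical and do not come from this page.

```matlab
% Assumption: a parallel pool is available; gcp starts one if needed.
pool = gcp;
target = 7;                          % hypothetical value to search for
candidates = 1:20;                   % hypothetical inputs

% Preallocate the futures, then submit one asynchronous task per candidate.
futures(1:numel(candidates)) = parallel.FevalFuture;
for k = 1:numel(candidates)
    futures(k) = parfeval(pool, @(x) x == target, 1, candidates(k));
end

% Collect results in completion order and stop at the first hit.
found = [];
for k = 1:numel(futures)
    [idx, isHit] = fetchNext(futures);
    if isHit
        found = candidates(idx);
        break                        % allowed: this is an ordinary for-loop
    end
end

cancel(futures);                     % stop tasks that are still queued or running
```

The early exit happens in the client-side for-loop that collects results, so no break is needed inside parallel code. Calling cancel on futures that have already finished has no effect.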
Global and Persistent Variables
The body of a parfor-loop cannot contain global or persistent variable declarations. The reason is that these variables are not synchronized between workers. You can use global or persistent variables within functions, but their value is visible only to the worker that creates them. Instead of global variables, it is a better practice to use function arguments to share values.
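A minimal sketch of sharing a value through function arguments rather than a global variable might look like the following. The function names scaleAll and scaleValue and the setting scale are hypothetical; save the code as scaleAll.m.

```matlab
function result = scaleAll(scale)
    % scale is a hypothetical setting that might otherwise be a global.
    result = zeros(1, 8);
    parfor i = 1:8
        % scale is a broadcast variable: each worker receives its own copy.
        result(i) = scaleValue(i, scale);
    end
end

function y = scaleValue(x, scale)
    % The setting arrives as an argument, so no global state is needed.
    y = scale * x;
end
```

For example, scaleAll(2.5) returns 2.5 times the vector 1:8, regardless of which worker runs each iteration.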
To learn more about variable requirements, see Troubleshoot Variables in parfor-Loops.
Scripts
If a script introduces a variable, you cannot call this script from within a parfor-loop or spmd statement. The reason is that this script would cause a transparency violation. For more details, see Ensure Transparency in parfor-Loops or spmd Statements.
Anonymous Functions
You can define an anonymous function inside the body of a parfor-loop. However, sliced output variables inside anonymous functions are not supported. You can work around this by using a temporary variable for the sliced variable, as shown in the following example.
    x = 1:10;
    parfor i=1:10
        temp = x(i);
        anonymousFunction = @() 2*temp;
        x(i) = anonymousFunction() + i;
    end
    disp(x);
For more information on sliced variables, see Sliced Variables.
inputname Functions
Using inputname to return the workspace variable name corresponding to an argument number is not supported inside parfor-loops. The reason is that parfor workers do not have access to the workspace of the MATLAB desktop. To work around this, call inputname before parfor, as shown in the following example.
    a = 'a';
    myFunction(a)

    function X = myFunction(a)
        name = inputname(1);
        parfor i=1:2
            X(i).(name) = i;
        end
    end
load Functions
The syntaxes of load that do not assign to an output structure are not supported inside parfor-loops. Inside parfor, always assign the output of load to a structure.
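The following sketch shows the supported form of load inside a parfor-loop. The file names and the variable value are hypothetical, and the sketch assumes the workers can read the client's temporary folder, which is typically the case for a local pool.

```matlab
% Create two small MAT-files so the sketch is self-contained.
files = {fullfile(tempdir, 'parforLoadDemo1.mat'), ...
         fullfile(tempdir, 'parforLoadDemo2.mat')};
for k = 1:numel(files)
    value = 10 * k;
    save(files{k}, 'value');
end

total = zeros(1, numel(files));
parfor k = 1:numel(files)
    s = load(files{k});      % output assigned to a structure: supported
    total(k) = s.value;      % read the loaded variable from the structure
    % load(files{k});        % no output structure: not supported in parfor
end

delete(files{:});            % remove the temporary files
```

Assigning the output to s keeps every variable that the loop body uses visible in the code itself, instead of having load create variables in the workspace at run time.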
nargin or nargout Functions
The following uses are not supported inside parfor-loops:
- Using nargin or nargout without a function argument
- Using narginchk or nargoutchk to validate the number of input or output arguments in a call to the function that is currently executing
The reason is that workers do not have access to the workspace of the MATLAB desktop. To work around this, call these functions before parfor, as shown in the following example.
    myFunction('a','b')

    function X = myFunction(a,b)
        nin = nargin;
        parfor i=1:2
            X(i) = i*nin;
        end
    end
P-Code Scripts
You can call P-code script files from within a parfor-loop, but P-code scripts cannot contain a parfor-loop. To work around this, use a P-code function instead of a P-code script.
See Also
parfor | parfeval | parfevalOnAll