Specify Maximum Number of Threads to Run Parallel for-Loops in the Generated Code

Using MATLAB® Coder™, you can specify the maximum number of threads used to run parallel for-loops in the generated C/C++ code. You can also cross-compile the code, that is, generate the code on the host hardware processor and execute it on the target hardware processor. You can choose the number of threads to suit the target hardware platform.

You can set the number of threads in the generated code in several ways. The table lists these options in precedence order, which is the order in which MATLAB Coder checks them when it sets the number of threads. If an option is left at its default value, MATLAB Coder moves on to the next option in the table.

If you do not set any of these options, the generated parallel code by default uses the maximum number of threads available on the target hardware at run time. The generated code obtains this value by calling the Open Multiprocessing (OpenMP) runtime function omp_get_max_threads.

| Precedence | Options to Set Number of Threads | Commands to Set Number of Threads |
|---|---|---|
| 1 | Parfor-loop with number of threads specified | parfor (i = 1:10, u) % u specifies the maximum number of threads |
| 2 | Configuration property (default value = 0): NumberOfCpuThreads | cfg.NumberOfCpuThreads = 8; |
| 3 | Target processor properties (default value = 1): NumberOfCores, NumberOfThreadsPerCore | processor.NumberOfCores = 4; processor.NumberOfThreadsPerCore = 2; |
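
As a rough illustration of the precedence rule, the following C sketch picks the first option that is not at its default value and otherwise falls back to the run-time maximum. It is not MATLAB Coder code. The function select_thread_limit, the sentinel constants, and the single processor_threads argument (standing in for whatever limit the target processor properties yield) are assumptions made for this example.

```c
#include <assert.h>

/* Sentinels based on the default values listed in the table.
 * PARFOR_UNSET is a hypothetical marker meaning "the parfor-loop
 * has no thread-count argument". */
enum { PARFOR_UNSET = 0, CFG_DEFAULT = 0, PROCESSOR_DEFAULT = 1 };

/*
 * Hypothetical helper: returns the thread limit from the first option
 * that is not at its default, or runtime_max if every option is default.
 */
int select_thread_limit(int parfor_limit, int cfg_threads,
                        int processor_threads, int runtime_max)
{
    assert(runtime_max >= 1);
    if (parfor_limit != PARFOR_UNSET) {
        return parfor_limit;        /* precedence 1: parfor argument */
    }
    if (cfg_threads != CFG_DEFAULT) {
        return cfg_threads;         /* precedence 2: NumberOfCpuThreads */
    }
    if (processor_threads != PROCESSOR_DEFAULT) {
        return processor_threads;   /* precedence 3: target processor */
    }
    return runtime_max;             /* no option set: run-time maximum */
}
```

For instance, select_thread_limit(6, 8, 1, 16) yields 6 because the parfor argument outranks the configuration property, while select_thread_limit(0, 0, 1, 16) yields the run-time maximum of 16.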

Specify Number of Threads

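To specify the maximum number of CPU threads at the command line, set the NumberOfCpuThreads property of the code configuration object, for example, cfg.NumberOfCpuThreads = 8.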

The maximum number of CPU threads that you configure applies to parfor-loops and automatically parallelized for-loops.

For example, consider these MATLAB functions parforExample and autoparExample.

The parforExample function uses a parfor-loop with the maximum number of threads set to 6.

    function y = parforExample(n) %#codegen
    y = ones(1,n);
    parfor (i = 1:n, 6)
        y(i) = 1;
    end
    end

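The autoparExample function uses a for-loop that the code generator can parallelize automatically. Automatic parallelization is controlled by the EnableAutomaticParallelization configuration property.

    function y = autoparExample(n) %#codegen
    y = ones(1,n);
    for i = 1:n
        y(i) = 1;
    end
    end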

Using the MATLAB functions previously specified, these examples show different ways of setting the number of threads in the generated parallel code.

Parfor-loop sets the maximum number of threads to 6.

Commands to generate code:

    n = 1000;
    cfg = coder.config('lib');
    cfg.NumberOfCpuThreads = 8;
    codegen -config cfg ...
        parforExample -args {n} -report

Generated code:

    #pragma omp parallel for num_threads(6 > omp_get_max_threads() ? omp_get_max_threads() : 6)
    for (i = 0; i <= ub_loop; i++) {
      y_data[i] = 1.0;
    }

Configuration property sets the maximum number of threads to 8.

Commands to generate code:

    n = 1000;
    cfg = coder.config('lib');
    cfg.NumberOfCpuThreads = 8;
    codegen -config cfg ...
        autoparExample -args {n} -report

Generated code:

    #pragma omp parallel for num_threads(8 > omp_get_max_threads() ? omp_get_max_threads() : 8)
    for (b_i = 0; b_i < i; b_i++) {
      y_data[b_i] = 1.0;
    }

The maximum number of threads is set to omp_get_max_threads().

Commands to generate code:

    n = 1000;
    cfg = coder.config('lib');
    codegen -config cfg ...
        autoparExample -args {n} -report

Generated code:

    #pragma omp parallel for num_threads(omp_get_max_threads())
    for (b_i = 0; b_i < i; b_i++) {
      y_data[b_i] = 1.0;
    }
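
The ternary expression inside num_threads caps the requested thread count at the value reported by the OpenMP runtime. The following hand-written C sketch, which is not output from MATLAB Coder, shows the same pattern in a form you can compile yourself. The names cap_threads, runtime_max_threads, and fill_ones are hypothetical, and the fallback to a single thread when the file is built without OpenMP support is an assumption of this example.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the OpenMP run-time maximum, or 1 when built without OpenMP. */
static int runtime_max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Hypothetical helper mirroring the ternary in the generated pragma:
 * never use more threads than the runtime reports as available. */
int cap_threads(int requested, int runtime_max)
{
    return requested > runtime_max ? runtime_max : requested;
}

/* Fills y[0..n-1] with 1.0 using at most `requested` threads. */
void fill_ones(double *y, int n, int requested)
{
    int i;
    int nthreads;

    assert(requested >= 1);
    nthreads = cap_threads(requested, runtime_max_threads());
    (void)nthreads; /* unused when the pragma is ignored (no OpenMP) */

#pragma omp parallel for num_threads(nthreads)
    for (i = 0; i < n; i++) {
        y[i] = 1.0;
    }
}
```

With a compiler that supports OpenMP (for example, the -fopenmp option in GCC), fill_ones(y, 1000, 6) runs the loop on at most six threads, and on fewer if the runtime reports a smaller maximum. Without OpenMP, the pragma is ignored and the loop runs serially.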

Create Custom Hardware Processor

To add a target processor:

  1. Create a copy of an existing target processor.
    processor = target.get('Processor', 'ARM Compatible-ARM Cortex-A');
  2. Update the number of cores, number of threads per core, and the name of the new processor.
    processor.NumberOfCores = 4;
    processor.NumberOfThreadsPerCore = 2;
    processor.Name = '4coreprocessor';
  3. Add the target.Processor object to an internal database.
    target.add(processor);
  4. Select the new processor as the target processor.
    cfg = coder.config('lib');
    cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->4coreprocessor';

In the MATLAB Coder app, you can choose the custom hardware processor that you created at the command line by using the target.get and target.add functions.

Alternatively, you can create a target processor by using the target.Processor and target.LanguageImplementation classes. For more information, see Register New Hardware Devices.

Target processor sets the maximum number of threads to 4.

Commands to generate code:

    n = 1000;
    cfg = coder.config('lib');
    cfg.HardwareImplementation.ProdHWDeviceType ...
        = "ARM Compatible->4coreprocessor";
    codegen -config cfg autoparExample -args {n} -report

Generated code:

    #pragma omp parallel for num_threads(4 > omp_get_max_threads() ? omp_get_max_threads() : 4)
    for (b_i = 0; b_i < i; b_i++) {
      y_data[b_i] = 1.0;
    }

Configuration property sets the maximum number of threads to 2.

Commands to generate code:

    n = 1000;
    cfg = coder.config('lib');
    cfg.NumberOfCpuThreads = 2;
    cfg.HardwareImplementation.ProdHWDeviceType ...
        = "ARM Compatible->4coreprocessor";
    codegen -config cfg autoparExample -args {n} -report

Generated code:

    #pragma omp parallel for num_threads(2 > omp_get_max_threads() ? omp_get_max_threads() : 2)
    for (b_i = 0; b_i < i; b_i++) {
      y_data[b_i] = 1.0;
    }

See Also

parfor | coder.config | coder.MexCodeConfig | coder.CodeConfig | coder.EmbeddedCodeConfig | target
