dspunfold - Generates a multi-threaded MEX file from a MATLAB function - MATLAB

Generates a multi-threaded MEX file from a MATLAB function

Syntax

dspunfold file

dspunfold options file

Description

dspunfold [file](#buxgz0h-1-file) generates a multi-threaded MEX file from the entry-point MATLAB® function specified by file, using the unfolding technology. Unfolding is a technique to improve throughput through parallelization. The multi-threaded MEX file leverages the multicore CPU architecture of the host computer and can improve speed significantly. In addition to the multi-threaded MEX file, the function generates a single-threaded MEX file, a self-diagnostic analyzer function, and the corresponding help files.

dspunfold [options](#buxgz0h-1-options) [file](#buxgz0h-1-file) generates a multi-threaded MEX file from the entry-point MATLAB function specified by file, using the function arguments specified by options.

Note

This function requires a MATLAB Coder™ license.

Input Arguments

| Option | Values | Description | Examples |
| --- | --- | --- | --- |
| -args arguments | Cell array | Argument types for the entry-point MATLAB function, specified as a cell array. The cell array accepts numeric elements, the coder.typeof function, and the coder.Constant function. The generated multi-threaded MEX file is specialized to the size, class, and complexity of arguments. The number of elements in the cell array must be the same as the number of arguments that the entry-point MATLAB function expects. | `dspunfold fcn -args {ones(10,1), 5}`: dspunfold extracts the type (size, class, and complexity) information from the elements in the arguments cell array. fcn is the entry-point MATLAB function. `dspunfold fcn -args {coder.typeof(ones(10,1)), coder.typeof(5)}`: coder.typeof is used to specify the types of the fcn arguments. `dspunfold fcn -args {coder.Constant(ones(10,1)), coder.Constant(5)}`. `dspunfold fcn -args {}`: By default, arguments is {}. An empty cell array {} indicates that fcn accepts no input arguments. |
| -o output | Character vector | Name of the output multi-threaded MEX file, specified as a character vector. If no output name is specified, the name of the generated multi-threaded MEX file is inherited from the input MATLAB function with an '_mt' suffix. dspunfold also adds a platform-specific extension to this name. In addition, dspunfold generates a single-threaded MEX file with an '_st' suffix, and a test bench file with an '_analyzer' suffix. | No output name specified: `dspunfold fcn`. Files generated: fcn_mt.mexw64, fcn_st.mexw64, fcn_analyzer.p. Output name specified: `dspunfold fcn -o foo`. Files generated: foo.mexw64, foo_st.mexw64, foo_analyzer.p. |
| -s statelength | Scalar integer greater than or equal to zero, or auto | State length of the algorithm in the entry-point MATLAB function, specified as a scalar integer greater than or equal to zero, or auto. By default, the statelength is zero frames, indicating that the algorithm is stateless. If at least one entry of frameinputs is true, statelength is considered in samples. For information on frames and samples, see Sample- and Frame-Based Concepts. -s auto triggers automatic state length detection. In this mode, you must provide numeric inputs to the arguments cell array. These inputs detect the state length of the algorithm. You can input coder.Constant but not coder.typeof. When automatic state length detection is invoked, it is recommended that you provide random inputs to the arguments array. See Automatic State Length Detection. | `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s 3 -f [false, false, false]`: State length is three frames. `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s 3 -f [true, false, false]`: State length is three samples. State length is considered in samples, because at least one entry of the -f option is true. `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s auto`: Automatic state length detection is invoked. `dspunfold fcn -args {coder.typeof(randn(10,1)), coder.typeof(randn(10,1)), coder.typeof(randn(10,1))} -s auto` generates this error message: The input argument 1 is of type coder.PrimitiveType which is not supported when using -s auto |
| -f frameinputs | Scalar logical or vector of logical values | Frame status of input arguments for the entry-point MATLAB function, specified as true or false. true: Input is in frames and can be subdivided into samples without changing the system behavior. false: Input cannot be subdivided into samples without changing the system behavior. For example, you cannot subdivide the coefficients of a filter without changing the characteristics of the filter. By default, frameinputs is false. frameinputs set to a scalar logical value sets the frame status of all the inputs simultaneously. To specify statelength in samples, set at least one entry of frameinputs to true. If frameinputs is not specified, the unit of statelength is frames. | `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s 3 -f true`: All the inputs are marked as frames. State length is three samples. `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s 3 -f [true, false, false]`: State length is three samples. `dspunfold fcn -args {randn(10,1), randn(10,1), randn(10,1)} -s 3`: The default value of frameinputs is false. State length is three frames. |
| -r repetition | Positive integer | Repetition factor used to generate the multi-threaded MEX file, specified as a positive integer. The default value of repetition is 1. See Repetition Factor. | `dspunfold fcn -args {randn(10,2), randn(20,2), randn(30,3)} -r 2` |
| -t threads | Positive integer | Number of threads used by the multi-threaded MEX file, specified as a positive integer. The default value of threads is the number of physical CPU cores present on your machine. See Threads. | `dspunfold fcn -args {randn(10,1), randn(20,2), randn(30,3)} -t 4` |
| -v verbose | Scalar logical | Option to show verbose output during code generation, specified as true or false. The default is true. | `dspunfold fcn -args {randn(10,1), randn(20,2), randn(30,3)} -v true`. `dspunfold fcn -args {randn(10,1), randn(20,2), randn(30,3)} -v false`. |

file: Entry-point MATLAB function from which dspunfold generates the multi-threaded MEX file. The function must support code generation.

Example: dspunfold fcn -args {randn(10,1),randn(10,2),randn(20,1)}

fcn is the entry-point MATLAB function and {randn(10,1),randn(10,2),randn(20,1)} are its input arguments.

Output Files

When you invoke dspunfold on an entry-point MATLAB function, dspunfold generates the following files.

| File | Value | Description | Examples |
| --- | --- | --- | --- |
| Multi-threaded MEX file | MEX file | Multi-threaded MEX file generated from the entry-point MATLAB function. The MEX file inherits the output name. If no output name is specified, the name of this file is inherited from the MATLAB function with an '_mt' suffix. A platform-specific extension is also added to the name. | `dspunfold fcn -o foo` generates foo.mexw64. `dspunfold fcn` generates fcn_mt.mexw64. |
| Help file for the multi-threaded MEX file | MATLAB file | MATLAB help file for the multi-threaded MEX file. The help file has the same name as the MEX file, but with an '.m' extension. To invoke the help file, type help followed by the name of the MEX file at the MATLAB command prompt. This help file displays information on how to invoke the MEX file, its syntax, latency, and types (size, class, and complexity) of the inputs to the MEX file. In addition, the help file documents the parameters used by dspunfold: Threads, Repetition, and State length. This information is useful when you are invoking the MEX file. The syntax to invoke the MEX file should be the same as the syntax shown in the help file. | `help foo`. `help fcn_mt`. |
| Single-threaded MEX file | MEX file | Single-threaded MEX file generated from the entry-point MATLAB function. The MEX file inherits the output name with an '_st' suffix. If no output name is specified, the name of this file is inherited from the MATLAB function with an '_st' suffix. A platform-specific extension is also added to the name. Use this file as a benchmark to compare against the speed of the multi-threaded MEX file. | `dspunfold fcn -o foo` generates foo_st.mexw64. `dspunfold fcn` generates fcn_st.mexw64. |
| Help file for the single-threaded MEX file | MATLAB file | MATLAB help file for the single-threaded MEX file. The help file has the same name as the MEX file, but with an '.m' extension. To invoke the help file, type help followed by the name of the MEX file at the MATLAB command prompt. The help file displays information on how to invoke the MEX file, its syntax, and types (size, class, and complexity) of the inputs to the MEX file. The syntax to invoke the MEX file should be the same as the syntax shown in the help file. | `help foo_st`. `help fcn_st`. |
| Self-diagnostic analyzer function | P-coded file | `report = function_analyzer(input 1, input 2, ..., input n)` measures the difference in speed between the multi-threaded MEX file and the single-threaded MEX file. It also verifies that the output values match. `report = function_analyzer('latency')` reports the latency of the multi-threaded MEX file introduced by unfolding. report contains the following fields. Latency: The value of the latency (in frames). Speedup: The speedup of the multi-threaded MEX file relative to the single-threaded MEX file. If you specified the 'latency' option, the value of this field is empty []. Pass: Logical value that shows if the outputs match between the generated multi-threaded MEX file and the single-threaded MEX file. If you specified the 'latency' option, the value of this field is empty []. The first dimension of the analyzer inputs must be a multiple of the first dimension of the corresponding inputs given to the -args option. The other dimensions must match exactly. The analyzer inherits the output name with an '_analyzer' suffix. If no output name is specified, the name of this file is inherited from the MATLAB function with an '_analyzer' suffix. | Multiple frames with different values are specified along the first dimension. Example 1: `report = foo_analyzer(randn(10*2,1), randn(20*2,2), randn(30*3,3))`. Example 2: `report = foo_analyzer([randn(10,1);randn(10,1)],[randn(20,2);randn(20,2)],[randn(30,3);randn(30,3);randn(30,3)])`. `report = foo_analyzer('latency')`. |
| Help file for the self-diagnostic analyzer function | MATLAB file | Help file for the self-diagnostic analyzer function. The help file has the same name as the analyzer file, but with an '.m' extension. To invoke the help file, type help <function_analyzer> in MATLAB. The help file for the self-diagnostic analyzer function displays information on how to invoke the analyzer function, its syntax, and types (size, class, and complexity) of the inputs to the analyzer function. The syntax to invoke the analyzer function should be the same as the syntax shown in the help file. | `help foo_analyzer` |

Limitations

General Limitations:

Analyzer Limitations:

The following limitations apply to the analyzer function generated by the dspunfold function. For more information on the analyzer function, see 'Self-Diagnostic Analyzer' in the 'More About' section of dspunfold.

The analyzer looks for a numerical match and fails the verification, even though the generated multi-threaded MEX file is valid.

Speedup Limitations:

More About

State Length

State length of the algorithm.

Most of the time, the state length used by dspunfold matches the state length of the algorithm in the entry-point MATLAB function. If the algorithm is simple, state length is easy to determine. For example, the state length of an FIR filter is the number of taps in the filter minus 1. In some scenarios, to optimize speedup, dspunfold chooses a state length that is different from the algorithm state length or the state length specified using the -s option. For example, when the state length is greater than (threads – 1) × repetition frames, dspunfold considers the state length to be infinite. In that case, multi-threading is disabled for performance reasons.
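The FIR rule of thumb above can be checked numerically with base MATLAB alone. The following is a minimal sketch, not part of dspunfold: it assumes a hypothetical 4-tap moving-average filter and two deterministic 10-sample frames, and counts how many leading samples of the second frame depend on the first frame.

```matlab
% Sketch: why a 4-tap FIR filter has a state length of 3 samples.
numTaps = 4;
b = ones(1, numTaps) / numTaps;   % hypothetical moving-average coefficients
u1 = (1:10).';                    % first frame (illustrative values)
u2 = (11:20).';                   % second frame (illustrative values)

% Process frame 1 and keep the final filter state.
[~, z] = filter(b, 1, u1);

% Process frame 2 with and without the state carried over from frame 1.
yWithState    = filter(b, 1, u2, z);
yWithoutState = filter(b, 1, u2);

% Index of the last sample of frame 2 that is influenced by frame 1.
affected = find(abs(yWithState - yWithoutState) > 1e-12, 1, 'last');
stateLengthSamples = numTaps - 1;
fprintf('Last affected sample: %d, taps minus 1: %d\n', ...
    affected, stateLengthSamples);
```

Only the first few samples of the second frame differ between the two runs, which is the sense in which the filter "remembers" a bounded amount of past input. For such an algorithm, a state length of three samples would be requested with -s 3 together with at least one true entry in -f.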

Automatic State Length Detection

You can automatically detect the minimum state length for which the outputs of the multi-threaded MEX and single-threaded MEX match.

In complex algorithms, it is not easy to determine the state length analytically. In such scenarios, use the analyzer to compute the state length. When you set -s to auto, dspunfold invokes the analyzer. The analyzer computes the outputs for different state lengths and detects the minimum state length for which the outputs of the multi-threaded MEX file and single-threaded MEX file match. The analyzer uses the numeric value of the inputs given to -args. To detect the most efficient state length, provide random inputs to -args. In this mode, you cannot input coder.typeof to arguments. Due to the extra analysis this tool requires, the time to generate the MEX file increases.

When you use automatic state length detection on an algorithm with code paths that depend on the input values, use inputs that choose the code path with the longest state length. Also, the inputs must have an immediate effect on the output. If inputs choose a code path that triggers run-time errors, automatic state length detection stops, and so does the analyzer. Make sure that the MATLAB function supports code generation and does not have run-time errors for the inputs under test. Before invoking dspunfold, call codegen on the entry-point MATLAB function. In addition, simulate the entry-point MATLAB function to make sure it has no run-time errors.
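The recommended order (simulate, run codegen, then invoke dspunfold with -s auto) might look like the sketch below. The entry-point function runningDiff is hypothetical and is written to the current folder only so that the script is self-contained. The codegen and dspunfold calls are guarded because they need MATLAB Coder and the product that provides dspunfold. Passing -args in command form to codegen mirrors the dspunfold syntax shown on this page.

```matlab
% Write a small hypothetical entry-point function with one sample of state.
src = {
    'function y = runningDiff(u) %#codegen'
    '% Hypothetical entry point: difference with the previous sample.'
    'persistent last'
    'if isempty(last), last = 0; end'
    'y = u - [last; u(1:end-1)];'
    'last = u(end);'
    'end'};
fid = fopen('runningDiff.m', 'w');
fprintf(fid, '%s\n', src{:});
fclose(fid);

% Step 1: simulate in MATLAB to catch run-time errors early.
clear runningDiff                 % reset the persistent state
u = randn(10, 1);                 % random input, as recommended for -s auto
y = runningDiff(u);

% Steps 2 and 3: confirm code generation works, then unfold with automatic
% state length detection. Skipped when the required products are missing.
if ~isempty(which('codegen')) && ~isempty(which('dspunfold'))
    codegen runningDiff -args {randn(10,1)}
    dspunfold runningDiff -args {randn(10,1)} -s auto
end
```

Numeric inputs are used for -args throughout, because coder.typeof is not accepted when -s is auto.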

Threads

The -t option specifies the number of threads used by the multi-threaded MEX file.

Increasing this value can improve the multi-threaded MEX speedup, at the cost of a larger latency. Decreasing this value reduces the latency and potentially decreases the multi-threaded MEX speedup.

Repetition Factor

Repetition factor is the number of consecutive frames processed by each thread in one processing step.

Increasing this value reduces the overhead per frame of data, potentially improving the speedup at the cost of larger latency. Decreasing this value reduces the latency, and potentially decreases the multi-threaded MEX speedup.

Self-Diagnostic Analyzer

The self-diagnostic analyzer function is a helper tool that is generated with the MEX file. This function measures the speedup gain of the multi-threaded MEX file compared to the single-threaded MEX file. The analyzer function also verifies that the outputs of the multi-threaded MEX file and single-threaded MEX file match.

If you specify an incorrect state length value, the outputs usually do not match. To check for the numerical match between the multi-threaded MEX file and the single-threaded MEX file, provide at least two different frames for each input argument of the analyzer. The frames are appended along the first dimension. The analyzer alternates between these frames while verifying that the outputs match. Failure to provide multiple frames for each input can decrease the effectiveness of the analyzer and can lead to false positive verification results. In other words, the analyzer might report Pass = 1 even when an incorrect state length value is specified. The analyzer alternates through a maximum of 3 × (2 × threads × repetition) frames. If your algorithm requires more than 3 × (2 × threads × repetition) frames to verify the results, then the analyzer cannot verify accurately.
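As a rough illustration of the two points above, the sketch below computes the frame limit for assumed option values and builds multi-frame analyzer inputs by stacking random frames along the first dimension. The thread count, repetition factor, and frame sizes are hypothetical, and no analyzer is generated or called here.

```matlab
threads = 4;        % hypothetical value of the -t option
repetition = 1;     % hypothetical value of the -r option

% Upper bound on the number of frames the analyzer alternates through.
maxAnalyzerFrames = 3 * (2 * threads * repetition);

% Hypothetical frame sizes, as they would be passed to -args.
frameArgs = {randn(10, 1), randn(20, 2)};

% Stack two different random frames per input along the first dimension.
% Only the first dimension grows; the other dimensions stay unchanged.
numFrames = 2;
analyzerInputs = cellfun( ...
    @(a) randn(numFrames * size(a, 1), size(a, 2)), ...
    frameArgs, 'UniformOutput', false);

% With a generated analyzer named foo_analyzer (not created here), the call
% would be: report = foo_analyzer(analyzerInputs{:});
fprintf('Frame limit: %d, frames supplied per input: %d\n', ...
    maxAnalyzerFrames, numFrames);
```

Using freshly drawn random values for each stacked frame keeps the frames different from one another, which is what lets the analyzer expose an incorrect state length.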

Tips

General

When the state length is less than or equal to (threads – 1) × repetition frames:

Workflow

State Length

Automatic State Length Detection

When you set -s to auto:

Analyzer

Speedup

Algorithms

The multi-threaded MEX file buffers multiple input signal frames into a buffer of 2 × threads × repetition frames, where threads is the number of threads, and repetition is the repetition factor. The MEX file processes these frames simultaneously, using multiple cores. This process introduces some deterministic latency, where latency = 2 × threads × repetition. Latency is traded off with the speedup you might gain by increasing the number of threads or the repetition factor.
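To get a feel for the trade-off, the latency relation can be tabulated for a few settings. This is a small sketch with hypothetical thread counts, repetition factors, and frame size. The conversion to samples assumes that latency in samples is simply latency in frames times the frame size.

```matlab
threadsList = [2 4 8];       % hypothetical values of the -t option
repetitionList = [1 2];      % hypothetical values of the -r option

% Every combination of threads (rows) and repetition (columns).
[T, R] = ndgrid(threadsList, repetitionList);

% Latency in frames, from latency = 2 x threads x repetition.
latencyFrames = 2 .* T .* R;

% Assumed conversion to samples for a hypothetical 10-sample frame.
frameSize = 10;
latencySamples = latencyFrames * frameSize;

disp(table(T(:), R(:), latencyFrames(:), latencySamples(:), ...
    'VariableNames', {'Threads', 'Repetition', 'LatencyFrames', 'LatencySamples'}));
```

Doubling either the thread count or the repetition factor doubles the latency, so the two options are interchangeable as far as latency is concerned and differ only in how they affect speedup.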

Version History

Introduced in R2015b

See Also

Topics