Workflow for Generating a Multithreaded MEX File using dspunfold


  1. Run the entry-point MATLAB® function with the inputs that you want to test. Make sure that the function has no runtime errors. Call codegen on the function and make sure that it generates a MEX file successfully.
  2. Generate the multithreaded MEX file using dspunfold. Specify a state length using the -s option. The state length must be at least as long as the state of the algorithm in the MATLAB function. By default, -s is set to 0, indicating that the algorithm is stateless.
  3. Run the generated analyzer function. Use the pass flag to verify that the output results of the multithreaded MEX file and the single-threaded MEX file match. Also, check if the speedup and latency displayed by the analyzer function are satisfactory.
  4. If the output results do not match, increase the state length and generate the multithreaded MEX file again. Alternatively, use the automatic state length detection (specified using -s auto) to determine the minimum state length that matches the outputs.
  5. If the output results match but the speedup and latency are not satisfactory, increase the repetition factor using -r or increase the number of threads using -t. In addition, you can adjust the state length. Adjust the dspunfold options and generate new multithreaded MEX files until you are satisfied with the results.

For best practices for generating the multithreaded MEX file using dspunfold, see the 'Tips' section of dspunfold.

Workflow Example

Run the Entry Point MATLAB Function

Create the entry-point MATLAB function.

function [y,mse] = AdaptiveFilter(x,noise)

persistent rlsf1 ffilt noise_var
if isempty(rlsf1)
    rlsf1 = dsp.RLSFilter(32, 'ForgettingFactor', 0.98);
    ffilt = dsp.FIRFilter('Numerator',fir1(32, .25)); % Unknown System
    noise_var = 1e-4;
end

d = ffilt(x) + noise_var * noise; % desired signal
[y,e] = rlsf1(x, d);

mse = 10*log10(sum(e.^2));
end

The function models an RLS filter that filters the input signal x, using d as the desired signal. The function returns the filtered output in y and, in mse, the sum of the squared filter error e expressed in decibels.

Run AdaptiveFilter with the inputs that you want to test. Verify that the function runs without errors.

AdaptiveFilter(randn(1000,1), randn(1000,1));

Call codegen on AdaptiveFilter and generate a MEX file.

codegen AdaptiveFilter -args {randn(1000,1), randn(1000,1)}

Generate a Multithreaded MEX File Using dspunfold

Set the state length to 32 samples and the repetition factor to 1. Provide a state length that is greater than or equal to the state length of the algorithm in the MATLAB function. When at least one entry of frameinputs is set to true, the state length is interpreted in samples.

dspunfold AdaptiveFilter -args {randn(1000,1), randn(1000,1)} -s 32 -f true

Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the Generated Analyzer Function

The analyzer considers the actual values of the input. To increase the analyzer effectiveness, provide at least two different frames along the first dimension of the inputs.
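As a minimal sketch of how such inputs can be built, the snippet below concatenates several frames along the first dimension. The variable names frameSize and numFrames are illustrative and not part of dspunfold; the frame size of 1000 is taken from the -args specification used in this example, and the sketch assumes the input length should be a whole number of frames.

```matlab
frameSize = 1000;   % rows per frame, matching the -args specification
numFrames = 4;      % illustrative choice; the text asks for at least two

% Stack numFrames distinct random frames along the first dimension.
x     = randn(frameSize*numFrames, 1);
noise = randn(frameSize*numFrames, 1);

% Sanity check (assumption): the input holds a whole number of frames.
assert(mod(size(x,1), frameSize) == 0);
```

The resulting x and noise can then be passed to the generated analyzer function in place of the inline randn calls.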

AdaptiveFilter_analyzer(randn(1000*4,1),randn(1000*4,1))

Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64 ...
Latency = 8 frames
Speedup = 3.5x
Warning: The output results of the multi-threaded MEX file AdaptiveFilter_mt.mexw64 do not match the output results of the single-threaded MEX file AdaptiveFilter_st.mexw64. Check that you provided the correct state length value to the dspunfold function when you generated the multi-threaded MEX file AdaptiveFilter_mt.mexw64. For best practices and possible solutions to this problem, see the 'Tips' section in the dspunfold function reference page.
In coder.internal.warning (line 8)
In AdaptiveFilter_analyzer

ans =

Latency: 8
Speedup: 3.4686
   Pass: 0

Increase the State Length

The analyzer did not pass the verification. The warning message indicates that an incorrect state length was provided to the dspunfold function. Increase the state length to 1000 samples and repeat the process from the previous section.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s 1000 -f true

Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the generated analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1),randn(1000*4,1))

Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64 ...
Latency = 8 frames
Speedup = 1.8x

ans =

Latency: 8
Speedup: 1.7778
   Pass: 1

The analyzer passed verification. It is recommended that you provide different numerics to the analyzer function and make sure that the analyzer function passes.
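One way to follow this recommendation is to call the analyzer several times with fresh random inputs and confirm that every run passes. The helper below is a hypothetical sketch, not part of dspunfold: the name checkAnalyzer and its arguments are assumptions, and it relies only on the Pass field that the analyzer output shows in this example. It is meant to be saved as checkAnalyzer.m.

```matlab
function allPassed = checkAnalyzer(analyzerFcn, frameSize, numFrames, numTrials)
% checkAnalyzer  Hypothetical helper: run an analyzer on several random inputs.
%   analyzerFcn is a handle to a two-input analyzer whose result has a
%   Pass field, for example @AdaptiveFilter_analyzer from this example.
    allPassed = true;
    for k = 1:numTrials
        % New random numerics for every trial, numFrames frames long.
        x     = randn(frameSize*numFrames, 1);
        noise = randn(frameSize*numFrames, 1);
        result = analyzerFcn(x, noise);
        allPassed = allPassed && logical(result.Pass);
    end
end
```

For the example in this topic, a call such as checkAnalyzer(@AdaptiveFilter_analyzer, 1000, 4, 5) would run five trials of four frames each and return true only if all of them pass.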

Improve Speedup and Adjust Latency

If you want to increase speedup and your system can afford a larger latency, increase the repetition factor to 2.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s 1000 -r 2 -f true

Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1), randn(1000*4,1))

Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64 ...
Latency = 16 frames
Speedup = 2.4x

ans =

Latency: 16
Speedup: 2.3674
   Pass: 1

Repeat the process until you achieve satisfactory speedup and latency.
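To judge whether a latency is affordable, it can help to convert the reported frame counts into samples and time. The sketch below does this for the two passing configurations reported above. The sample rate fs is a hypothetical value chosen only for illustration, and the sketch assumes that one latency frame corresponds to one 1000-sample input frame.

```matlab
frameSize = 1000;   % samples per frame, from the -args specification
fs = 44100;         % ASSUMPTION: hypothetical sample rate in Hz

% Analyzer results reported in this example for state length 1000.
repetition    = [1 2];
latencyFrames = [8 16];
speedup       = [1.7778 2.3674];

% Convert latency from frames to samples and to seconds.
latencySamples = latencyFrames * frameSize;
latencySeconds = latencySamples / fs;

for k = 1:numel(repetition)
    fprintf('-r %d: speedup %.2fx, latency %d samples (%.3f s at %d Hz)\n', ...
        repetition(k), speedup(k), latencySamples(k), latencySeconds(k), fs);
end
```

Doubling the repetition factor doubles the latency in this example while raising the speedup from about 1.8x to about 2.4x, so the trade-off depends on the latency budget of the target system.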

Use Automatic State Length Detection

Choose a state length that is greater than or equal to the state length of your algorithm. If it is not easy to determine the state length for your algorithm analytically, use the automatic state length detection tool. Invoke automatic state length detection by setting -s to auto. The tool detects the minimum state length with which the analyzer passes the verification.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s auto -f true

Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Searching for minimal state length (this might take a while)
Checking stateless ... Insufficient
Checking 1000 ... Sufficient
Checking 500 ... Insufficient
Checking 750 ... Insufficient
Checking 875 ... Sufficient
Checking 812 ... Insufficient
Checking 843 ... Sufficient
Checking 827 ... Insufficient
Checking 835 ... Insufficient
Checking 839 ... Sufficient
Checking 837 ... Sufficient
Checking 836 ... Sufficient
Minimal state length is 836
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Minimal state length is 836 samples.
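The sequence of checked values in the log is consistent with a bisection search between a length known to be insufficient (0, stateless) and one known to be sufficient (1000). The sketch below only illustrates that search pattern and is not dspunfold's actual implementation. The predicate isSufficient is a hypothetical stand-in for "generate the MEX file with state length s and check that the analyzer passes", hard-coded here to the threshold reported in this example.

```matlab
% ASSUMPTION: hypothetical stand-in for the real pass/fail verification.
isSufficient = @(s) s >= 836;

lo = 0;       % known insufficient (stateless)
hi = 1000;    % known sufficient
checked = []; % state lengths tried, in order

% Halve the interval until the bounds are adjacent.
while hi - lo > 1
    mid = floor((lo + hi)/2);
    checked(end+1) = mid; %#ok<AGROW>
    if isSufficient(mid)
        hi = mid;   % mid works, so the minimum is at or below mid
    else
        lo = mid;   % mid fails, so the minimum is above mid
    end
end

minimalStateLength = hi;
```

With this predicate the search tries 500, 750, 875, 812, 843, 827, 835, 839, 837, and 836, the same order as in the log, and ends with minimalStateLength equal to 836. Each real check involves generating and verifying a MEX file, which is why the tool warns that the search might take a while.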

Run the generated analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1), randn(1000*4,1))

Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64 ...
Latency = 8 frames
Speedup = 1.9x

ans =

Latency: 8
Speedup: 1.9137
   Pass: 1

The analyzer passed the verification.
