rSqrt

Compute reciprocal square-root operation and simulate with latency

Since R2020b

Description

The rSqrt block performs the reciprocal square-root operation on the input data signal. The block has control signals that indicate whether the input and output data are valid. You can also specify the number of iterations of the algorithm and the latency strategy.

To use this block in your Simulink® model, open the HDLMathLib library by entering this command in the MATLAB® Command Window:

open_system("HDLMathLib")

You can simulate the rSqrt block with latency. For more information, see Latency Considerations.

Ports

Input

Input signal to calculate the reciprocal square root, specified as a scalar or vector.

Input control signal that indicates whether the input signal is valid, specified as a scalar.

Data Types: Boolean

Output

Output signal that is the reciprocal square root of the input signal, returned as a scalar or vector.

Output control signal that indicates whether the output signal is valid, returned as a scalar.

Data Types: Boolean

Parameters

Select the architecture for the rSqrt block.

Programmatic Use

Block Parameter: architecture
Type: character vector
Values: 'RecipSqrtNewtonSingleRate'
Default: 'RecipSqrtNewtonSingleRate'

Specify the number of iterations for the rSqrt algorithm.

Programmatic Use

Block Parameter: numOfIterations
Type: character vector
Values: Integer values
Default: '3'

Specify whether to use minimum, maximum, custom, or zero latency. For more information, see Latency Strategy.

To use custom latency for the block, set the Latency strategy to Custom and enter the latency value in the Custom latency field.

You can also control the number of pipeline stages for the iterative algorithm. To customize the latency for the iterative algorithm, set the Latency strategy to Custom(PerIteration) and enter the iterations per pipeline value in the IterationsPerPipeline field. (since R2025a)

Programmatic Use

Block Parameter: latencyMode
Type: character vector
Values: 'Max' | 'Min' | 'Custom' | 'Custom(PerIteration)' | 'Zero'
Default: 'Max'

When you set Latency strategy to Custom, use this parameter to specify the custom latency value. The latency must be a nonnegative integer in the range [0, L], where L is the maximum latency value of the rSqrt block. For more information, see CustomLatency.

Dependency

To use this parameter, set Latency strategy to Custom.

Programmatic Use

Block Parameter: customLatencyValue
Type: Integer
Values: 0 to Max latency
Default: 0

Since R2025a

Specify the number of iterations to perform in each pipeline stage of the algorithm.

Dependency

To enable this parameter, set Latency strategy to Custom(PerIteration).

Programmatic Use

Block Parameter: iterationsPerPipelineValue
Type: Integer
Values: Positive integer
Default: 1
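The latency-related parameters listed above can also be set from the command line. The following is a minimal sketch, not a verified procedure. It assumes that HDL Coder is installed, that the library block path is 'HDLMathLib/rSqrt', and that the mask accepts character-vector values for every parameter. The model name rsqrt_demo is hypothetical.

```matlab
% Sketch only: the library block path and the model name are assumptions.
mdl = 'rsqrt_demo';                          % hypothetical model name
load_system('HDLMathLib');                   % library named in this page
new_system(mdl);
blk = [mdl '/rSqrt'];
add_block('HDLMathLib/rSqrt', blk);          % assumed library block path

% Parameter names are taken from the Programmatic Use entries above.
set_param(blk, 'numOfIterations', '15');
set_param(blk, 'latencyMode', 'Custom(PerIteration)');
set_param(blk, 'iterationsPerPipelineValue', '5');   % assumed to accept a character vector

disp(get_param(blk, 'latencyMode'));         % read the setting back
close_system(mdl, 0);                        % discard the scratch model
```

If a call is rejected, check the value type that the mask expects for that parameter in your release.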

Specify the output data type. The data type can be inherited or specified directly.

Programmatic Use

Block Parameter: OutDataTypeStr
Type: character vector
Values: 'Inherit: Inherit via internal rule' | 'Inherit: Inherit via back propagation' | 'Inherit: Same as first input' | 'int8' | 'uint8' | 'int16' | 'uint16' | 'int32' | 'uint32' | 'int64' | 'uint64' | 'fixdt(1,16,0)' | ''
Default: 'Inherit: Inherit via internal rule'

Action	Reasons for Taking This Action	What Happens for Overflows	Example
Select this check box.	Your model has possible overflow, and you want explicit saturation protection in the generated code.	Overflows saturate to either the minimum or maximum value that the data type can represent.	The maximum value that the int8 (signed, 8-bit integer) data type can represent is 127. Any block operation result greater than this maximum value causes overflow of the 8-bit integer. With the check box selected, the block output saturates at 127. Similarly, the block output saturates at a minimum output value of -128.
Do not select this check box.	You want to optimize the efficiency of your generated code. You want to avoid overspecifying how a block handles out-of-range signals. For more information, see Troubleshoot Signal Range Errors.	Overflows wrap to the value that is representable by the data type.	The maximum value that the int8 (signed, 8-bit integer) data type can represent is 127. Any block operation result greater than this maximum value causes overflow of the 8-bit integer. With the check box cleared, the software interprets the overflow-causing value as int8, which can produce an unintended result. For example, a block result of 130 (binary 1000 0010) expressed as int8 is -126.

When you select this check box, saturation applies to every internal operation on the block, not just the output or result. Usually, the code generation process can detect when overflow is not possible. In this case, the code generator does not produce saturation code.
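The int8 example in the table can be reproduced in plain MATLAB. This is an illustrative sketch of the two overflow behaviors, not the block's internal code. It relies on the fact that a MATLAB integer cast saturates, and it emulates wrapping by reinterpreting the low 8 bits with typecast.

```matlab
r = 130;                                          % result that exceeds the int8 maximum of 127

saturated = int8(r);                              % integer casts in MATLAB saturate
wrapped   = typecast(uint8(mod(r, 256)), 'int8'); % keep the low 8 bits, reinterpret as signed

fprintf('saturated = %d, wrapped = %d\n', saturated, wrapped);
```

The saturated value is 127 and the wrapped value is -126, matching the table.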

Programmatic Use

Block Parameter: SaturateOnIntegerOverflow
Type: character vector
Values: 'off' | 'on'
Default: 'off'

Specify the rounding mode for fixed-point operations. For more information, see Rounding Modes.

Programmatic Use

Block Parameter: RndMeth
Type: character vector
Values: 'Ceiling' | 'Convergent' | 'Floor' | 'Nearest' | 'Round' | 'Simplest' | 'Zero'
Default: 'Floor'
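As a rough illustration of how several of these modes differ, the sketch below rounds a tie value to an integer with plain MATLAB functions. It assumes rounding to a whole number for simplicity. In the block, rounding applies at the least significant bit of the fixed-point output type, and 'Convergent' and 'Simplest' are not shown.

```matlab
x = -2.5;                    % tie value, halfway between -3 and -2

rCeiling = ceil(x);          % 'Ceiling': toward positive infinity
rFloor   = floor(x);         % 'Floor' (default): toward negative infinity
rZero    = fix(x);           % 'Zero': toward zero
rRound   = round(x);         % 'Round': nearest, ties away from zero
rNearest = floor(x + 0.5);   % 'Nearest': nearest, ties toward positive infinity

fprintf('%d %d %d %d %d\n', rCeiling, rFloor, rZero, rRound, rNearest);
```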

Algorithms

The rSqrt block is a masked subsystem that contains the LumpLatency MATLAB Function block. The subsystem uses this MATLAB Function block to compute the latency based on the Number of iterations. To view the function that computes the latency of the block, open the LumpLatency block in the masked subsystem. To view inside the mask, click the ⇩ icon on the block.

This table shows how the block calculates the latency based on the setting of the Latency strategy parameter:

Latency Strategy	Latency Value (L)
Max	Uses maximum latency by using the equation L = (N * 4) + 5, where N is the value of the Number of iterations parameter.
Min	Uses minimum latency by using the equation L = 2 + ceil(((N * 4) - 1) / 3).
Custom	Specifies a custom latency value. To specify the latency, enter a value between zero and the maximum latency in the Custom latency parameter. For more information, see Custom latency.
Custom(PerIteration)	Use this setting to control the pipeline stages for the iterative algorithm. Specify the number of iterations per pipeline stage by using the IterationsPerPipeline parameter. The block uses the equation L = 1 + ceil((N * 4) / K), where K is the value of the IterationsPerPipeline parameter.
Zero	The latency of the block is 0.
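The equations in the table can be checked with a few anonymous functions. This is a sketch that restates the table, not the code inside the LumpLatency block. The function handle names are chosen here for illustration.

```matlab
% N is the Number of iterations, K is the IterationsPerPipeline value.
latencyMax     = @(N)    (N * 4) + 5;                  % Max
latencyMin     = @(N)    2 + ceil(((N * 4) - 1) / 3);  % Min
latencyPerIter = @(N, K) 1 + ceil((N * 4) / K);        % Custom(PerIteration)

N = 3;   % default Number of iterations
fprintf('Max: %d, Min: %d, Custom(PerIteration) with K = 1: %d\n', ...
    latencyMax(N), latencyMin(N), latencyPerIter(N, 1));
```

With the default of three iterations, these evaluate to 17, 6, and 13 clock cycles, respectively.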

The rSqrt block uses pipelined architectures to implement the Newton-Raphson-based reciprocal square-root algorithm. By default, the block uses the maximum latency, which depends on the Number of iterations parameter. The block performs a single iteration per pipeline stage. For example, if you set the Number of iterations to 15, the latency of the block is 65 based on the maximum latency equation in Latency Considerations. When you increase the number of iterations, the latency of the block also increases.
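For readers unfamiliar with the method, the textbook Newton-Raphson update for the reciprocal square root of x is y = y * (1.5 - 0.5 * x * y^2). The sketch below runs that update in double precision. It is not the block's fixed-point implementation, and the initial guess of 1 is an assumption made here for illustration; this page does not describe how the block seeds the iteration.

```matlab
x = 0.7;   % example input; with a seed of 1 the iteration converges for 0 < x < 3
N = 3;     % same as the default Number of iterations
y = 1;     % hypothetical initial guess

for k = 1:N
    y = y * (1.5 - 0.5 * x * y^2);   % one Newton-Raphson refinement step
end

err = abs(y - 1/sqrt(x));
fprintf('estimate = %.6f, error = %.2e\n', y, err);
```

Each additional iteration roughly doubles the number of correct digits, which is why more iterations improve accuracy at the cost of latency.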

You can customize the latency for the iterative algorithm by setting the Latency strategy to Custom(PerIteration), which allows you to control the number of iterations per pipeline stage. For example, if you set the Number of iterations to 15 and you want the block to perform the iterations in three pipeline stages, then set IterationsPerPipeline to 5. By using the Custom(PerIteration) latency strategy, the latency of the block reduces to 13.

Extended Capabilities

The block supports HDL code generation using HDL Coder™. HDL Coder provides additional configuration options that affect HDL implementation and synthesized logic.

HDL Architecture

Architecture	Description
Module (default)	Generate code for the subsystem and the blocks within the subsystem.
BlackBox	Generate a black box interface. The generated HDL code includes only the input/output port definitions for the subsystem. Therefore, you can use a subsystem in your model to generate an interface to existing, manually written HDL code. The black-box interface generation for subsystems is similar to the Model block interface generation without the clock signals.
No HDL	Remove the subsystem from the generated code. You can use the subsystem in simulation, but it is treated as a "no-op" in the HDL code.

HDL Block Properties

General
AdaptivePipelining	Automatic pipeline insertion based on the synthesis tool, target frequency, and multiplier word-lengths. The default is inherit. See also AdaptivePipelining.
BalanceDelays	Detects introduction of new delays along one path and inserts matching delays on the other paths. The default is inherit. See also BalanceDelays.
ClockRatePipelining	Insert pipeline registers at a faster clock rate instead of the slower data rate. The default is inherit. See also ClockRatePipelining.
ConstrainedOutputPipeline	Number of registers to place at the outputs by moving existing delays within your design. Distributed pipelining does not redistribute these registers. The default is 0. For more details, see ConstrainedOutputPipeline.
DistributedPipelining	Pipeline register distribution, or register retiming. The default is inherit. See also DistributedPipelining.
DSPStyle	Synthesis attributes for multiplier mapping. The default is none. See also DSPStyle.
FlattenHierarchy	Remove subsystem hierarchy from generated HDL code. The default is inherit. See also FlattenHierarchy.
InputPipeline	Number of input pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is 0. For more details, see InputPipeline.
OutputPipeline	Number of output pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is 0. For more details, see OutputPipeline.
SharingFactor	Number of functionally equivalent resources to map to a single shared resource. The default is 0. See also Resource Sharing.
StreamingFactor	Number of parallel data paths, or vectors, that are time multiplexed to transform into serial, scalar data paths. The default is 0, which implements fully parallel data paths. See also Streaming.

Target Specification

This block cannot be the DUT, so the block property settings in the Target Specification tab are ignored.

Limitations

The block does not support vector inputs.
The block does not support bus inputs.
The block cannot be used in a Synchronous Subsystem.
The block does not support the resource sharing optimization.

Version History

Introduced in R2020b

You can control the pipeline stages for the iterative algorithm by setting the Latency strategy parameter to Custom(PerIteration), then specifying the number of iterations per pipeline stage by using the IterationsPerPipeline parameter. Use this setting to control the pipeline stages in the generated HDL code and optimize the design for speed and resource utilization.