Advanced API Performance: Intrinsics

Intrinsics can be thought of as higher-level abstractions of specific hardware instructions. They offer direct access to low-level operations or hardware-specific features, enabling increased performance. With wave intrinsics, for example, operations can be performed across the threads within a warp, also known as a wavefront.

The following code example uses Shader Model 6 (SM6) wave intrinsics to implement an XOR shuffle:

float4 NvShflXor(float4 input, uint LaneMask)
{
    // Each lane reads the value held by the lane whose index is (own index XOR LaneMask).
    float4 output = WaveReadLaneAt(input, WaveGetLaneIndex() ^ LaneMask);
    return output;
}
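To make the data movement of an XOR shuffle easier to see, here is a minimal CPU-side sketch in C++ that emulates it. This is not GPU code and not part of the original example: the 32-lane wave size, the Wave array type, and the names ShuffleXor and ButterflySum are all assumptions made for illustration. It models a wave as an array with one value per lane, then applies the shuffle in a butterfly pattern so that every lane ends up with the sum of all lanes.

```cpp
#include <array>
#include <cstddef>

// Hypothetical CPU-side model of one wave: element i holds lane i's value.
// A wave size of 32 is an assumption; real hardware may use other sizes.
constexpr std::size_t kWaveSize = 32;
using Wave = std::array<float, kWaveSize>;

// Emulates the XOR shuffle for all lanes at once:
// lane i receives the value held by lane (i ^ laneMask).
// The index is wrapped to the wave size so this model never reads out of bounds.
Wave ShuffleXor(const Wave& input, std::size_t laneMask) {
    Wave output{};
    for (std::size_t lane = 0; lane < kWaveSize; ++lane) {
        output[lane] = input[(lane ^ laneMask) & (kWaveSize - 1)];
    }
    return output;
}

// Butterfly reduction built on the shuffle: after log2(kWaveSize)
// exchange steps, every lane holds the sum of all lanes' values.
Wave ButterflySum(Wave values) {
    for (std::size_t mask = kWaveSize / 2; mask > 0; mask >>= 1) {
        const Wave partner = ShuffleXor(values, mask);
        for (std::size_t lane = 0; lane < kWaveSize; ++lane) {
            values[lane] += partner[lane];
        }
    }
    return values;
}
```

In a shader, the same pattern would presumably call the shuffle once per step on each lane's own value, with no shared memory or explicit synchronization; that the exchange stays within the wave is what makes this kind of intrinsic attractive. The sketch assumes all lanes are active, which a real shader would need to ensure.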

About the Authors