AMD GCN Options (Using the GNU Compiler Collection (GCC))

These options are defined specifically for the AMD GCN port.

-march=gpu

-mtune=gpu

Set architecture type or tuning for gpu. Supported values for gpu are:

‘gfx900’

Compile for GCN5 Vega 10 devices (gfx900).

‘gfx902’

Compile for GCN5 Vega gfx902 devices. (Experimental)

‘gfx904’

Compile for GCN5 Vega gfx904 devices. (Experimental)

‘gfx906’

Compile for GCN5 Vega 20 devices (gfx906).

‘gfx908’

Compile for CDNA1 Instinct MI100 series devices (gfx908).

‘gfx909’

Compile for GCN5 Vega gfx909 devices. (Experimental)

‘gfx90a’

Compile for CDNA2 Instinct MI200 series devices (gfx90a).

‘gfx90c’

Compile for GCN5 Vega 7 devices (gfx90c).

‘gfx9-generic’

Compile generic code for Vega devices, executable on the following subset of GFX9 devices: gfx900, gfx902, gfx904, gfx906, gfx909 and gfx90c. (Experimental)

‘gfx1030’

Compile for RDNA2 gfx1030 devices (GFX10 series).

‘gfx1031’

Compile for RDNA2 gfx1031 devices (GFX10 series). (Experimental)

‘gfx1032’

Compile for RDNA2 gfx1032 devices (GFX10 series). (Experimental)

‘gfx1033’

Compile for RDNA2 gfx1033 devices (GFX10 series). (Experimental)

‘gfx1034’

Compile for RDNA2 gfx1034 devices (GFX10 series). (Experimental)

‘gfx1035’

Compile for RDNA2 gfx1035 devices (GFX10 series). (Experimental)

‘gfx1036’

Compile for RDNA2 gfx1036 devices (GFX10 series).

‘gfx10-3-generic’

Compile generic code for GFX10-3 devices, executable on gfx1030, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, and gfx1036. (Experimental)

‘gfx1100’

Compile for RDNA3 gfx1100 devices (GFX11 series).

‘gfx1101’

Compile for RDNA3 gfx1101 devices (GFX11 series). (Experimental)

‘gfx1102’

Compile for RDNA3 gfx1102 devices (GFX11 series). (Experimental)

‘gfx1103’

Compile for RDNA3 gfx1103 devices (GFX11 series).

‘gfx1150’

Compile for RDNA3 gfx1150 devices (GFX11 series). (Experimental)

‘gfx1151’

Compile for RDNA3 gfx1151 devices (GFX11 series). (Experimental)

‘gfx1152’

Compile for RDNA3 gfx1152 devices (GFX11 series). (Experimental)

‘gfx1153’

Compile for RDNA3 gfx1153 devices (GFX11 series). (Experimental)

‘gfx11-generic’

Compile generic code for GFX11 devices, executable on gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, and gfx1153. (Experimental)
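As a rough illustration of picking a value from the list above, the following sketch prints a -march flag only for the device names that are not marked experimental. The function name gcn_arch_flags is hypothetical and not part of GCC; the accepted names are copied from this section.

```sh
#!/bin/sh
# Hypothetical helper (not part of GCC): print a -march flag for a GPU
# name from the list above. Only names that are NOT marked
# "(Experimental)" in this section are accepted.
gcn_arch_flags() {
    case "$1" in
        gfx900|gfx906|gfx908|gfx90a|gfx90c|gfx1030|gfx1036|gfx1100|gfx1103)
            printf '%s\n' "-march=$1" ;;
        *)
            echo "unsupported or experimental gpu: $1" >&2
            return 1 ;;
    esac
}

gcn_arch_flags gfx90a   # prints: -march=gfx90a
```

The printed flag would then be passed to the compiler that targets the GPU. How that compiler is invoked depends on the installation and is not covered by this sketch.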

-msram-ecc=on

-msram-ecc=off

-msram-ecc=any

Compile binaries suitable for devices with the SRAM-ECC feature enabled, disabled, or either mode. This feature can be enabled per-process on some devices. The compiled code must match the device mode. The default is ‘any’, for devices that support it.

-mstack-size=bytes

Specify how many bytes of stack space will be requested for each GPU thread (wave-front). Beware that there may be many threads and limited memory available. The size of the stack allocation may also have an impact on run-time performance. The default is 32KB when using OpenACC or OpenMP, and 1MB otherwise.
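The warning about many threads and limited memory can be made concrete with a little arithmetic. The sketch below multiplies a per-thread stack size by a thread count; the function name stack_total_mib and the count of 4096 threads are made up for illustration, and 32KB and 1MB are assumed to mean 32768 and 1048576 bytes.

```sh
#!/bin/sh
# Hypothetical helper: estimate the total stack memory, in MiB, for a
# given per-thread stack size in bytes and a number of GPU threads.
# Real thread counts depend on the device and the launch configuration.
stack_total_mib() {
    bytes_per_thread=$1
    threads=$2
    echo $(( bytes_per_thread * threads / 1048576 ))
}

stack_total_mib 32768 4096     # 32KB default (OpenACC/OpenMP): prints 128
stack_total_mib 1048576 4096   # 1MB default otherwise: prints 4096
```

Under these assumptions, a smaller request such as -mstack-size=16384 would halve the first figure.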

-mxnack=on

-mxnack=off

-mxnack=any

Compile binaries suitable for devices with the XNACK feature enabled, disabled, or either mode. Some devices always require XNACK and some allow the user to configure XNACK. The compiled code must match the device mode. The default is ‘-mxnack=any’ on devices that support Unified Shared Memory, and ‘-mxnack=off’ otherwise.
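Because the compiled code must match the device mode for both XNACK and SRAM-ECC, a build script might derive the flag from whatever mode is known. The following is a minimal sketch, assuming the mode has already been determined by some other means; the function name gcn_mode_flag is hypothetical, and falling back to ‘any’ only makes sense on devices that support it.

```sh
#!/bin/sh
# Hypothetical helper: build an -mxnack= or -msram-ecc= flag.
# $1 is the feature name (xnack or sram-ecc).
# $2 is the known device mode: on, off, or anything else when unknown.
gcn_mode_flag() {
    case "$2" in
        on|off) printf '%s\n' "-m$1=$2" ;;
        *)      printf '%s\n' "-m$1=any" ;;   # either mode, where supported
    esac
}

gcn_mode_flag xnack on        # prints: -mxnack=on
gcn_mode_flag sram-ecc ""     # prints: -msram-ecc=any
```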