Permute instruction
Permute (and shuffle) instructions, used in bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, with the element positions specified by a second source array of indices. The size (bit width) of the source elements is not restricted, but it is the same as the destination element size. There are two important permute variants, known as gather and scatter. The gather variant is: for i = 0 to length-1: dest[i] = src[indices[i]]. The scatter variant is: for i = 0 to length-1: dest[indices[i]] = src[i].
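The two loops above can be sketched in Python on plain lists. This is only an illustration of the indexing, not a model of any particular instruction set. The names `permute_gather` and `permute_scatter` are hypothetical, and the scatter behaviour for untouched destination slots (they keep a `fill` value) and for colliding writes (the last write wins) is an assumption, since the description does not specify either.

```python
def permute_gather(src, indices):
    """Gather variant: dest[i] = src[indices[i]] for every i."""
    return [src[j] for j in indices]


def permute_scatter(src, indices, length=None, fill=0):
    """Scatter variant: dest[indices[i]] = src[i] for every i.

    Assumptions (not stated in the description): slots that no index
    targets keep `fill`, and when an index repeats the last write wins.
    """
    if length is None:
        length = len(src)
    dest = [fill] * length
    for i, j in enumerate(indices):
        dest[j] = src[i]
    return dest


src = ["a", "b", "c", "d"]
indices = [2, 0, 3, 1]

gathered = permute_gather(src, indices)    # ['c', 'a', 'd', 'b']
scattered = permute_scatter(src, indices)  # ['b', 'd', 'a', 'c']
```

When the index vector is a true permutation, as it is here, scatter undoes gather: `permute_scatter(gathered, indices)` returns the original `src`.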
Property | Value |
---|---|
dbo:abstract | Permute (and shuffle) instructions, used in bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, with the element positions specified by a second source array of indices. The size (bit width) of the source elements is not restricted, but it is the same as the destination element size. There are two important permute variants, known as gather and scatter. The gather variant is: for i = 0 to length-1: dest[i] = src[indices[i]]. The scatter variant is: for i = 0 to length-1: dest[indices[i]] = src[i]. Note that, unlike in memory-based gather-scatter, all three of dest, src, and indices are registers (or parts of registers in the case of bit-level permute), not memory locations. The scatter variant "scatters" the source elements across (into) the destination, whereas the gather variant "gathers" data from the indexed source elements. Because indices may be repeated in both variants, the result is not a strict mathematical permutation: duplicates can occur in the output. A special case of permute is also used in GPU "swizzling" (again, not strictly a permutation), which reorders subvector data on the fly so as to align or duplicate elements with the appropriate SIMD lane. (en) |
dbo:wikiPageExternalLink | https://www.felixcloutier.com/x86/vshuff32x4:vshuff64x2:vshufi32x4:vshufi64x2 https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/coding-for-neon---part-5-rearranging-vectors https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf |
dbo:wikiPageID | 67879453 (xsd:integer) |
dbo:wikiPageLength | 6219 (xsd:nonNegativeInteger) |
dbo:wikiPageRevisionID | 1099397344 (xsd:integer) |
dbo:wikiPageWikiLink | dbr:Power_ISA dbr:Scalable_Vector_Extension dbr:Bit_manipulation dbr:Permutation dbr:Vector_processors dbr:PowerPC_G4 dbr:Cryptographic dbr:SIMD dbr:GNU_Compiler_Collection dbr:Gather-scatter_(vector_addressing) dbr:Swizzling_(computer_graphics) dbr:AVX-512 dbr:AltiVec dbr:Graphics_processing_unit dbc:Binary_arithmetic dbc:Computer_arithmetic dbr:LLVM dbr:OpenCL dbr:RISC-V dbr:Scalar_processor dbr:Vector_processing dbr:SIMD_lane |
dbp:wikiPageUsesTemplate | dbt:About dbt:Code dbt:Refimprove dbt:Reflist dbt:Slink dbt:Multimedia_extensions |
dct:subject | dbc:Binary_arithmetic dbc:Computer_arithmetic |
rdfs:comment | Permute (and shuffle) instructions, used in bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, with the element positions specified by a second source array of indices. The size (bit width) of the source elements is not restricted, but it is the same as the destination element size. There are two important permute variants, known as gather and scatter. The gather variant is: for i = 0 to length-1: dest[i] = src[indices[i]]. The scatter variant is: for i = 0 to length-1: dest[indices[i]] = src[i]. (en) |
rdfs:label | Permute instruction (en) |
owl:sameAs | wikidata:Permute instruction https://global.dbpedia.org/id/FsJr8 |
prov:wasDerivedFrom | wikipedia-en:Permute_instruction?oldid=1099397344&ns=0 |
foaf:homepage | https://www.felixcloutier.com/x86/vshuff32x4:vshuff64x2:vshufi32x4:vshufi64x2 |
foaf:isPrimaryTopicOf | wikipedia-en:Permute_instruction |
is dbo:wikiPageWikiLink of | dbr:AoS_and_SoA dbr:PowerPC_G4 dbr:AVX-512 dbr:AltiVec dbr:Vector_processor dbr:Interleaving_(data) dbr:Single_instruction,_multiple_data |
is foaf:primaryTopic of | wikipedia-en:Permute_instruction |
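
The abstract's remark that repeated indices make the result something other than a strict permutation can be illustrated with a short Python sketch on plain lists. The helper names `gather` and `scatter` are hypothetical. In the scatter case, the `None` placeholder for untouched slots and the last-write-wins rule for colliding indices are assumptions made for the illustration, not behaviour taken from any specific instruction set.

```python
def gather(src, indices):
    """dest[i] = src[indices[i]]"""
    return [src[j] for j in indices]


def scatter(src, indices, fill=None):
    """dest[indices[i]] = src[i]; untouched slots keep `fill` (assumption)."""
    dest = [fill] * len(src)
    for i, j in enumerate(indices):
        dest[j] = src[i]  # a repeated index overwrites the earlier write
    return dest


src = [10, 20, 30, 40]
repeated = [1, 1, 3, 3]

# Gather with repeated indices duplicates some source elements and drops others.
dup = gather(src, repeated)          # [20, 20, 40, 40]

# Scatter with repeated indices loses elements to overwrites.
lossy = scatter(src, repeated)       # [None, 20, None, 40]

# A swizzle-style broadcast: one source element copied into every lane.
broadcast = gather(src, [0, 0, 0, 0])  # [10, 10, 10, 10]
```

In none of these three results does every source element appear exactly once, which is why such operations are not permutations in the mathematical sense.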