MLIR: CopyBuilder Struct Reference
Helper to create the TMA operations corresponding to linalg::CopyOp.
Inheritance: CopyBuilder inherits from HopperBuilder.
Public Member Functions | |
---|---|
| | CopyBuilder (RewriterBase &rewriter, Location loc) |
SmallVector< Operation * > | rewrite (ArrayRef< Operation * > copyOps) |
Public Member Functions inherited from HopperBuilder | |
| | HopperBuilder (RewriterBase &rewriter, Location loc) |
TypedValue< nvgpu::MBarrierGroupType > | buildAndInitBarrierInSharedMemory (OpFoldResult numThreads) |
TypedValue< nvgpu::TensorMapDescriptorType > | buildGlobalMemRefDescriptor (TypedValue< MemRefType > memref, gpu::LaunchOp launchOp) |
| | Create the TMA descriptor op used to initiate a transfer from global to shared memory. |
OpFoldResult | buildTmaAsyncLoad (TypedValue< nvgpu::TensorMapDescriptorType > globalDesc, TypedValue< MemRefType > sharedMemref, TypedValue< nvgpu::MBarrierGroupType > barrier, SmallVectorImpl< Operation * > &loadOps) |
| | Build a TMA load from global memory to shared memory, using the barrier to synchronize. |
void | buildBarrierArriveTx (TypedValue< nvgpu::MBarrierGroupType > barrier, ArrayRef< OpFoldResult > sizes) |
SmallVector< Operation * > | buildPredicateLoadsOnThread0 (ArrayRef< TypedValue< nvgpu::TensorMapDescriptorType >> globalDescriptors, ArrayRef< TypedValue< MemRefType >> sharedMemBuffers, TypedValue< nvgpu::MBarrierGroupType > barrier) |
| | If threadIdx.x == 0, issue the TMA request and wait; otherwise just wait. |
void | buildTryWaitParity (TypedValue< nvgpu::MBarrierGroupType > barrier) |
Additional Inherited Members | |
---|---|
Public Attributes inherited from HopperBuilder | |
RewriterBase & | rewriter |
Location | loc |
Detailed Description
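The brief for buildPredicateLoadsOnThread0 says that thread 0 issues the TMA request and waits, while every other thread only waits. The sketch below is a toy model of that predication in plain C++, not MLIR code: ToyBarrier, toyAsyncLoad, ToyResult, and simulatePredicatedLoads are hypothetical names introduced here for illustration, threads are simulated sequentially, and a "load" is an ordinary copy. It also assumes that the non-issuing threads arrive at the barrier announcing zero bytes, and that thread 0 announces the byte count after issuing its loads; neither detail is stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a shared-memory barrier. Hypothetical type for
// illustration only; it is not an MLIR, NVGPU, or CUDA API.
struct ToyBarrier {
  int pendingArrivals;            // threads that have not arrived yet
  std::int64_t pendingBytes = 0;  // announced bytes minus delivered bytes

  explicit ToyBarrier(int numThreads) : pendingArrivals(numThreads) {}

  // A thread arrives and announces how many bytes it expects to be delivered.
  void arriveExpectTx(std::int64_t bytes) {
    pendingBytes += bytes;
    --pendingArrivals;
  }
  // The (simulated) copy engine reports delivered bytes.
  void completeTx(std::int64_t bytes) { pendingBytes -= bytes; }
  // The phase completes once all threads arrived and all bytes landed.
  bool phaseComplete() const {
    return pendingArrivals == 0 && pendingBytes == 0;
  }
};

using Buffer = std::vector<float>;

// Toy "async load": copies one global buffer into its shared buffer, reports
// the delivered bytes to the barrier, and returns the transfer size in bytes.
std::int64_t toyAsyncLoad(const Buffer &global, Buffer &shared,
                          ToyBarrier &barrier) {
  shared = global;
  const auto bytes = static_cast<std::int64_t>(global.size() * sizeof(float));
  barrier.completeTx(bytes);
  return bytes;
}

struct ToyResult {
  std::vector<Buffer> shared;       // contents of the simulated shared memory
  std::int64_t announcedBytes = 0;  // bytes thread 0 announced to the barrier
  bool phaseComplete = false;       // would the waiting threads be released?
};

// Simulates one thread block: only thread 0 issues the loads; every thread
// arrives at the barrier, and all of them would then wait on the same phase.
ToyResult simulatePredicatedLoads(const std::vector<Buffer> &global,
                                  int numThreads) {
  assert(numThreads > 0 && "need at least thread 0");
  ToyResult result;
  result.shared.resize(global.size());
  ToyBarrier barrier(numThreads);

  for (int tid = 0; tid < numThreads; ++tid) {
    if (tid == 0) {
      // Thread 0: issue every load and sum up the transferred bytes.
      std::int64_t total = 0;
      for (std::size_t i = 0; i < global.size(); ++i)
        total += toyAsyncLoad(global[i], result.shared[i], barrier);
      barrier.arriveExpectTx(total);
      result.announcedBytes = total;
    } else {
      // Other threads: arrive without announcing any bytes (assumption).
      barrier.arriveExpectTx(0);
    }
  }
  result.phaseComplete = barrier.phaseComplete();
  return result;
}
```

In this model, calling simulatePredicatedLoads with a list of buffers and a thread count leaves the barrier phase complete only when every thread has arrived and thread 0 has announced exactly the number of bytes that were delivered, which is why a single barrier can release all waiting threads at once.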
Constructor & Destructor Documentation
Member Function Documentation
rewrite()
SmallVector< Operation * > CopyBuilder::rewrite (ArrayRef< Operation * > copyOps)
The documentation for this struct was generated from the following file:
- lib/Dialect/NVGPU/TransformOps/NVGPUTransformOps.cpp