Commit 364aa4f

Format examples: add blank line before headings
1 parent 78cba79 commit 364aa4f
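
The rule in the commit title, a blank line before each example heading, is mechanical enough to script. The following is a minimal sketch of one way to do that, not the method used to produce this commit. The function name `add_blank_before_headings` is hypothetical, and the sketch assumes headings are ATX-style (`#` through `######` followed by whitespace) and that code fences open and close with a line starting with three backticks.

```python
import re

# ATX-style Markdown heading: optional indentation, 1-6 '#', then whitespace.
HEADING = re.compile(r"^\s*#{1,6}\s")


def add_blank_before_headings(text: str) -> str:
    """Insert an empty line before each heading that directly follows text.

    Lines inside fenced code blocks are left alone, so a '# ...' comment in
    an example is never mistaken for a heading.
    """
    out = []
    in_fence = False
    for line in text.split("\n"):
        if line.strip().startswith("```"):
            # Toggle fence state on every opening or closing fence line.
            in_fence = not in_fence
        elif not in_fence and HEADING.match(line) and out and out[-1].strip():
            # Heading preceded by a non-blank line: add the separator.
            out.append("")
        out.append(line)
    return "\n".join(out)
```

Because a heading that already has a blank line above it is skipped, running the function a second time leaves the text unchanged.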

1 file changed: +13 -0 lines changed

mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td

Lines changed: 13 additions & 0 deletions
@@ -106,6 +106,7 @@ def AMDGPU_ExtPackedFp8Op :
 If the passed-in vector has fewer than four elements, or the input is scalar,
 the remaining values in the <4 x i8> will be filled with
 undefined values as needed.
+
 #### Example
 ```mlir
 // Extract single FP8 element to scalar f32
@@ -171,6 +172,7 @@ def AMDGPU_PackedTrunc2xFp8Op :
 sub-registers, and so the conversion intrinsics (which are currently the
 only way to work with 8-bit float types) take packed vectors of 4 8-bit
 values.
+
 #### Example
 ```mlir
 %result = amdgpu.packed_trunc_2xfp8 %src1, %src2 into %dest[word 1]
@@ -234,6 +236,7 @@ def AMDGPU_PackedStochRoundFp8Op :
 sub-registers, and so the conversion intrinsics (which are currently the
 only way to work with 8-bit float types) take packed vectors of 4 8-bit
 values.
+
 #### Example
 ```mlir
 %result = amdgpu.packed_stoch_round_fp8 %src + %stoch_seed into %dest[2]
@@ -364,6 +367,7 @@ def AMDGPU_RawBufferLoadOp :
 - If `boundsCheck` is false and the target chipset is RDNA, OOB_SELECT is set
 to 2 to disable bounds checks, otherwise it is 3
 - The cache coherency bits are off
+
 #### Example
 ```mlir
 // Load scalar f32 from 1D buffer
@@ -413,6 +417,7 @@ def AMDGPU_RawBufferStoreOp :

 See `amdgpu.raw_buffer_load` for a description of how the underlying
 instruction is constructed.
+
 #### Example
 ```mlir
 // Store scalar f32 to 1D buffer
@@ -465,6 +470,7 @@ def AMDGPU_RawBufferAtomicCmpswapOp :

 See `amdgpu.raw_buffer_load` for a description of how the underlying
 instruction is constructed.
+
 #### Example
 ```mlir
 // Atomic compare-swap
@@ -510,6 +516,7 @@ def AMDGPU_RawBufferAtomicFaddOp :

 See `amdgpu.raw_buffer_load` for a description of how the underlying
 instruction is constructed.
+
 #### Example
 ```mlir
 // Atomic floating-point add
@@ -710,6 +717,7 @@ def AMDGPU_SwizzleBitModeOp : AMDGPU_Op<"swizzle_bitmode",

 Supports arbitrary int/float/vector types, which will be repacked to i32 and
 one or more `rocdl.ds_swizzle` ops during lowering.
+
 #### Example
 ```mlir
 %result = amdgpu.swizzle_bitmode %src 1 2 4 : f32
@@ -740,6 +748,7 @@ def AMDGPU_LDSBarrierOp : AMDGPU_Op<"lds_barrier"> {
 (those which will implement this barrier by emitting inline assembly),
 use of this operation will impede the usabiliity of memory watches (including
 breakpoints set on variables) when debugging.
+
 #### Example
 ```mlir
 amdgpu.lds_barrier
@@ -782,6 +791,7 @@ def AMDGPU_SchedBarrierOp :
 `amdgpu.sched_barrier` serves as a barrier that could be
 configured to restrict movements of instructions through it as
 defined by sched_barrier_opts.
+
 #### Example
 ```mlir
 // Barrier allowing no dependent instructions
@@ -888,6 +898,7 @@ def AMDGPU_MFMAOp :

 The negateA, negateB, and negateC flags are only supported for double-precision
 operations on gfx94x.
+
 #### Example
 ```mlir
 %result = amdgpu.mfma %a * %b + %c
@@ -935,6 +946,7 @@ def AMDGPU_WMMAOp :

 The `clamp` flag is used to saturate the output of type T to numeric_limits<T>::max()
 in case of overflow.
+
 #### Example
 ```mlir
 %result = amdgpu.wmma %a * %b + %c
@@ -1062,6 +1074,7 @@ def AMDGPU_ScaledMFMAOp :
 are omitted from this wrapper.
 - The `negateA`, `negateB`, and `negateC` flags in `amdgpu.mfma` are only supported for
 double-precision operations on gfx94x and so are not included here.
+
 #### Example
 ```mlir
 %result = amdgpu.scaled_mfma
