[MLIR][NVVM] Add support for griddepcontrol Ops (#124603)
Adds `griddepcontrol.wait` and `griddepcontrol.launch.dependents`
MLIR Ops to generate griddepcontrol instructions.
`griddepcontrol` - Allows dependent and prerequisite grids, as defined by
the runtime, to control execution in the following ways:
- `griddepcontrol.wait` - causes the executing thread to wait until all
prerequisite grids in flight have completed and all the memory
operations from the prerequisite grids are performed and made visible
to the current grid.
- `griddepcontrol.launch.dependents` - signals that specific dependents
the runtime system designated to react to this instruction can be
scheduled as soon as all other CTAs in the grid issue the same
instruction or have completed.
PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol
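
A minimal sketch of how the two Ops might appear in NVVM dialect IR. It assumes
they are spelled with the `nvvm.` dialect prefix and take no operands or
results, which this message does not state explicitly. The function name
`@consumer_kernel` is hypothetical and used only for illustration.

```mlir
// Hypothetical kernel body; the function name is illustrative only.
llvm.func @consumer_kernel() {
  // Wait until all prerequisite grids in flight have completed and their
  // memory operations are visible to the current grid.
  nvvm.griddepcontrol.wait

  // ... work that depends on results from the prerequisite grids ...

  // Signal that the designated dependent grids can be scheduled once all
  // other CTAs in this grid issue the same instruction or have completed.
  nvvm.griddepcontrol.launch.dependents
  llvm.return
}
```

Placing the wait before any read of the prerequisite grids' output, and the launch signal once the data needed by dependents has been produced, is one plausible ordering; the appropriate placement depends on the kernel.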
+ Causes the executing thread to wait until all prerequisite grids in flight
+ have completed and all the memory operations from the prerequisite grids
+ are performed and made visible to the current grid.
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol)
+ Signals that specific dependents the runtime system designated to react to
+ this instruction can be scheduled as soon as all other CTAs in the grid
+ issue the same instruction or have completed.
+ [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol)