# 'gpu' Dialect

Note: this dialect is more likely to change than others in the near future; use
with caution.

This dialect provides middle-level abstractions for launching GPU kernels
following a programming model similar to that of CUDA or OpenCL. It provides
abstractions for kernel invocations (and may eventually provide those for device
management) that are not present at the lower level (e.g., as LLVM IR intrinsics
for GPUs). Its goal is to abstract away device- and driver-specific
manipulations to launch a GPU kernel and provide a simple path towards GPU
execution from MLIR. It may be targeted, for example, by DSLs using MLIR. The
dialect uses `gpu` as its canonical prefix.

## Memory attribution

Memory buffers are defined at the function level, either in `gpu.launch` or in
`gpu.func` ops. This encoding makes it clear where the memory belongs and makes
the lifetime of the memory visible. The memory is only accessible while the
kernel is launched or the function is being invoked. The latter is stricter
than actual GPU implementations, but using static memory at the function level
is just for convenience. It is also always possible to pass pointers to the
workgroup memory into other functions, provided they expect the correct memory
space.

The buffers are considered live throughout the execution of the GPU function
body. The absence of memory attribution syntax means that the function does not
require special buffers.
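
The attribution described above can be sketched roughly as follows. This is an
illustrative fragment, not a normative definition: the module and function
names are hypothetical, the numeric memory spaces (3 for workgroup, 5 for
private) are an assumed convention, and the exact syntax may differ between
MLIR versions.

```mlir
// Hypothetical module and kernel names. Memory spaces 3 (workgroup) and
// 5 (private) are assumed numeric conventions for this sketch.
gpu.module @kernels {
  // Two attributed buffers, considered live for the whole function body.
  gpu.func @with_buffers(%arg0: memref<?xf32>)
      workgroup(%shared: memref<32xf32, 3>)
      private(%scratch: memref<1xf32, 5>)
      kernel {
    gpu.return
  }

  // No attribution syntax: this function requires no special buffers.
  gpu.func @without_buffers(%arg0: memref<?xf32>) kernel {
    gpu.return
  }
}
```

Because the buffers appear in the function signature, their owner and lifetime
can be read directly off the IR without a separate analysis.
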
Rationale: although the underlying models declare memory buffers at the module
level, we chose to do it at the function level to provide some structuring for
the lifetime of those buffers. This removes the incentive to use the buffers
for communicating between different kernels or launches of the same kernel,
which should be done through function arguments instead. We chose not to use an
`alloca`-style approach, which would require more complex lifetime analysis,
following the principles of MLIR that promote structure and representing
analysis results in the IR.

## Operations

[include "Dialects/GPUOps.md"]