# 'llvm' Dialect

This dialect maps [LLVM IR](https://llvm.org/docs/LangRef.html) into MLIR by
defining the corresponding operations and types. LLVM IR metadata is usually
represented as MLIR attributes, which offer additional structure verification.

We use "LLVM IR" to designate the
[intermediate representation of LLVM](https://llvm.org/docs/LangRef.html) and
"LLVM _dialect_" or "LLVM IR _dialect_" to refer to this MLIR dialect.

Unless explicitly stated otherwise, the semantics of the LLVM dialect operations
must correspond to the semantics of LLVM IR instructions, and any divergence is
considered a bug. The dialect also contains auxiliary operations that smooth
over the differences in IR structure, e.g., MLIR does not have `phi` operations
and LLVM IR does not have a `constant` operation. These auxiliary operations are
systematically prefixed with `mlir`, e.g., `llvm.mlir.constant`, where `llvm.` is
the dialect namespace prefix.

[TOC]

## Dependency on LLVM IR

The LLVM dialect is not expected to depend on any object that requires an
`LLVMContext`, such as an LLVM IR instruction or type. Instead, MLIR provides
thread-safe alternatives compatible with the rest of the infrastructure. The
dialect is allowed to depend on the LLVM IR objects that don't require a
context, such as data layout and triple description.

## Module Structure

IR modules use the built-in MLIR `ModuleOp` and support all its features. In
particular, modules can be named and nested, and are subject to symbol
visibility. Modules can contain any operations, including LLVM functions and
globals.
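
As a minimal sketch of the points above, the following shows a named module
nested inside another named module, each holding an LLVM function. The symbol
names `@outer`, `@inner`, `@helper` and `@entry` are hypothetical and chosen
only for illustration.

```mlir
// Hypothetical named module containing a nested named module.
module @outer {
  module @inner {
    // LLVM function living in the nested module.
    llvm.func @helper() {
      llvm.return
    }
  }

  // LLVM function living directly in the outer module.
  llvm.func @entry() {
    llvm.return
  }
}
```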

### Data Layout and Triple

An IR module may have optional data layout and target triple information
attached using the MLIR attributes `llvm.data_layout` and `llvm.target_triple`,
respectively. Both are string attributes with the
[same syntax](https://llvm.org/docs/LangRef.html#data-layout) as in LLVM IR and
are verified to be correct. They can be defined as follows.

```mlir
module attributes {llvm.data_layout = "e",
                   llvm.target_triple = "aarch64-linux-android"} {
  // module contents
}
```

### Functions

LLVM functions are represented by a special operation, `llvm.func`, that has
syntax similar to that of the built-in function operation but supports
LLVM-related features such as linkage and variadic argument lists. See the
detailed description in the operation list [below](#llvmfunc-mlirllvmllvmfuncop).
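
The following sketch illustrates the two features mentioned above, assuming the
type syntax described later in this document: a variadic declaration and a
regular definition. The function names `@printf_like` and `@add_one` are
hypothetical.

```mlir
// Declaration of a variadic function: no body, trailing `...` in the
// argument list.
llvm.func @printf_like(!llvm.ptr<i8>, ...) -> i32

// Definition with a body; the single result type follows `->`.
llvm.func @add_one(%arg0: i32) -> i32 {
  %0 = llvm.mlir.constant(1 : i32) : i32
  %1 = llvm.add %arg0, %0 : i32
  llvm.return %1 : i32
}
```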

### PHI Nodes and Block Arguments

MLIR uses block arguments instead of PHI nodes to communicate values between
blocks. Therefore, the LLVM dialect has no operation directly equivalent to
`phi` in LLVM IR. Instead, all terminators can pass values as successor
operands; these values are forwarded as block arguments when the control flow
is transferred.

For example:

```mlir
^bb1:
  %0 = llvm.add %arg0, %cst : i32
  llvm.br ^bb2(%0 : i32)

// If the control flow comes from ^bb1, %arg1 == %0.
^bb2(%arg1: i32):
  // ...
```

is equivalent to LLVM IR

```llvm
bb1:
  %0 = add i32 %arg0, %cst
  br label %bb2

bb2:
  ; Incoming values from other predecessors are elided.
  %arg1 = phi i32 [ %0, %bb1 ] ; ...
```

Since there is no need to use the block identifier to differentiate the source
of different values, the LLVM dialect supports terminators that transfer the
control flow to the same block with different arguments. For example:

```mlir
^bb1:
  llvm.cond_br %cond, ^bb2(%0 : i32), ^bb2(%1 : i32)

^bb2(%arg0: i32):
  // ...
```

### Context-Level Values

Some value kinds in LLVM IR, such as constants and undefs, are uniqued in
context and used directly in relevant operations. MLIR does not support such
values for thread-safety and concept parsimony reasons. Instead, regular values
are produced by dedicated operations that have the corresponding semantics:
[`llvm.mlir.constant`](#llvmmlirconstant-mlirllvmconstantop),
[`llvm.mlir.undef`](#llvmmlirundef-mlirllvmundefop), and
[`llvm.mlir.null`](#llvmmlirnull-mlirllvmnullop). Note how these operations are
prefixed with `mlir.` to indicate that they don't belong to LLVM IR but are only
necessary to model it in MLIR. The values produced by these operations are
usable just like any other value.

Examples:

```mlir
// Create an undefined value of structure type with a 32-bit integer followed
// by a float.
%0 = llvm.mlir.undef : !llvm.struct<(i32, f32)>

// Null pointer to i8.
%1 = llvm.mlir.null : !llvm.ptr<i8>

// Null pointer to a function with signature void().
%2 = llvm.mlir.null : !llvm.ptr<func<void ()>>

// Constant 42 as i32.
%3 = llvm.mlir.constant(42 : i32) : i32

// Splat dense vector constant.
%4 = llvm.mlir.constant(dense<1.0> : vector<4xf32>) : vector<4xf32>
```

Note that constants list the type twice. This is an artifact of the LLVM dialect
not using built-in types, which are used for typed MLIR attributes. The syntax
will be reevaluated after considering composite constants.

### Globals

Global variables are also defined using a special operation,
[`llvm.mlir.global`](#llvmmlirglobal-mlirllvmglobalop), located at the module
level. Globals are MLIR symbols and are identified by their name.

Since functions need to be isolated-from-above, i.e. values defined outside the
function cannot be directly used inside the function, an additional operation,
[`llvm.mlir.addressof`](#llvmmliraddressof-mlirllvmaddressofop), is provided to
locally define a value containing the _address_ of a global. The actual value
can then be loaded from that pointer, or a new value can be stored into it if
the global is not declared constant. This is similar to LLVM IR where globals
are accessed through name and have a pointer type.
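
The access pattern described above can be sketched as follows, assuming the
typed-pointer syntax used throughout this document. The symbol names
`@counter`, `@step` and `@bump` are hypothetical.

```mlir
// A mutable global and a constant global, both at module level.
llvm.mlir.global @counter(0 : i32) : i32
llvm.mlir.global constant @step(2 : i32) : i32

llvm.func @bump() -> i32 {
  // Materialize the addresses of the globals inside the function.
  %counter_ptr = llvm.mlir.addressof @counter : !llvm.ptr<i32>
  %step_ptr = llvm.mlir.addressof @step : !llvm.ptr<i32>

  // Load the current values through the pointers.
  %old = llvm.load %counter_ptr : !llvm.ptr<i32>
  %inc = llvm.load %step_ptr : !llvm.ptr<i32>

  // Only the non-constant global is written back.
  %new = llvm.add %old, %inc : i32
  llvm.store %new, %counter_ptr : !llvm.ptr<i32>
  llvm.return %old : i32
}
```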

### Linkage

Module-level named objects in the LLVM dialect, namely functions and globals,
have an optional _linkage_ attribute derived from LLVM IR
[linkage types](https://llvm.org/docs/LangRef.html#linkage-types). Linkage is
specified by the same keyword as in LLVM IR and is located between the operation
name (`llvm.func` or `llvm.mlir.global`) and the symbol name. If no linkage
keyword is present, `external` linkage is assumed by default. Linkage is
_distinct_ from MLIR symbol visibility.
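
A short sketch of where the linkage keyword goes, using linkage names taken
from LLVM IR. The symbol names are hypothetical.

```mlir
// No keyword: `external` linkage is assumed.
llvm.func @visible_everywhere()

// `internal` linkage on a function definition.
llvm.func internal @helper() {
  llvm.return
}

// Linkage keyword on globals, optionally followed by `constant`.
llvm.mlir.global linkonce @shared(0 : i32) : i32
llvm.mlir.global private constant @limit(128 : i64) : i64
```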

### Attribute Pass-Through

The LLVM dialect provides a mechanism to forward function-level attributes to
LLVM IR using the `passthrough` attribute. This is an array attribute containing
either string attributes or array attributes. In the former case, the value of
the string is interpreted as the name of an LLVM IR function attribute. In the
latter case, the array is expected to contain exactly two string attributes, the
first corresponding to the name of an LLVM IR function attribute, and the second
corresponding to its value. Note that even integer LLVM IR function attributes
have their value represented in the string form.

Example:

```mlir
llvm.func @func() attributes {
  passthrough = ["noinline",           // value-less attribute
                 ["alignstack", "4"],  // integer attribute with value
                 ["other", "attr"]]    // attribute unknown to LLVM
} {
  llvm.return
}
```

If the attribute is not known to LLVM IR, it will be attached as a string
attribute.

## Types

LLVM dialect uses built-in types whenever possible and defines a set of
complementary types, which correspond to the LLVM IR types that cannot be
directly represented with built-in types. Similarly to other MLIR context-owned
objects, the creation and manipulation of LLVM dialect types is thread-safe.

MLIR does not support module-scoped named type declarations, e.g. `%s = type
{i32, i32}` in LLVM IR. Instead, types must be fully specified at each use,
except for recursive types where only the first reference to a named type needs
to be fully specified. MLIR [type aliases](../LangRef.md/#type-aliases) can be used
to achieve more compact syntax.
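
As a small sketch of the alias mechanism, the snippet below names a literal
struct once and reuses the alias in two function declarations. The alias
`!pair` and the function names are hypothetical. The alias spelling assumes an
MLIR release that accepts alias definitions without the `type` keyword; some
older releases instead required `!pair = type !llvm.struct<(i32, i32)>`.

```mlir
// Hypothetical alias for a literal struct of two 32-bit integers.
!pair = !llvm.struct<(i32, i32)>

// The alias can be used wherever the full type could be spelled out.
llvm.func @take_pair(!pair) -> i32
llvm.func @make_pair(i32, i32) -> !pair
```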

The general syntax of LLVM dialect types is `!llvm.`, followed by a type kind
identifier (e.g., `ptr` for pointer or `struct` for structure) and by an
optional list of type parameters in angle brackets. The dialect follows MLIR
style for types with nested angle brackets and keyword specifiers rather than
using different bracket styles to differentiate types. Types inside the angle
brackets may omit the `!llvm.` prefix for brevity: the parser first attempts to
find a type (starting with `!` or a built-in type) and falls back to accepting a
keyword. For example, `!llvm.ptr<!llvm.ptr<i32>>` and `!llvm.ptr<ptr<i32>>` are
equivalent, with the latter being the canonical form, and denote a pointer to a
pointer to a 32-bit integer.

### Built-in Type Compatibility

LLVM dialect accepts a subset of built-in types that are referred to as _LLVM
dialect-compatible types_. The following types are compatible:

-   Signless integers - `iN` (`IntegerType`).
-   Floating point types - `bf16`, `f16`, `f32`, `f64`, `f80`, `f128`
    (`FloatType`).
-   1D vectors of signless integers or floating point types - `vector<NxT>`
    (`VectorType`).

Note that only a subset of types that can be represented by a given class is
compatible. For example, signed and unsigned integers are not compatible. The
LLVM dialect provides a function, `bool LLVM::isCompatibleType(Type)`, that can
be used as a compatibility check.

Each LLVM IR type corresponds to *exactly one* MLIR type, either built-in or
LLVM dialect type. For example, because `i32` is LLVM-compatible, there is no
`!llvm.i32` type. However, `!llvm.ptr<T>` is defined in the LLVM dialect as
there is no corresponding built-in type.
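
To make the rules above concrete, here is an illustrative list of built-in
types classified according to the compatibility criteria stated in this
section. The classification is derived from the list above rather than from
running the verifier.

```mlir
i32               // compatible: signless integer
f32               // compatible: floating point type
vector<4xf32>     // compatible: 1D vector of a compatible element type
si32              // NOT compatible: signed integer
ui8               // NOT compatible: unsigned integer
vector<2x2xf32>   // NOT compatible: vector with more than one dimension
```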

### Additional Simple Types

The following non-parametric types derived from the LLVM IR are available in the
LLVM dialect:

-   `!llvm.x86_mmx` (`LLVMX86MMXType`) - value held in an MMX register on x86
    machine.
-   `!llvm.ppc_fp128` (`LLVMPPCFP128Type`) - 128-bit floating-point value (two
    64 bits).
-   `!llvm.token` (`LLVMTokenType`) - a non-inspectable value associated with an
    operation.
-   `!llvm.metadata` (`LLVMMetadataType`) - LLVM IR metadata, to be used only if
    the metadata cannot be represented as structured MLIR attributes.
-   `!llvm.void` (`LLVMVoidType`) - does not represent any value; can only
    appear in function results.

These types represent a single value (or an absence thereof in the case of
`void`) and correspond to their LLVM IR counterparts.

### Additional Parametric Types

These types are parameterized by the types they contain, e.g., the pointee or
the element type, which can be either compatible built-in or LLVM dialect types.

#### Pointer Types

Pointer types specify an address in memory.

Pointer types are parametric types parameterized by the element type and the
address space. The address space is an integer, but this choice may be
reconsidered if MLIR implements named address spaces. Their syntax is as
follows:

```
  llvm-ptr-type ::= `!llvm.ptr<` type (`,` integer-literal)? `>`
```

where the optional integer literal corresponds to the address space. Both cases
are represented by `LLVMPointerType` internally.
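
A few illustrative instances of the grammar above; the address space number `3`
is an arbitrary choice for the example.

```mlir
!llvm.ptr<i8>                  // pointer to i8, address space omitted
!llvm.ptr<i8, 3>               // pointer to i8 in address space 3
!llvm.ptr<ptr<f32>>            // pointer to a pointer to f32
!llvm.ptr<struct<(i32, f32)>>  // pointer to a literal struct
```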

#### Array Types

Array types represent sequences of elements in memory. Array elements can be
addressed with a value unknown at compile time. Array types can be nested, i.e.,
an array can contain other arrays, but each array type has a single dimension.

Array types are parameterized by the fixed size and the element type.
Syntactically, their representation is the following:

```
  llvm-array-type ::= `!llvm.array<` integer-literal `x` type `>`
```

and they are internally represented as `LLVMArrayType`.
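
A few illustrative instances of the grammar above, including nesting of 1D
arrays to model a multi-dimensional layout:

```mlir
!llvm.array<4 x i32>             // array of 4 32-bit integers
!llvm.array<8 x ptr<i8>>         // array of 8 pointers to i8
!llvm.array<2 x array<3 x f32>>  // array of 2 arrays of 3 floats each
```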

#### Function Types

Function types represent the type of a function, i.e. its signature.

Function types are parameterized by the result type, the list of argument types
and by an optional "variadic" flag. Unlike built-in `FunctionType`, LLVM dialect
functions (`LLVMFunctionType`) always have a single result, which may be
`!llvm.void` if the function does not return anything. The syntax is as follows:

```
  llvm-func-type ::= `!llvm.func<` type `(` type-list (`,` `...`)? `)` `>`
```

For example,

```mlir
!llvm.func<void ()>           // a function with no arguments;
!llvm.func<i32 (f32, i32)>    // a function with two arguments and a result;
!llvm.func<void (i32, ...)>   // a variadic function with at least one argument.
```

In the LLVM dialect, functions are not first-class objects and one cannot have a
value of function type. Instead, one can take the address of a function and
operate on pointers to functions.
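
As a sketch of working with function pointers, the snippet below takes the
address of a function and calls it indirectly. It assumes the typed-pointer
syntax used in this document; the function names `@square` and
`@call_indirect` are hypothetical.

```mlir
llvm.func @square(%arg0: i32) -> i32 {
  %0 = llvm.mul %arg0, %arg0 : i32
  llvm.return %0 : i32
}

llvm.func @call_indirect(%arg0: i32) -> i32 {
  // Obtain a pointer to the function rather than a value of function type.
  %f = llvm.mlir.addressof @square : !llvm.ptr<func<i32 (i32)>>

  // Indirect call: the callee is the first operand.
  %r = llvm.call %f(%arg0) : (i32) -> i32
  llvm.return %r : i32
}
```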

### Vector Types

Vector types represent sequences of elements, typically when multiple data
elements are processed by a single instruction (SIMD). Vectors are thought of as
stored in registers and therefore vector elements can only be addressed through
constant indices.

Vector types are parameterized by the size, which may be either _fixed_ or a
multiple of some fixed size in case of _scalable_ vectors, and the element type.
Vectors cannot be nested and only 1D vectors are supported. Scalable vectors are
still considered 1D.

LLVM dialect uses built-in vector types for _fixed_-size vectors of built-in
types, and provides additional types for fixed-sized vectors of LLVM dialect
types (`LLVMFixedVectorType`) and scalable vectors of any types
(`LLVMScalableVectorType`). These two additional types share the following
syntax:

```
  llvm-vec-type ::= `!llvm.vec<` (`?` `x`)? integer-literal `x` type `>`
```

Note that the sets of element types supported by built-in and LLVM dialect
vector types are mutually exclusive, e.g., the built-in vector type does not
accept `!llvm.ptr<i32>` and the LLVM dialect fixed-width vector type does not
accept `i32`.

The following functions are provided to operate on any kind of the vector types
compatible with the LLVM dialect:

-   `bool LLVM::isCompatibleVectorType(Type)` - checks whether a type is a
    vector type compatible with the LLVM dialect;
-   `Type LLVM::getVectorElementType(Type)` - returns the element type of any
    vector type compatible with the LLVM dialect;
-   `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number
    of elements in any vector type compatible with the LLVM dialect;
-   `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type
    with the given element type and size; the resulting type is either a
    built-in or an LLVM dialect vector type depending on which one supports the
    given element type.

#### Examples of Compatible Vector Types

```mlir
vector<42 x i32>                   // Vector of 42 32-bit integers.
!llvm.vec<42 x ptr<i32>>           // Vector of 42 pointers to 32-bit integers.
!llvm.vec<? x 4 x i32>             // Scalable vector of 32-bit integers with
                                   // size divisible by 4.
!llvm.array<2 x vector<2 x i32>>   // Array of 2 vectors of 2 32-bit integers.
!llvm.array<2 x vec<2 x ptr<i32>>> // Array of 2 vectors of 2 pointers to 32-bit
                                   // integers.
```

### Structure Types

The structure type is used to represent a collection of data members together in
memory. The elements of a structure may be any type that has a size.

Structure types are represented in a single dedicated class,
`mlir::LLVM::LLVMStructType`. Internally, the struct type stores a (potentially
empty) name, a (potentially empty) list of contained types and a bitmask
indicating whether the struct is named, opaque, packed or uninitialized.
Structure types that don't have a name are referred to as _literal_ structs.
Such structures are uniquely identified by their contents. _Identified_ structs
on the other hand are uniquely identified by the name.

#### Identified Structure Types

Identified structure types are uniqued using their name in a given context.
Attempting to construct an identified structure with the same name as a
structure that already exists in the context *will result in the existing
structure being returned*. **MLIR does not auto-rename identified structs in
case of name conflicts** because there is no naming scope equivalent to a module
in LLVM IR since MLIR modules can be arbitrarily nested.

Programmatically, identified structures can be constructed in an _uninitialized_
state. In this case, they are given a name but the body must be set up by a
later call, using MLIR's type mutation mechanism. Such uninitialized types can
be used in type construction, but must be eventually initialized for IR to be
valid. This mechanism allows for constructing _recursive_ or mutually referring
structure types: an uninitialized type can be used in its own initialization.

Once the type is initialized, its body cannot be changed anymore. Any further
attempts to modify the body will fail and return failure to the caller _unless
the type is initialized with the exact same body_. Type initialization is
thread-safe; however, if a concurrent thread initializes the type before the
current thread, the initialization may return failure.

The syntax for identified structure types is as follows.

```
llvm-ident-struct-type ::= `!llvm.struct<` string-literal `,` `opaque` `>`
                         | `!llvm.struct<` string-literal `,` `packed`?
                           `(` type-or-ref-list  `)` `>`
type-or-ref-list ::= <maybe empty comma-separated list of type-or-ref>
type-or-ref ::= <any compatible type with optional !llvm.>
              | `!llvm.`? `struct<` string-literal `>`
```

The body of the identified struct is printed in full unless it is
transitively contained in the same struct. In the latter case, only the
identifier is printed. For example, the structure containing the pointer to
itself is represented as `!llvm.struct<"A", (ptr<struct<"A">>)>`, and the
structure `A` containing two pointers to the structure `B` containing a pointer
to the structure `A` is represented as
`!llvm.struct<"A", (ptr<struct<"B", (ptr<struct<"A">>)>>, ptr<struct<"B", (ptr<struct<"A">>)>>)>`.
Note that the structure `B` is "unrolled" for both elements. _A structure with
the same name but different body is a syntax error._
**The user must ensure structure name uniqueness across all modules processed in
a given MLIR context.** Structure names are arbitrary string literals and may
include, e.g., spaces and keywords.

Identified structs may be _opaque_. In this case, the body is unknown but the
structure type is considered _initialized_ and is valid in the IR.

#### Literal Structure Types

Literal structures are uniqued according to the list of elements they contain,
and can optionally be packed. The syntax for such structs is as follows.

```
llvm-literal-struct-type ::= `!llvm.struct<` `packed`? `(` type-list `)` `>`
type-list ::= <maybe empty comma-separated list of types with optional !llvm.>
```

Literal structs cannot be recursive, but can contain other structs. Therefore,
they must be constructed in a single step with the entire list of contained
elements provided.

#### Examples of Structure Types

```mlir
!llvm.struct<>                  // NOT allowed
!llvm.struct<()>                // empty, literal
!llvm.struct<(i32)>             // literal
!llvm.struct<(struct<(i32)>)>   // struct containing a struct
!llvm.struct<packed (i8, i32)>  // packed struct
!llvm.struct<"a">               // recursive reference, only allowed within
                                // another struct, NOT allowed at top level
!llvm.struct<"a", (ptr<struct<"a">>)>  // supported example of recursive reference
!llvm.struct<"a", ()>           // empty, named (necessary to differentiate from
                                // recursive reference)
!llvm.struct<"a", opaque>       // opaque, named
!llvm.struct<"a", (i32)>        // named
!llvm.struct<"a", packed (i8, i32)>  // named, packed
```

### Unsupported Types

LLVM IR `label` type does not have a counterpart in the LLVM dialect since, in
MLIR, blocks are not values and don't need a type.

## Operations

All operations in the LLVM IR dialect have a custom form in MLIR. The mnemonic
of an operation is that used in LLVM IR prefixed with "`llvm.`".

[include "Dialects/LLVMOps.md"]