# Chapter 2: Emitting Basic MLIR

[TOC]

Now that we're familiar with our language and the AST, let's see how MLIR can
help to compile Toy.

## Introduction: Multi-Level Intermediate Representation

Other compilers, like LLVM (see the
[Kaleidoscope tutorial](https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/index.html)),
offer a fixed set of predefined types and (usually *low-level* / RISC-like)
instructions. It is up to the frontend for a given language to perform any
language-specific type-checking, analysis, or transformation before emitting
LLVM IR. For example, Clang will use its AST to perform not only static analysis
but also transformations, such as C++ template instantiation through AST cloning
and rewrite. Finally, languages with constructs at a higher level than C/C++
may require non-trivial lowering from their AST to generate LLVM IR.

As a consequence, multiple frontends end up reimplementing significant pieces of
infrastructure to support these analyses and transformations. MLIR
addresses this issue by being designed for extensibility. As such, there are few
pre-defined instructions (*operations* in MLIR terminology) or types.

## Interfacing with MLIR

[Language reference](../../LangRef.md)

MLIR is designed to be a completely extensible infrastructure; there is no
closed set of attributes (think: constant metadata), operations, or types. MLIR
supports this extensibility with the concept of
[Dialects](../../LangRef.md#dialects). Dialects provide a grouping mechanism for
abstraction under a unique `namespace`.

In MLIR, [`Operations`](../../LangRef.md#operations) are the core unit of
abstraction and computation, similar in many ways to LLVM instructions.
Operations can have application-specific semantics and can be used to represent
all of the core IR structures in LLVM: instructions, globals (like functions),
modules, etc.

Here is the MLIR assembly for the Toy `transpose` operation:

```mlir
%t_tensor = "toy.transpose"(%tensor) {inplace = true} : (tensor<2x3xf64>) -> tensor<3x2xf64> loc("example/file/path":12:1)
```
Let's break down the anatomy of this MLIR operation:

-   `%t_tensor`

    *   The name given to the result defined by this operation (which includes
        [a prefixed sigil to avoid collisions](../../LangRef.md#identifiers-and-keywords)).
        An operation may define zero or more results (in the context of Toy, we
        will limit ourselves to single-result operations), which are SSA values.
        The name is used during parsing but is not persistent (e.g., it is not
        tracked in the in-memory representation of the SSA value).

-   `"toy.transpose"`

    *   The name of the operation. It is expected to be a unique string, with
        the namespace of the dialect prefixed before the "`.`". This can be read
        as the `transpose` operation in the `toy` dialect.

-   `(%tensor)`

    *   A list of zero or more input operands (or arguments), which are SSA
        values defined by other operations or referring to block arguments.

-   `{ inplace = true }`

    *   A dictionary of zero or more attributes, which are special operands that
        are always constant. Here we define a boolean attribute named 'inplace'
        that has a constant value of true.

-   `(tensor<2x3xf64>) -> tensor<3x2xf64>`

    *   This refers to the type of the operation in a functional form, spelling
        the types of the arguments in parentheses and the type of the return
        values afterward.

-   `loc("example/file/path":12:1)`

    *   This is the location in the source code from which this operation
        originated.
Shown here is the general form of an operation. As described above,
the set of operations in MLIR is extensible. Operations are modeled
using a small set of concepts, enabling operations to be reasoned
about and manipulated generically. These concepts are:

-   A name for the operation.
-   A list of SSA operand values.
-   A list of [attributes](../../LangRef.md#attributes).
-   A list of [types](../../LangRef.md#type-system) for result values.
-   A [source location](../../Diagnostics.md#source-locations) for debugging
    purposes.
-   A list of successor [blocks](../../LangRef.md#blocks) (for branches,
    mostly).
-   A list of [regions](../../LangRef.md#regions) (for structural operations
    like functions).
In MLIR, every operation has a mandatory source location associated with it.
Contrary to LLVM, where debug info locations are metadata and can be dropped, in
MLIR, the location is a core requirement, and APIs depend on and manipulate it.
Dropping a location is thus an explicit choice which cannot happen by mistake.

To provide an illustration: if a transformation replaces an operation with
another, that new operation must still have a location attached. This makes it
possible to track where that operation came from.

It's worth noting that the `mlir-opt` tool (a tool for testing
compiler passes) does not include locations in the output by default. The
`-mlir-print-debuginfo` flag specifies to include locations. (Run `mlir-opt
--help` for more options.)
### Opaque API

MLIR is designed to allow most IR elements, such as attributes,
operations, and types, to be customized. At the same time, IR
elements can always be reduced to the above fundamental concepts. This
allows MLIR to parse, represent, and
[round-trip](../../../getting_started/Glossary.md#round-trip) IR for
*any* operation. For example, we could place our Toy operation from
above into an `.mlir` file and round-trip through *mlir-opt* without
registering any dialect:

```mlir
func @toy_func(%tensor: tensor<2x3xf64>) -> tensor<3x2xf64> {
  %t_tensor = "toy.transpose"(%tensor) { inplace = true } : (tensor<2x3xf64>) -> tensor<3x2xf64>
  return %t_tensor : tensor<3x2xf64>
}
```

In the cases of unregistered attributes, operations, and types, MLIR
will enforce some structural constraints (SSA, block termination,
etc.), but otherwise they are completely opaque. For instance, MLIR
has little information about whether an unregistered operation can
operate on particular datatypes, how many operands it can take, or how
many results it produces. This flexibility can be useful for
bootstrapping purposes, but it is generally advised against in mature
systems. Unregistered operations must be treated conservatively by
transformations and analyses, and they are much harder to construct
and manipulate.

This handling can be observed by crafting what should be invalid IR for Toy
and seeing it round-trip without tripping the verifier:

```mlir
func @main() {
  %0 = "toy.print"() : () -> tensor<2x3xf64>
}
```

There are multiple problems here: the `toy.print` operation is not a terminator;
it should take an operand; and it shouldn't return any values. In the next
section, we will register our dialect and operations with MLIR, plug into the
verifier, and add nicer APIs to manipulate our operations.

## Defining a Toy Dialect

To effectively interface with MLIR, we will define a new Toy dialect. This
dialect will model the structure of the Toy language, as well as
provide an easy avenue for high-level analysis and transformation.

```c++
/// This is the definition of the Toy dialect. A dialect inherits from
/// mlir::Dialect and registers custom attributes, operations, and types (in its
/// constructor). It can also override virtual methods to change some general
/// behavior, which will be demonstrated in later chapters of the tutorial.
class ToyDialect : public mlir::Dialect {
 public:
  explicit ToyDialect(mlir::MLIRContext *ctx);

  /// Provide a utility accessor to the dialect namespace. This is used by
  /// several utilities.
  static llvm::StringRef getDialectNamespace() { return "toy"; }
};
```

The dialect can now be registered in the global registry:

```c++
  mlir::registerDialect<ToyDialect>();
```

Any new `MLIRContext` created from now on will contain an instance of the Toy
dialect and invoke specific hooks for things like parsing attributes and types.

## Defining Toy Operations

Now that we have a `Toy` dialect, we can start registering operations. This will
allow for providing semantic information that the rest of the system can hook
into. Let's walk through the creation of the `toy.constant` operation:

```mlir
 %4 = "toy.constant"() {value = dense<1.0> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
```

This operation takes zero operands, a
[dense elements](../../LangRef.md#dense-elements-attribute) attribute named
`value`, and returns a single result of
[TensorType](../../LangRef.md#tensor-type). An operation inherits from the
[CRTP](https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern)
`mlir::Op` class which also takes some optional [*traits*](../../Traits.md) to
customize its behavior. These traits may provide additional accessors,
verification, etc.

```c++
class ConstantOp : public mlir::Op<ConstantOp,
                     /// The ConstantOp takes no inputs.
                     mlir::OpTrait::ZeroOperands,
                     /// The ConstantOp returns a single result.
                     mlir::OpTrait::OneResult> {

 public:
  /// Inherit the constructors from the base Op class.
  using Op::Op;

  /// Provide the unique name for this operation. MLIR will use this to register
  /// the operation and uniquely identify it throughout the system.
  static llvm::StringRef getOperationName() { return "toy.constant"; }

  /// Return the value of the constant by fetching it from the attribute.
  mlir::DenseElementsAttr getValue();

  /// Operations can provide additional verification beyond the traits they
  /// define. Here we will ensure that the specific invariants of the constant
  /// operation are upheld, for example the result type must be of TensorType.
  LogicalResult verify();

  /// Provide an interface to build this operation from a set of input values.
  /// This interface is used by the builder to allow for easily generating
  /// instances of this operation:
  ///   mlir::OpBuilder::create<ConstantOp>(...)
  /// This method populates the given `state` that MLIR uses to create
  /// operations. This state is a collection of all of the discrete elements
  /// that an operation may contain.
  /// Build a constant with the given return type and `value` attribute.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::Type result, mlir::DenseElementsAttr value);
  /// Build a constant and reuse the type from the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::DenseElementsAttr value);
  /// Build a constant by broadcasting the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    double value);
};
```

and we register this operation in the `ToyDialect` constructor:

```c++
ToyDialect::ToyDialect(mlir::MLIRContext *ctx)
    : mlir::Dialect(getDialectNamespace(), ctx) {
  addOperations<ConstantOp>();
}
```

### Op vs Operation: Using MLIR Operations

Now that we have defined an operation, we will want to access and
transform it. In MLIR, there are two main classes related to
operations: `Operation` and `Op`. The `Operation` class is used to
generically model all operations. It is 'opaque', in the sense that
it does not describe the properties of particular operations or types
of operations. Instead, the `Operation` class provides a general API
into an operation instance. On the other hand, each specific type of
operation is represented by an `Op` derived class. For instance,
`ConstantOp` represents an operation with zero inputs and one output,
which is always set to the same value. `Op` derived classes act as
a smart-pointer wrapper around an `Operation*`, providing
operation-specific accessor methods and type-safe properties of
operations. This means that when we define our Toy operations, we are
simply defining a clean, semantically useful interface for building
and interfacing with the `Operation` class. This is why our
`ConstantOp` defines no class fields; all the data structures are
stored in the referenced `Operation`. A side effect is that we always
pass around `Op` derived classes by value, instead of by reference or
pointer (*passing by value* is a common idiom and applies similarly to
attributes, types, etc.). Given a generic `Operation*` instance, we
can always get a specific `Op` instance using LLVM's casting
infrastructure:

```c++
void processConstantOp(mlir::Operation *operation) {
  ConstantOp op = llvm::dyn_cast<ConstantOp>(operation);

  // This operation is not an instance of `ConstantOp`.
  if (!op)
    return;

  // Get the internal operation instance wrapped by the smart pointer.
  mlir::Operation *internalOperation = op.getOperation();
  assert(internalOperation == operation &&
         "these operation instances are the same");
}
```

### Using the Operation Definition Specification (ODS) Framework

In addition to specializing the `mlir::Op` C++ template, MLIR also supports
defining operations in a declarative manner. This is achieved via the
[Operation Definition Specification](../../OpDefinitions.md) framework. Facts
regarding an operation are specified concisely in a TableGen record, which
will be expanded into an equivalent `mlir::Op` C++ template specialization at
compile time. Using the ODS framework is the preferred way of defining
operations in MLIR given its simplicity, conciseness, and general stability in
the face of C++ API changes.

Let's see how to define the ODS equivalent of our `ConstantOp`:

The first thing to do is to define a link to the Toy dialect that we defined in
C++. This is used to link all of the operations that we will define to our
dialect:

```tablegen
// Provide a definition of the 'toy' dialect in the ODS framework so that we
// can define our operations.
def Toy_Dialect : Dialect {
  // The namespace of our dialect. This corresponds 1-1 with the string we
  // provided in `ToyDialect::getDialectNamespace`.
  let name = "toy";

  // The C++ namespace that the dialect class definition resides in.
  let cppNamespace = "toy";
}
```

Now that we have defined a link to the Toy dialect, we can start defining
operations. Operations in ODS are defined by inheriting from the `Op` class. To
simplify our operation definitions, we will define a base class for operations
in the Toy dialect.

```tablegen
// Base class for toy dialect operations. This operation inherits from the base
// `Op` class in OpBase.td, and provides:
//   * The parent dialect of the operation.
//   * The mnemonic for the operation, or the name without the dialect prefix.
//   * A list of traits for the operation.
class Toy_Op<string mnemonic, list<OpTrait> traits = []> :
    Op<Toy_Dialect, mnemonic, traits>;
```
With all of the preliminary pieces defined, we can begin to define the constant
operation.

We define a toy operation by inheriting from our base `Toy_Op` class above. Here
we provide the mnemonic and a list of traits for the operation. The
[mnemonic](../../OpDefinitions.md#operation-name) here matches the one given in
`ConstantOp::getOperationName` without the dialect prefix, `toy.`. Missing
from our C++ definition are the `ZeroOperands` and `OneResult` traits; these
will be automatically inferred based upon the `arguments` and `results` fields
we define later.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
}
```

At this point you might want to know what the C++ code generated by
TableGen looks like. Simply run the `mlir-tblgen` command with the
`gen-op-decls` or the `gen-op-defs` action like so:

```shell
${build_root}/bin/mlir-tblgen -gen-op-defs ${mlir_src_root}/examples/toy/Ch2/include/toy/Ops.td -I ${mlir_src_root}/include/
```

Depending on the selected action, this will print either the `ConstantOp` class
declaration or its implementation. Comparing this output to the hand-crafted
implementation is incredibly useful when getting started with TableGen.
#### Defining Arguments and Results

With the shell of the operation defined, we can now provide the
[inputs](../../OpDefinitions.md#operation-arguments) and
[outputs](../../OpDefinitions.md#operation-results) to our operation. The
inputs, or arguments, to an operation may be attributes or types for SSA operand
values. The results correspond to a set of types for the values produced by the
operation:

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);
}
```

By providing a name to the arguments or results, e.g. `$value`, ODS will
automatically generate a matching accessor: `DenseElementsAttr
ConstantOp::value()`.

#### Adding Documentation

The next step after defining the operation is to document it. Operations may
provide
[`summary` and `description`](../../OpDefinitions.md#operation-documentation)
fields to describe the semantics of the operation. This information is useful
for users of the dialect and can even be used to auto-generate Markdown
documents.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // Provide a summary and description for this operation. This can be used to
  // auto-generate documentation of the operations within our dialect.
  let summary = "constant operation";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:

      %0 = "toy.constant"()
         { value = dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]> : tensor<2x3xf64> }
        : () -> tensor<2x3xf64>
  }];

  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);
}
```

#### Verifying Operation Semantics

At this point we've already covered a majority of the original C++ operation
definition. The next piece to define is the verifier. Luckily, much like the
named accessor, the ODS framework will automatically generate a lot of the
necessary verification logic based upon the constraints we have given. This
means that we don't need to verify the structure of the return type, or even the
input attribute `value`. In many cases, additional verification is not even
necessary for ODS operations. To add additional verification logic, an operation
can override the [`verifier`](../../OpDefinitions.md#custom-verifier-code)
field. The `verifier` field allows for defining a C++ code blob that will be run
as part of `ConstantOp::verify`. This blob can assume that all of the other
invariants of the operation have already been verified:

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // Provide a summary and description for this operation. This can be used to
  // auto-generate documentation of the operations within our dialect.
  let summary = "constant operation";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:

      %0 = "toy.constant"()
         { value = dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]> : tensor<2x3xf64> }
        : () -> tensor<2x3xf64>
  }];

  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);

  // Add additional verification logic to the constant operation. Here we invoke
  // a static `verify` method in a C++ source file. This codeblock is executed
  // inside of ConstantOp::verify, so we can use `this` to refer to the current
  // operation instance.
  let verifier = [{ return ::verify(*this); }];
}
```

#### Attaching `build` Methods

The final missing components from our original C++ example are the `build`
methods. ODS can generate some simple build methods automatically, and in this
case it will generate our first build method for us. For the rest, we define the
[`builders`](../../OpDefinitions.md#custom-builder-methods) field. This field
takes a list of `OpBuilder` objects that take a string corresponding to a list
of C++ parameters, as well as an optional code block that can be used to specify
the implementation inline.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  ...

  // Add custom build methods for the constant operation. These methods populate
  // the `state` that MLIR uses to create operations, i.e. these are used when
  // using `builder.create<ConstantOp>(...)`.
  let builders = [
    // Build a constant with a given constant tensor value.
    OpBuilder<"OpBuilder &builder, OperationState &result, "
              "DenseElementsAttr value", [{
      // Call into an autogenerated `build` method.
      build(builder, result, value.getType(), value);
    }]>,

    // Build a constant with a given constant floating-point value. This builder
    // creates a declaration for `ConstantOp::build` with the given parameters.
    OpBuilder<"OpBuilder &builder, OperationState &result, double value">
  ];
}
```

#### Specifying a Custom Assembly Format

At this point we can generate our "Toy IR". For example, the following:

```toy
# User defined generic function that operates on unknown shaped arguments.
def multiply_transpose(a, b) {
  return transpose(a) * transpose(b);
}

def main() {
  var a<2, 3> = [[1, 2, 3], [4, 5, 6]];
  var b<2, 3> = [1, 2, 3, 4, 5, 6];
  var c = multiply_transpose(a, b);
  var d = multiply_transpose(b, a);
  print(d);
}
```

Results in the following IR:

```mlir
module {
  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
    %0 = "toy.transpose"(%arg0) : (tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:10)
    %1 = "toy.transpose"(%arg1) : (tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    %2 = "toy.mul"(%0, %1) : (tensor<*xf64>, tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    "toy.return"(%2) : (tensor<*xf64>) -> () loc("test/Examples/Toy/Ch2/codegen.toy":5:3)
  } loc("test/Examples/Toy/Ch2/codegen.toy":4:1)
  func @main() {
    %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:17)
    %1 = "toy.reshape"(%0) : (tensor<2x3xf64>) -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:3)
    %2 = "toy.constant"() {value = dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>} : () -> tensor<6xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:17)
    %3 = "toy.reshape"(%2) : (tensor<6xf64>) -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:3)
    %4 = "toy.generic_call"(%1, %3) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":11:11)
    %5 = "toy.generic_call"(%3, %1) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":12:11)
    "toy.print"(%5) : (tensor<*xf64>) -> () loc("test/Examples/Toy/Ch2/codegen.toy":13:3)
    "toy.return"() : () -> () loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
  } loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
} loc(unknown)
```

One thing to notice here is that all of our Toy operations are printed using the
generic assembly format. This format is the one shown when breaking down
`toy.transpose` at the beginning of this chapter. MLIR allows for operations to
define their own custom assembly format, either
[declaratively](../../OpDefinitions.md#declarative-assembly-format) or
imperatively via C++. Defining a custom assembly format allows for tailoring the
generated IR into something a bit more readable by removing a lot of the fluff
that is required by the generic format. Let's walk through an example of an
operation format that we would like to simplify.

##### `toy.print`

The current form of `toy.print` is a little verbose. There are a lot of
additional characters that we would like to strip away. Let's begin by thinking
of what a good format of `toy.print` would be, and see how we can implement it.
Looking at the basics of `toy.print`, we get:

```mlir
toy.print %5 : tensor<*xf64> loc(...)
```

Here we have stripped much of the format down to the bare essentials, and it has
become much more readable. To provide a custom assembly format, an operation can
either override the `parser` and `printer` fields for a C++ format, or the
`assemblyFormat` field for the declarative format. Let's look at the C++ variant
first, as this is what the declarative format maps to internally.

```tablegen
/// Consider a stripped definition of `toy.print` here.
def PrintOp : Toy_Op<"print"> {
  let arguments = (ins F64Tensor:$input);

  // Divert the printer and parser to static functions in our .cpp
  // file that correspond to 'print' and 'parsePrintOp'. 'printer' and 'parser'
  // here correspond to an instance of a 'OpAsmPrinter' and 'OpAsmParser'. More
  // details on these classes are shown below.
  let printer = [{ return ::print(printer, *this); }];
  let parser = [{ return ::parse$cppClass(parser, result); }];
}
```

A C++ implementation for the printer and parser is shown below:

```c++
/// The 'OpAsmPrinter' class is a stream that allows for formatting
/// strings, attributes, operands, types, etc.
static void print(mlir::OpAsmPrinter &printer, PrintOp op) {
  printer << "toy.print " << op.input();
  printer.printOptionalAttrDict(op.getAttrs());
  printer << " : " << op.input().getType();
}

/// The 'OpAsmParser' class provides a collection of methods for parsing
/// various punctuation, as well as attributes, operands, types, etc. Each of
/// these methods returns a `ParseResult`. This class is a wrapper around
/// `LogicalResult` that can be converted to a boolean `true` value on failure,
/// or `false` on success. This allows for easily chaining together a set of
/// parser rules. These rules are used to populate an `mlir::OperationState`
/// similarly to the `build` methods described above.
static mlir::ParseResult parsePrintOp(mlir::OpAsmParser &parser,
                                      mlir::OperationState &result) {
  // Parse the input operand, the attribute dictionary, and the type of the
  // input.
  mlir::OpAsmParser::OperandType inputOperand;
  mlir::Type inputType;
  if (parser.parseOperand(inputOperand) ||
      parser.parseOptionalAttrDict(result.attributes) || parser.parseColon() ||
      parser.parseType(inputType))
    return mlir::failure();

  // Resolve the input operand to the type we parsed in.
  if (parser.resolveOperand(inputOperand, inputType, result.operands))
    return mlir::failure();

  return mlir::success();
}
```

With the C++ implementation defined, let's see how this can be mapped to the
[declarative format](../../OpDefinitions.md#declarative-assembly-format). The
declarative format is largely composed of three different components:

*   Directives
    -   A type of builtin function, with an optional set of arguments.
*   Literals
    -   A keyword or punctuation surrounded by \`\`.
*   Variables
    -   An entity that has been registered on the operation itself, i.e. an
        argument (attribute or operand), result, successor, etc. In the
        `PrintOp` example above, a variable would be `$input`.

A direct mapping of our C++ format looks something like:

```tablegen
/// Consider a stripped definition of `toy.print` here.
def PrintOp : Toy_Op<"print"> {
  let arguments = (ins F64Tensor:$input);

  // In the following format we have two directives, `attr-dict` and `type`.
  // These correspond to the attribute dictionary and the type of a given
  // variable respectively.
  let assemblyFormat = "$input attr-dict `:` type($input)";
}
```

The [declarative format](../../OpDefinitions.md#declarative-assembly-format) has
many more interesting features, so be sure to check it out before implementing a
custom format in C++. After beautifying the format of a few of our operations, we
now get a much more readable result:

```mlir
module {
  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
    %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:10)
    %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    toy.return %2 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:3)
  } loc("test/Examples/Toy/Ch2/codegen.toy":4:1)
  func @main() {
    %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:17)
    %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:3)
    %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:17)
    %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:3)
    %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":11:11)
    %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":12:11)
    toy.print %5 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":13:3)
    toy.return loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
  } loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
} loc(unknown)
```

Above we introduced several of the concepts for defining operations in the ODS
framework, but there are many more that we haven't had a chance to cover:
regions, variadic operands, etc. Check out the
[full specification](../../OpDefinitions.md) for more details.

## Complete Toy Example

We can now generate our "Toy IR". You can build `toyc-ch2` and try it yourself on
the above example: `toyc-ch2 test/Examples/Toy/Ch2/codegen.toy -emit=mlir
-mlir-print-debuginfo`. We can also check the round-trip: `toyc-ch2
test/Examples/Toy/Ch2/codegen.toy -emit=mlir -mlir-print-debuginfo 2>
codegen.mlir` followed by `toyc-ch2 codegen.mlir -emit=mlir`. You should also
use `mlir-tblgen` on the final definition file and study the generated C++ code.

At this point, MLIR knows about our Toy dialect and operations. In the
[next chapter](Ch-3.md), we will leverage our new dialect to implement some
high-level language-specific analyses and transformations for the Toy language.