===============================
ASTImporter: Merging Clang ASTs
===============================

The ``ASTImporter`` class is part of Clang's core library, the AST library.
It imports nodes of an ``ASTContext`` into another ``ASTContext``.

In this document, we assume basic knowledge about the Clang AST.  See the :doc:`Introduction
to the Clang AST <IntroductionToTheClangAST>` if you want to learn more
about how the AST is structured.
Knowledge about :doc:`matching the Clang AST <LibASTMatchers>` and the `reference for the matchers <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_ is also useful.

.. contents::
   :local:

Introduction
------------

``ASTContext`` holds long-lived AST nodes (such as types and decls) that can be referred to throughout the semantic analysis of a file.
In some cases it is preferable to work with more than one ``ASTContext``.
For example, we may want to parse multiple different files inside the same Clang tool.
It would be convenient to view the set of resulting ASTs as if they were one AST produced by parsing all the files together.
``ASTImporter`` provides a way to copy types or declarations from one ``ASTContext`` to another.
We refer to the context from which we import as the **"from" context** or *source context*; and the context into which we import as the **"to" context** or *destination context*.

Existing clients of the ``ASTImporter`` library are Cross Translation Unit (CTU) static analysis and the LLDB expression parser.
CTU static analysis imports a definition of a function if its definition is found in another translation unit (TU).
This way the analysis can break out of the single-TU limitation.
LLDB's ``expr`` command parses a user-defined expression, creates an ``ASTContext`` for it, and then imports the missing definitions from the AST that was built from the debug information (DWARF, etc.).

Algorithm of the import
-----------------------

Importing one AST node copies that node into the destination ``ASTContext``.
Why do we have to copy the node?
Isn't it enough to insert the pointer to that node into the destination context?
One reason is that the two contexts have independent lifetimes, so a pointer shared between them could outlive the node it refers to.
Also, the Clang AST considers nodes (or certain properties of nodes) equivalent if they have the same address!

The import algorithm has to ensure that structurally equivalent nodes in the different translation units are not duplicated in the merged AST.
E.g. if we include the definition of the vector template (``#include <vector>``) in two translation units, then their merged AST should have only one node which represents the template.
Also, we have to discover *one definition rule* (ODR) violations.
For instance, a class with the same name may be defined in both translation units, but with a different number of fields in one of the definitions.
So, we look up existing definitions, and then we check the structural equivalency on those nodes.
The following pseudo-code demonstrates the basics of the import mechanism:

.. code-block:: cpp

  // Pseudo-code(!) of import:
  ErrorOrDecl Import(Decl *FromD) {
    Decl *ToDecl = nullptr;
    FoundDeclsList = Look up all Decls in the "to" Ctx with the same name of FromD;
    for (auto FoundDecl : FoundDeclsList) {
      if (StructurallyEquivalentDecls(FoundDecl, FromD)) {
        ToDecl = FoundDecl;
        Mark FromD as imported;
        break;
      } else {
        Report ODR violation;
        return error;
      }
    }
    if (FoundDeclsList is empty) {
      Import dependent declarations and types of FromD;
      ToDecl = create a new AST node in "to" Ctx;
      Mark FromD as imported;
    }
    return ToDecl;
  }

Two AST nodes are *structurally equivalent* if they are

- builtin types and refer to the same type, e.g. ``int`` and ``int`` are structurally equivalent,
- function types and all their parameters have structurally equivalent types,
- record types and all their fields in order of their definition have the same identifier names and structurally equivalent types,
- variable or function declarations and they have the same identifier name and their types are structurally equivalent.

We could extend the definition of structural equivalency to templates similarly.
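
The pseudo-code and the record rule above can be tried out on a toy model that does not use Clang at all.
The following is only a sketch under stated assumptions: ``ToyRecord``, ``ToyContext``, ``structurallyEquivalent`` and ``importRecord`` are hypothetical names invented for this illustration, field types are plain strings, and an ODR violation is reported simply by returning ``nullptr``.

.. code-block:: cpp

  #include <cassert>
  #include <map>
  #include <memory>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical stand-in for a record declaration: a name plus the
  // ordered list of its fields as {type, name} pairs.
  struct ToyRecord {
    std::string Name;
    std::vector<std::pair<std::string, std::string>> Fields;
  };

  // Hypothetical stand-in for an ASTContext: it owns its records and
  // allows lookup by name.
  struct ToyContext {
    std::map<std::string, std::unique_ptr<ToyRecord>> Records;
  };

  // Records are equivalent if all fields, in order of definition, have
  // the same names and the same (builtin) types.
  bool structurallyEquivalent(const ToyRecord &A, const ToyRecord &B) {
    return A.Name == B.Name && A.Fields == B.Fields;
  }

  // Returns the node of the "to" context that represents From, or
  // nullptr if a record with the same name but a different structure
  // already exists (the toy version of an ODR violation).
  ToyRecord *importRecord(ToyContext &To, const ToyRecord &From) {
    auto It = To.Records.find(From.Name);
    if (It != To.Records.end()) {
      if (structurallyEquivalent(*It->second, From))
        return It->second.get(); // reuse the existing node, no duplicate
      return nullptr;            // same name, different structure
    }
    // Nothing found: copy the node, never share the "from" pointer.
    auto Copy = std::make_unique<ToyRecord>(From);
    ToyRecord *Result = Copy.get();
    To.Records.emplace(From.Name, std::move(Copy));
    return Result;
  }

In this model, importing the same record twice returns the same pointer into the "to" context, and importing a record with the same name but different fields returns ``nullptr`` and leaves the context unchanged.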

If A and B are AST nodes and *A depends on B*, then we say that A is a **dependant** of B and B is a **dependency** of A.
The words "dependant" and "dependency" are nouns in British English.
Unfortunately, in American English, the adjective "dependent" is used for both meanings.
In this document, the adjective "dependent" always refers to the dependencies, that is, the B node in the example.

API
---

Let's create a tool which uses the ``ASTImporter`` class!
First, we build two ASTs from virtual files; the content of the virtual files is synthesized from string literals:

.. code-block:: cpp

  std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
      "", "to.cc"); // empty file
  std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
      R"(
      class MyClass {
        int m1;
        int m2;
      };
      )",
      "from.cc");

The first AST corresponds to the destination ("to") context, which is empty, and the second to the source ("from") context.
Next, we define a matcher to match ``MyClass`` in the "from" context:

.. code-block:: cpp

  auto Matcher = cxxRecordDecl(hasName("MyClass"));
  auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);

Now we create the Importer and do the import:

.. code-block:: cpp

  ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                       FromUnit->getASTContext(), FromUnit->getFileManager(),
                       /*MinimalImport=*/true);
  llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);

The ``Import`` call returns an ``llvm::Expected``, so we must check for any error.
Please refer to the `error handling <https://llvm.org/docs/ProgrammersManual.html#recoverable-errors>`_ documentation for details.

.. code-block:: cpp

  if (!ImportedOrErr) {
    llvm::Error Err = ImportedOrErr.takeError();
    llvm::errs() << "ERROR: " << Err << "\n";
    consumeError(std::move(Err));
    return 1;
  }

If there's no error then we can get the underlying value.
In this example we will print the AST of the "to" context.

.. code-block:: cpp

  Decl *Imported = *ImportedOrErr;
  Imported->getTranslationUnitDecl()->dump();

Since we set **minimal import** in the constructor of the importer, the AST will not contain the declarations of the members (once we run the test tool).

.. code-block:: bash

  TranslationUnitDecl 0x68b9a8 <<invalid sloc>> <invalid sloc>
  `-CXXRecordDecl 0x6c7e30 <line:2:7, col:13> col:13 class MyClass definition
    `-DefinitionData pass_in_registers standard_layout trivially_copyable trivial literal
      |-DefaultConstructor exists trivial needs_implicit
      |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      |-MoveConstructor exists simple trivial needs_implicit
      |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      |-MoveAssignment exists simple trivial needs_implicit
      `-Destructor simple irrelevant trivial needs_implicit

We'd like to get the members too, so we use ``ImportDefinition`` to copy the whole definition of ``MyClass`` into the "to" context.
Then we dump the AST again.

.. code-block:: cpp

  if (llvm::Error Err = Importer.ImportDefinition(From)) {
    llvm::errs() << "ERROR: " << Err << "\n";
    consumeError(std::move(Err));
    return 1;
  }
  llvm::errs() << "Imported definition.\n";
  Imported->getTranslationUnitDecl()->dump();

This time the AST is going to contain the members too.

.. code-block:: bash

  TranslationUnitDecl 0x68b9a8 <<invalid sloc>> <invalid sloc>
  `-CXXRecordDecl 0x6c7e30 <line:2:7, col:13> col:13 class MyClass definition
    |-DefinitionData pass_in_registers standard_layout trivially_copyable trivial literal
    | |-DefaultConstructor exists trivial needs_implicit
    | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveConstructor exists simple trivial needs_implicit
    | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveAssignment exists simple trivial needs_implicit
    | `-Destructor simple irrelevant trivial needs_implicit
    |-CXXRecordDecl 0x6c7f48 <col:7, col:13> col:13 implicit class MyClass
    |-FieldDecl 0x6c7ff0 <line:3:9, col:13> col:13 m1 'int'
    `-FieldDecl 0x6c8058 <line:4:9, col:13> col:13 m2 'int'

We can spare the call to ``ImportDefinition`` if we set up the importer to do a "normal" (not minimal) import.

.. code-block:: cpp

  ASTImporter Importer( ....  /*MinimalImport=*/false);

With **normal import**, all dependent declarations are imported together with their definitions.
However, with minimal import, the dependent Decls are imported without definition, and we have to import the definition of each one explicitly if we need it later.

Putting this all together, here is what the source of the tool looks like:

.. code-block:: cpp

  #include "clang/AST/ASTImporter.h"
  #include "clang/ASTMatchers/ASTMatchFinder.h"
  #include "clang/ASTMatchers/ASTMatchers.h"
  #include "clang/Tooling/Tooling.h"

  using namespace clang;
  using namespace tooling;
  using namespace ast_matchers;

  template <typename Node, typename Matcher>
  Node *getFirstDecl(Matcher M, const std::unique_ptr<ASTUnit> &Unit) {
    auto MB = M.bind("bindStr"); // Bind the to-be-matched node to a string key.
    auto MatchRes = match(MB, Unit->getASTContext());
    // We should have at least one match.
    assert(MatchRes.size() >= 1);
    // Get the first matched and bound node.
    Node *Result =
        const_cast<Node *>(MatchRes[0].template getNodeAs<Node>("bindStr"));
    assert(Result);
    return Result;
  }

  int main() {
    std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
        "", "to.cc");
    std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
        R"(
        class MyClass {
          int m1;
          int m2;
        };
        )",
        "from.cc");
    auto Matcher = cxxRecordDecl(hasName("MyClass"));
    auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);

    ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                         FromUnit->getASTContext(), FromUnit->getFileManager(),
                         /*MinimalImport=*/true);
    llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
    if (!ImportedOrErr) {
      llvm::Error Err = ImportedOrErr.takeError();
      llvm::errs() << "ERROR: " << Err << "\n";
      consumeError(std::move(Err));
      return 1;
    }
    Decl *Imported = *ImportedOrErr;
    Imported->getTranslationUnitDecl()->dump();

    if (llvm::Error Err = Importer.ImportDefinition(From)) {
      llvm::errs() << "ERROR: " << Err << "\n";
      consumeError(std::move(Err));
      return 1;
    }
    llvm::errs() << "Imported definition.\n";
    Imported->getTranslationUnitDecl()->dump();

    return 0;
  }

We may extend the ``CMakeLists.txt`` under, say, ``clang/tools`` with the build and link instructions:

.. code-block:: cmake

  add_clang_executable(astimporter-demo ASTImporterDemo.cpp)
  clang_target_link_libraries(astimporter-demo
    PRIVATE
    LLVMSupport
    clangAST
    clangASTMatchers
    clangBasic
    clangFrontend
    clangSerialization
    clangTooling
    )

Then we can build and execute the new tool.

.. code-block:: bash

  $ ninja astimporter-demo && ./bin/astimporter-demo

Errors during the import process
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, either the source or the destination context contains the definition of a declaration.
However, there may be cases when both of the contexts have a definition for a given symbol.
If these definitions differ, then we have a name conflict; in C++ this is known as an ODR (one definition rule) violation.
Let's modify the tool we wrote previously and try to import a ``ClassTemplateSpecializationDecl`` with a conflicting definition:

.. code-block:: cpp

  int main() {
    std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
        R"(
        // primary template
        template <typename T>
        struct X {};
        // explicit specialization
        template<>
        struct X<int> { int i; };
        )",
        "to.cc");
    ToUnit->enableSourceFileDiagnostics();
    std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
        R"(
        // primary template
        template <typename T>
        struct X {};
        // explicit specialization
        template<>
        struct X<int> { int i2; };
        // field mismatch:  ^^
        )",
        "from.cc");
    FromUnit->enableSourceFileDiagnostics();
    auto Matcher = classTemplateSpecializationDecl(hasName("X"));
    auto *From = getFirstDecl<ClassTemplateSpecializationDecl>(Matcher, FromUnit);
    auto *To = getFirstDecl<ClassTemplateSpecializationDecl>(Matcher, ToUnit);

    ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                         FromUnit->getASTContext(), FromUnit->getFileManager(),
                         /*MinimalImport=*/false);
    llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
    if (!ImportedOrErr) {
      llvm::Error Err = ImportedOrErr.takeError();
      llvm::errs() << "ERROR: " << Err << "\n";
      consumeError(std::move(Err));
      To->getTranslationUnitDecl()->dump();
      return 1;
    }
    return 0;
  }

When we run the tool we get the following warning:

.. code-block:: bash

  to.cc:7:14: warning: type 'X<int>' has incompatible definitions in different translation units [-Wodr]
        struct X<int> { int i; };
               ^
  to.cc:7:27: note: field has name 'i' here
        struct X<int> { int i; };
                            ^
  from.cc:7:27: note: field has name 'i2' here
        struct X<int> { int i2; };
                          ^

Note that to get these diagnostics we had to call ``enableSourceFileDiagnostics`` on the ``ASTUnit`` objects.

Since we could not import the specified declaration (``From``), we get an error in the return value.
The AST does not contain the conflicting definition, so we are left with the original AST.

.. code-block:: bash

  ERROR: NameConflict
  TranslationUnitDecl 0xe54a48 <<invalid sloc>> <invalid sloc>
  |-ClassTemplateDecl 0xe91020 <to.cc:3:7, line:4:17> col:14 X
  | |-TemplateTypeParmDecl 0xe90ed0 <line:3:17, col:26> col:26 typename depth 0 index 0 T
  | |-CXXRecordDecl 0xe90f90 <line:4:7, col:17> col:14 struct X definition
  | | |-DefinitionData empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
  | | | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
  | | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  | | | |-MoveConstructor exists simple trivial needs_implicit
  | | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
  | | | |-MoveAssignment exists simple trivial needs_implicit
  | | | `-Destructor simple irrelevant trivial needs_implicit
  | | `-CXXRecordDecl 0xe91270 <col:7, col:14> col:14 implicit struct X
  | `-ClassTemplateSpecialization 0xe91340 'X'
  `-ClassTemplateSpecializationDecl 0xe91340 <line:6:7, line:7:30> col:14 struct X definition
    |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable pod trivial literal
    | |-DefaultConstructor exists trivial needs_implicit
    | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveConstructor exists simple trivial needs_implicit
    | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveAssignment exists simple trivial needs_implicit
    | `-Destructor simple irrelevant trivial needs_implicit
    |-TemplateArgument type 'int'
    |-CXXRecordDecl 0xe91558 <col:7, col:14> col:14 implicit struct X
    `-FieldDecl 0xe91600 <col:23, col:27> col:27 i 'int'

Error propagation
"""""""""""""""""

If a node has a dependency that must be imported first, then an import error associated with that dependency propagates to the dependant node.
Let's modify the previous example and import a ``FieldDecl`` instead of the ``ClassTemplateSpecializationDecl``.

.. code-block:: cpp

  auto Matcher = fieldDecl(hasName("i2"));
  auto *From = getFirstDecl<FieldDecl>(Matcher, FromUnit);

In this case we can see (via ``getImportDeclErrorIfAny``) that an error is associated with the specialization as well, not just with the field:

.. code-block:: cpp

  llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
  if (!ImportedOrErr) {
    llvm::Error Err = ImportedOrErr.takeError();
    consumeError(std::move(Err));

    // Check that the ClassTemplateSpecializationDecl is also marked as
    // erroneous.
    auto *FromSpec = getFirstDecl<ClassTemplateSpecializationDecl>(
        classTemplateSpecializationDecl(hasName("X")), FromUnit);
    assert(Importer.getImportDeclErrorIfAny(FromSpec));
    // The error is also set for the FieldDecl.
    assert(Importer.getImportDeclErrorIfAny(From));
    return 1;
  }
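
The propagation rule can be pictured with a small model that is independent of Clang.
This is a sketch only: ``ToyGraph`` and ``importNode`` are hypothetical names made up for this illustration, nodes are identified by plain strings, the dependency graph is assumed to be acyclic, and the set ``Conflicting`` stands in for the nodes whose own import fails.

.. code-block:: cpp

  #include <cassert>
  #include <map>
  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical dependency graph: each node lists its dependencies.
  using ToyGraph = std::map<std::string, std::vector<std::string>>;

  // Imports Node after its dependencies (the graph must be acyclic).
  // Every node that ends up with an associated error is added to Errors.
  // Returns true if Node and all of its dependencies were imported.
  bool importNode(const ToyGraph &G, const std::string &Node,
                  const std::set<std::string> &Conflicting,
                  std::set<std::string> &Errors) {
    bool Ok = Conflicting.count(Node) == 0;
    auto It = G.find(Node);
    if (It != G.end())
      for (const std::string &Dep : It->second)
        Ok = importNode(G, Dep, Conflicting, Errors) && Ok;
    if (!Ok)
      Errors.insert(Node); // an error of a dependency reaches the dependant
    return Ok;
  }

If the field ``i2`` is modelled as depending on ``X<int>`` and ``X<int>`` is in ``Conflicting``, then importing the field fails and both names end up in ``Errors``, which mirrors the two assertions in the example above.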

Polluted AST
""""""""""""

We may recognize an error during the import of a dependent node. However, by that time, we have already created the dependant.
In these cases we do not remove the existing erroneous node from the "to" context; rather, we associate an error with that node.
Let's extend the previous example with another class ``Y``.
This class has a forward declaration in the "to" context, but its definition is in the "from" context.
We'd like to import the definition, but it contains a member whose type conflicts with the type in the "to" context:

.. code-block:: cpp

  std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
      R"(
      // primary template
      template <typename T>
      struct X {};
      // explicit specialization
      template<>
      struct X<int> { int i; };

      class Y;
      )",
      "to.cc");
  ToUnit->enableSourceFileDiagnostics();
  std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
      R"(
      // primary template
      template <typename T>
      struct X {};
      // explicit specialization
      template<>
      struct X<int> { int i2; };
      // field mismatch:  ^^

      class Y { void f() { X<int> xi; } };
      )",
      "from.cc");
  FromUnit->enableSourceFileDiagnostics();
  auto Matcher = cxxRecordDecl(hasName("Y"));
  auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);
  auto *To = getFirstDecl<CXXRecordDecl>(Matcher, ToUnit);

This time we create a ``std::shared_ptr`` for ``ASTImporterSharedState``, which owns the associated errors for the "to" context.
Note that there may be several different ``ASTImporter`` objects which import into the same "to" context but from different "from" contexts; they should share the same ``ASTImporterSharedState``.
(Also note that we have to include the corresponding ``ASTImporterSharedState.h`` header file.)

.. code-block:: cpp

  auto ImporterState = std::make_shared<ASTImporterSharedState>();
  ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                       FromUnit->getASTContext(), FromUnit->getFileManager(),
                       /*MinimalImport=*/false, ImporterState);
  llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
  if (!ImportedOrErr) {
    llvm::Error Err = ImportedOrErr.takeError();
    consumeError(std::move(Err));

    // ... but the node had been created.
    auto *ToYDef = getFirstDecl<CXXRecordDecl>(
        cxxRecordDecl(hasName("Y"), isDefinition()), ToUnit);
    ToYDef->dump();
    // An error is set for "ToYDef" in the shared state.
    Optional<ImportError> OptErr =
        ImporterState->getImportDeclErrorIfAny(ToYDef);
    assert(OptErr);

    return 1;
  }

If we take a look at the AST, then we can see that the Decl with the definition is created, but the member function is missing.

.. code-block:: bash

  |-CXXRecordDecl 0xf66678 <line:9:7, col:13> col:13 class Y
  `-CXXRecordDecl 0xf66730 prev 0xf66678 <:10:7, col:13> col:13 class Y definition
    |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
    | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
    | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveConstructor exists simple trivial needs_implicit
    | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
    | |-MoveAssignment exists simple trivial needs_implicit
    | `-Destructor simple irrelevant trivial needs_implicit
    `-CXXRecordDecl 0xf66828 <col:7, col:13> col:13 implicit class Y

We do not remove the erroneous nodes because by the time we recognize the error it is too late to remove the node: there may already be additional references to it in the AST.
This is aligned with the overall `design principle of the Clang AST <InternalsManual.html#immutability>`_: Clang AST nodes (types, declarations, statements, expressions, and so on) are generally designed to be **immutable once created**.
Thus, clients of the ``ASTImporter`` library should always check whether there is an error associated with any node they inspect in the destination context.
We recommend skipping the processing of those nodes which have an error associated with them.
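
The recommendation above amounts to a filter over the nodes of the destination context.
The following sketch shows the idea on a toy model rather than on the real API: ``ToyDecl``, ``ToySharedState``, ``getErrorIfAny`` and ``withoutErroneous`` are hypothetical names introduced only for this illustration, and errors are represented as plain strings.

.. code-block:: cpp

  #include <cassert>
  #include <string>
  #include <unordered_map>
  #include <vector>

  // Hypothetical stand-in for a declaration in the "to" context.
  struct ToyDecl {
    std::string Name;
  };

  // Hypothetical counterpart of the shared state: it remembers which
  // nodes of the "to" context have an import error associated with them.
  struct ToySharedState {
    std::unordered_map<const ToyDecl *, std::string> Errors;

    // Returns the error associated with D, or nullptr if there is none.
    const std::string *getErrorIfAny(const ToyDecl *D) const {
      auto It = Errors.find(D);
      return It == Errors.end() ? nullptr : &It->second;
    }
  };

  // Returns only those declarations that are safe to process.
  std::vector<const ToyDecl *>
  withoutErroneous(const std::vector<const ToyDecl *> &Decls,
                   const ToySharedState &State) {
    std::vector<const ToyDecl *> Result;
    for (const ToyDecl *D : Decls)
      if (!State.getErrorIfAny(D)) // skip nodes left by a failed import
        Result.push_back(D);
    return Result;
  }

A client written against the real library would perform the same kind of check with ``ASTImporterSharedState::getImportDeclErrorIfAny``, as in the example earlier in this section, before processing a declaration.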

Using the ``-ast-merge`` Clang front-end action
-----------------------------------------------

The ``-ast-merge <pch-file>`` command-line switch can be used to merge from the given serialized AST file.
This file represents the source context.
When this switch is present, each top-level AST node of the source context is merged into the destination context.
If the merge was successful then ``ASTConsumer::HandleTopLevelDecl`` is called for the Decl.
As a result, we can execute the original front-end action on the extended AST.

Example for C
^^^^^^^^^^^^^

Let's consider the following three files:

.. code-block:: c

  // bar.h
  #ifndef BAR_H
  #define BAR_H
  int bar();
  #endif /* BAR_H */

  // bar.c
  #include "bar.h"
  int bar() {
    return 41;
  }

  // main.c
  #include "bar.h"
  int main() {
      return bar();
  }

Let's generate the AST files for the two source files:

.. code-block:: bash

  $ clang -cc1 -emit-pch -o bar.ast bar.c
  $ clang -cc1 -emit-pch -o main.ast main.c

Then, let's check what the merged AST looks like if we consider only the ``bar()`` function:

.. code-block:: bash

  $ clang -cc1 -ast-merge bar.ast -ast-merge main.ast /dev/null -ast-dump
  TranslationUnitDecl 0x12b0738 <<invalid sloc>> <invalid sloc>
  |-FunctionDecl 0x12b1470 </path/bar.h:4:1, col:9> col:5 used bar 'int ()'
  |-FunctionDecl 0x12b1538 prev 0x12b1470 </path/bar.c:3:1, line:5:1> line:3:5 used bar 'int ()'
  | `-CompoundStmt 0x12b1608 <col:11, line:5:1>
  |   `-ReturnStmt 0x12b15f8 <line:4:3, col:10>
  |     `-IntegerLiteral 0x12b15d8 <col:10> 'int' 41
  |-FunctionDecl 0x12b1648 prev 0x12b1538 </path/bar.h:4:1, col:9> col:5 used bar 'int ()'

We can see that the prototype of the function and its definition are merged into the same redeclaration chain.
What's more, there is a third prototype declaration merged into the chain.
The functions are merged in a way that prototypes are added to the redecl chain if they refer to the same type, but we can have only one definition.
The first two declarations are from ``bar.ast``, the third is from ``main.ast``.
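
The merging rule for redeclaration chains can be sketched with a toy model.
The names ``ToyFunctionDecl`` and ``ToyRedeclChain`` are hypothetical and exist only in this illustration; types are compared as plain strings, and the sketch simply refuses a second definition without modelling how the real importer treats that case.

.. code-block:: cpp

  #include <cassert>
  #include <string>
  #include <vector>

  // Hypothetical stand-in for one declaration of a function.
  struct ToyFunctionDecl {
    std::string Type; // e.g. "int ()"
    bool HasBody;     // true for a definition, false for a prototype
  };

  // All declarations of one function, in the order they were merged.
  struct ToyRedeclChain {
    std::vector<ToyFunctionDecl> Decls;

    bool hasDefinition() const {
      for (const ToyFunctionDecl &D : Decls)
        if (D.HasBody)
          return true;
      return false;
    }

    // Appends D to the chain; returns false if D cannot be merged.
    bool merge(const ToyFunctionDecl &D) {
      if (!Decls.empty() && Decls.front().Type != D.Type)
        return false; // the declarations do not refer to the same type
      if (D.HasBody && hasDefinition())
        return false; // toy rule: at most one definition in the chain
      Decls.push_back(D);
      return true;
    }
  };

Merging a prototype, a definition and another prototype of type ``int ()`` in this order gives a chain of three declarations, like the three ``FunctionDecl`` nodes in the dump above; a further definition or a declaration of a different type is refused.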

Now, let's create an object file from the merged AST:

.. code-block:: bash

  $ clang -cc1 -ast-merge bar.ast -ast-merge main.ast /dev/null -emit-obj -o main.o

Next, we may call the linker and execute the created binary file.

.. code-block:: bash

  $ clang -o a.out main.o
  $ ./a.out
  $ echo $?
  41
  $

Example for C++
^^^^^^^^^^^^^^^

In the case of C++, the generation of the AST files and the way we invoke the front-end are a bit different.
Assuming we have these three files:

.. code-block:: cpp

  // foo.h
  #ifndef FOO_H
  #define FOO_H
  struct foo {
      virtual int fun();
  };
  #endif /* FOO_H */

  // foo.cpp
  #include "foo.h"
  int foo::fun() {
    return 42;
  }

  // main.cpp
  #include "foo.h"
  int main() {
      return foo().fun();
  }

We shall generate the AST files, merge them, create the executable and then run it:

.. code-block:: bash

  $ clang++ -x c++-header -o foo.ast foo.cpp
  $ clang++ -x c++-header -o main.ast main.cpp
  $ clang++ -cc1 -x c++ -ast-merge foo.ast -ast-merge main.ast /dev/null -ast-dump
  $ clang++ -cc1 -x c++ -ast-merge foo.ast -ast-merge main.ast /dev/null -emit-obj -o main.o
  $ clang++ -o a.out main.o
  $ ./a.out
  $ echo $?
  42
  $