===============================
ASTImporter: Merging Clang ASTs
===============================

The ``ASTImporter`` class is part of Clang's core library, the AST library.
It imports nodes of an ``ASTContext`` into another ``ASTContext``.

In this document, we assume basic knowledge about the Clang AST.  See the :doc:`Introduction
to the Clang AST <IntroductionToTheClangAST>` if you want to learn more
about how the AST is structured.
Knowledge about :doc:`matching the Clang AST <LibASTMatchers>` and the `reference for the matchers <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_ are also useful.

.. contents::
   :local:

Introduction
------------

``ASTContext`` holds long-lived AST nodes (such as types and decls) that can be referred to throughout the semantic analysis of a file.
In some cases it is preferable to work with more than one ``ASTContext``.
For example, we'd like to parse multiple different files inside the same Clang tool.
It may be convenient if we could view the set of the resulting ASTs as if they were one AST resulting from the parsing of each file together.
``ASTImporter`` provides the way to copy types or declarations from one ``ASTContext`` to another.
We refer to the context from which we import as the **"from" context** or *source context*; and the context into which we import as the **"to" context** or *destination context*.

Existing clients of the ``ASTImporter`` library are Cross Translation Unit (CTU) static analysis and the LLDB expression parser.
CTU static analysis imports a definition of a function if its definition is found in another translation unit (TU).
This way the analysis can break out of the single-TU limitation.
LLDB's ``expr`` command parses a user-defined expression, creates an ``ASTContext`` for it, and then imports the missing definitions from the AST that we got from the debug information (DWARF, etc.).

Algorithm of the import
-----------------------

Importing one AST node copies that node into the destination ``ASTContext``.
Why do we have to copy the node?
Isn't it enough to insert the pointer to that node into the destination context?
One reason is that the "from" context may outlive the "to" context.
Also, the Clang AST considers nodes (or certain properties of nodes) equivalent if they have the same address!

The import algorithm has to ensure that structurally equivalent nodes in the different translation units are not duplicated in the merged AST.
E.g. if we include the definition of the vector template (``#include <vector>``) in two translation units, then their merged AST should have only one node which represents the template.
Also, we have to discover *one definition rule* (ODR) violations.
For instance, there may be a class definition with the same name in both translation units, but one of the definitions contains a different number of fields.
So, we look up existing definitions, and then we check the structural equivalency on those nodes.
The following pseudo-code demonstrates the basics of the import mechanism:

.. code-block:: cpp

    // Pseudo-code(!) of import:
    ErrorOrDecl Import(Decl *FromD) {
      Decl *ToDecl = nullptr;
      FoundDeclsList = Look up all Decls in the "to" Ctx with the same name as FromD;
      for (auto FoundDecl : FoundDeclsList) {
        if (StructurallyEquivalentDecls(FoundDecl, FromD)) {
          ToDecl = FoundDecl;
          Mark FromD as imported;
          break;
        } else {
          Report ODR violation;
          return error;
        }
      }
      if (FoundDeclsList is empty) {
        Import dependent declarations and types of ToDecl;
        ToDecl = create a new AST node in "to" Ctx;
        Mark FromD as imported;
      }
      return ToDecl;
    }

Two AST nodes are *structurally equivalent* if they are

- builtin types and refer to the same type, e.g. ``int`` and ``int`` are structurally equivalent,
- function types and all their parameters have structurally equivalent types,
- record types and all their fields in order of their definition have the same identifier names and structurally equivalent types,
- variable or function declarations and they have the same identifier name and their types are structurally equivalent.

We could extend the definition of structural equivalency to templates similarly.

If A and B are AST nodes and *A depends on B*, then we say that A is a **dependant** of B and B is a **dependency** of A.
The words "dependant" and "dependency" are nouns in British English.
Unfortunately, in American English, the adjective "dependent" is used for both meanings.
In this document, the adjective "dependent" always refers to the dependencies, that is, the B node in the example.

API
---

Let's create a tool which uses the ``ASTImporter`` class!
First, we build two ASTs from virtual files; the contents of the virtual files are synthesized from string literals:

.. code-block:: cpp

    std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
        "", "to.cc"); // empty file
    std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
        R"(
        class MyClass {
          int m1;
          int m2;
        };
        )",
        "from.cc");

The first AST corresponds to the destination ("to") context, which is empty, and the second to the source ("from") context.
Next, we define a matcher to match ``MyClass`` in the "from" context:

.. code-block:: cpp

    auto Matcher = cxxRecordDecl(hasName("MyClass"));
    auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);

Now we create the Importer and do the import:

.. code-block:: cpp

    ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                         FromUnit->getASTContext(), FromUnit->getFileManager(),
                         /*MinimalImport=*/true);
    llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);

The ``Import`` call returns an ``llvm::Expected``, so we must check for any error.
Please refer to the `error handling <https://llvm.org/docs/ProgrammersManual.html#recoverable-errors>`_ documentation for details.

.. code-block:: cpp

    if (!ImportedOrErr) {
      llvm::Error Err = ImportedOrErr.takeError();
      llvm::errs() << "ERROR: " << Err << "\n";
      consumeError(std::move(Err));
      return 1;
    }

If there's no error then we can get the underlying value.
In this example we will print the AST of the "to" context.

.. code-block:: cpp

    Decl *Imported = *ImportedOrErr;
    Imported->getTranslationUnitDecl()->dump();

Since we set **minimal import** in the constructor of the importer, the AST will not contain the declaration of the members (once we run the test tool).

.. code-block:: bash

    TranslationUnitDecl 0x68b9a8 <<invalid sloc>> <invalid sloc>
    `-CXXRecordDecl 0x6c7e30 <line:2:7, col:13> col:13 class MyClass definition
      `-DefinitionData pass_in_registers standard_layout trivially_copyable trivial literal
        |-DefaultConstructor exists trivial needs_implicit
        |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
        |-MoveConstructor exists simple trivial needs_implicit
        |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
        |-MoveAssignment exists simple trivial needs_implicit
        `-Destructor simple irrelevant trivial needs_implicit

We'd like to get the members too, so we use ``ImportDefinition`` to copy the whole definition of ``MyClass`` into the "to" context.
Then we dump the AST again.

.. code-block:: cpp

    if (llvm::Error Err = Importer.ImportDefinition(From)) {
      llvm::errs() << "ERROR: " << Err << "\n";
      consumeError(std::move(Err));
      return 1;
    }
    llvm::errs() << "Imported definition.\n";
    Imported->getTranslationUnitDecl()->dump();

This time the AST is going to contain the members too.

.. code-block:: bash

    TranslationUnitDecl 0x68b9a8 <<invalid sloc>> <invalid sloc>
    `-CXXRecordDecl 0x6c7e30 <line:2:7, col:13> col:13 class MyClass definition
      |-DefinitionData pass_in_registers standard_layout trivially_copyable trivial literal
      | |-DefaultConstructor exists trivial needs_implicit
      | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveConstructor exists simple trivial needs_implicit
      | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveAssignment exists simple trivial needs_implicit
      | `-Destructor simple irrelevant trivial needs_implicit
      |-CXXRecordDecl 0x6c7f48 <col:7, col:13> col:13 implicit class MyClass
      |-FieldDecl 0x6c7ff0 <line:3:9, col:13> col:13 m1 'int'
      `-FieldDecl 0x6c8058 <line:4:9, col:13> col:13 m2 'int'

We can spare the call to ``ImportDefinition`` if we set up the importer to do a "normal" (not minimal) import.

.. code-block:: cpp

    ASTImporter Importer( .... /*MinimalImport=*/false);

With **normal import**, all dependent declarations are imported normally.
However, with minimal import, the dependent Decls are imported without definition, and we have to import the definition of each one later if we need it.

Putting this all together, here is what the source of the tool looks like:

.. code-block:: cpp

    #include "clang/AST/ASTImporter.h"
    #include "clang/ASTMatchers/ASTMatchFinder.h"
    #include "clang/ASTMatchers/ASTMatchers.h"
    #include "clang/Tooling/Tooling.h"

    using namespace clang;
    using namespace tooling;
    using namespace ast_matchers;

    template <typename Node, typename Matcher>
    Node *getFirstDecl(Matcher M, const std::unique_ptr<ASTUnit> &Unit) {
      auto MB = M.bind("bindStr"); // Bind the to-be-matched node to a string key.
      auto MatchRes = match(MB, Unit->getASTContext());
      // We should have at least one match.
      assert(MatchRes.size() >= 1);
      // Get the first matched and bound node.
      Node *Result =
          const_cast<Node *>(MatchRes[0].template getNodeAs<Node>("bindStr"));
      assert(Result);
      return Result;
    }

    int main() {
      std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
          "", "to.cc");
      std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
          R"(
          class MyClass {
            int m1;
            int m2;
          };
          )",
          "from.cc");
      auto Matcher = cxxRecordDecl(hasName("MyClass"));
      auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);

      ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                           FromUnit->getASTContext(), FromUnit->getFileManager(),
                           /*MinimalImport=*/true);
      llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
      if (!ImportedOrErr) {
        llvm::Error Err = ImportedOrErr.takeError();
        llvm::errs() << "ERROR: " << Err << "\n";
        consumeError(std::move(Err));
        return 1;
      }
      Decl *Imported = *ImportedOrErr;
      Imported->getTranslationUnitDecl()->dump();

      if (llvm::Error Err = Importer.ImportDefinition(From)) {
        llvm::errs() << "ERROR: " << Err << "\n";
        consumeError(std::move(Err));
        return 1;
      }
      llvm::errs() << "Imported definition.\n";
      Imported->getTranslationUnitDecl()->dump();

      return 0;
    }

We may extend the ``CMakeLists.txt`` under, let's say, ``clang/tools`` with the build and link instructions:

.. code-block:: cmake

    add_clang_executable(astimporter-demo ASTImporterDemo.cpp)
    clang_target_link_libraries(astimporter-demo
      PRIVATE
      LLVMSupport
      clangAST
      clangASTMatchers
      clangBasic
      clangFrontend
      clangSerialization
      clangTooling
      )

Then we can build and execute the new tool.

.. code-block:: bash

    $ ninja astimporter-demo && ./bin/astimporter-demo

Errors during the import process
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, either the source or the destination context contains the definition of a declaration.
However, there may be cases when both of the contexts have a definition for a given symbol.
If these definitions differ, then we have a name conflict; in C++ this is known as an ODR (one definition rule) violation.
Let's modify the tool we wrote previously and try to import a ``ClassTemplateSpecializationDecl`` with a conflicting definition:

.. code-block:: cpp

    int main() {
      std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
          R"(
          // primary template
          template <typename T>
          struct X {};
          // explicit specialization
          template<>
          struct X<int> { int i; };
          )",
          "to.cc");
      ToUnit->enableSourceFileDiagnostics();
      std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
          R"(
          // primary template
          template <typename T>
          struct X {};
          // explicit specialization
          template<>
          struct X<int> { int i2; };
          // field mismatch:  ^^
          )",
          "from.cc");
      FromUnit->enableSourceFileDiagnostics();
      auto Matcher = classTemplateSpecializationDecl(hasName("X"));
      auto *From = getFirstDecl<ClassTemplateSpecializationDecl>(Matcher, FromUnit);
      auto *To = getFirstDecl<ClassTemplateSpecializationDecl>(Matcher, ToUnit);

      ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                           FromUnit->getASTContext(), FromUnit->getFileManager(),
                           /*MinimalImport=*/false);
      llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
      if (!ImportedOrErr) {
        llvm::Error Err = ImportedOrErr.takeError();
        llvm::errs() << "ERROR: " << Err << "\n";
        consumeError(std::move(Err));
        To->getTranslationUnitDecl()->dump();
        return 1;
      }
      return 0;
    }

When we run the tool we get the following warning:

.. code-block:: bash

    to.cc:7:14: warning: type 'X<int>' has incompatible definitions in different translation units [-Wodr]
          struct X<int> { int i; };
                 ^
    to.cc:7:27: note: field has name 'i' here
          struct X<int> { int i; };
                              ^
    from.cc:7:27: note: field has name 'i2' here
          struct X<int> { int i2; };
                              ^

Note that to get these diagnostics we had to call ``enableSourceFileDiagnostics`` on the ``ASTUnit`` objects.

Since we could not import the specified declaration (``From``), we get an error in the return value.
The AST does not contain the conflicting definition, so we are left with the original AST.

.. code-block:: bash

    ERROR: NameConflict
    TranslationUnitDecl 0xe54a48 <<invalid sloc>> <invalid sloc>
    |-ClassTemplateDecl 0xe91020 <to.cc:3:7, line:4:17> col:14 X
    | |-TemplateTypeParmDecl 0xe90ed0 <line:3:17, col:26> col:26 typename depth 0 index 0 T
    | |-CXXRecordDecl 0xe90f90 <line:4:7, col:17> col:14 struct X definition
    | | |-DefinitionData empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
    | | | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
    | | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
    | | | |-MoveConstructor exists simple trivial needs_implicit
    | | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
    | | | |-MoveAssignment exists simple trivial needs_implicit
    | | | `-Destructor simple irrelevant trivial needs_implicit
    | | `-CXXRecordDecl 0xe91270 <col:7, col:14> col:14 implicit struct X
    | `-ClassTemplateSpecialization 0xe91340 'X'
    `-ClassTemplateSpecializationDecl 0xe91340 <line:6:7, line:7:30> col:14 struct X definition
      |-DefinitionData pass_in_registers aggregate standard_layout trivially_copyable pod trivial literal
      | |-DefaultConstructor exists trivial needs_implicit
      | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveConstructor exists simple trivial needs_implicit
      | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveAssignment exists simple trivial needs_implicit
      | `-Destructor simple irrelevant trivial needs_implicit
      |-TemplateArgument type 'int'
      |-CXXRecordDecl 0xe91558 <col:7, col:14> col:14 implicit struct X
      `-FieldDecl 0xe91600 <col:23, col:27> col:27 i 'int'

Error propagation
"""""""""""""""""

If a given node has a dependency that we have to import first, then an import error associated with that dependency propagates to the dependant node.
Let's modify the previous example and import a ``FieldDecl`` instead of the ``ClassTemplateSpecializationDecl``.

.. code-block:: cpp

    auto Matcher = fieldDecl(hasName("i2"));
    auto *From = getFirstDecl<FieldDecl>(Matcher, FromUnit);

In this case we can see that an error is associated (``getImportDeclErrorIfAny``) with the specialization too, not just with the field:

.. code-block:: cpp

    llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
    if (!ImportedOrErr) {
      llvm::Error Err = ImportedOrErr.takeError();
      consumeError(std::move(Err));

      // Check that the ClassTemplateSpecializationDecl is also marked as
      // erroneous.
      auto *FromSpec = getFirstDecl<ClassTemplateSpecializationDecl>(
          classTemplateSpecializationDecl(hasName("X")), FromUnit);
      assert(Importer.getImportDeclErrorIfAny(FromSpec));
      // Btw, the error is also set for the FieldDecl.
      assert(Importer.getImportDeclErrorIfAny(From));
      return 1;
    }

Polluted AST
""""""""""""

We may recognize an error during the import of a dependent node. However, by that time, we had already created the dependant.
In these cases we do not remove the existing erroneous node from the "to" context; instead, we associate an error with that node.
Let's extend the previous example with another class ``Y``.
This class has a forward declaration in the "to" context, but its definition is in the "from" context.
We'd like to import the definition, but it contains a member whose type conflicts with the type in the "to" context:

.. code-block:: cpp

    std::unique_ptr<ASTUnit> ToUnit = buildASTFromCode(
        R"(
        // primary template
        template <typename T>
        struct X {};
        // explicit specialization
        template<>
        struct X<int> { int i; };

        class Y;
        )",
        "to.cc");
    ToUnit->enableSourceFileDiagnostics();
    std::unique_ptr<ASTUnit> FromUnit = buildASTFromCode(
        R"(
        // primary template
        template <typename T>
        struct X {};
        // explicit specialization
        template<>
        struct X<int> { int i2; };
        // field mismatch:  ^^

        class Y { void f() { X<int> xi; } };
        )",
        "from.cc");
    FromUnit->enableSourceFileDiagnostics();
    auto Matcher = cxxRecordDecl(hasName("Y"));
    auto *From = getFirstDecl<CXXRecordDecl>(Matcher, FromUnit);
    auto *To = getFirstDecl<CXXRecordDecl>(Matcher, ToUnit);

This time we create a ``shared_ptr`` for ``ASTImporterSharedState``, which owns the associated errors for the "to" context.
Note that there may be several different ``ASTImporter`` objects which import into the same "to" context but from different "from" contexts; they should share the same ``ASTImporterSharedState``.
(Also note that we have to include the corresponding ``ASTImporterSharedState.h`` header file.)

.. code-block:: cpp

    auto ImporterState = std::make_shared<ASTImporterSharedState>();
    ASTImporter Importer(ToUnit->getASTContext(), ToUnit->getFileManager(),
                         FromUnit->getASTContext(), FromUnit->getFileManager(),
                         /*MinimalImport=*/false, ImporterState);
    llvm::Expected<Decl *> ImportedOrErr = Importer.Import(From);
    if (!ImportedOrErr) {
      llvm::Error Err = ImportedOrErr.takeError();
      consumeError(std::move(Err));

      // ... but the node had been created.
      auto *ToYDef = getFirstDecl<CXXRecordDecl>(
          cxxRecordDecl(hasName("Y"), isDefinition()), ToUnit);
      ToYDef->dump();
      // An error is set for "ToYDef" in the shared state.
      Optional<ImportError> OptErr =
          ImporterState->getImportDeclErrorIfAny(ToYDef);
      assert(OptErr);

      return 1;
    }

If we take a look at the AST, then we can see that the Decl with the definition is created, but the field is missing.

.. code-block:: bash

    |-CXXRecordDecl 0xf66678 <line:9:7, col:13> col:13 class Y
    `-CXXRecordDecl 0xf66730 prev 0xf66678 <:10:7, col:13> col:13 class Y definition
      |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
      | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
      | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveConstructor exists simple trivial needs_implicit
      | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      | |-MoveAssignment exists simple trivial needs_implicit
      | `-Destructor simple irrelevant trivial needs_implicit
      `-CXXRecordDecl 0xf66828 <col:7, col:13> col:13 implicit class Y

We do not remove the erroneous nodes because, by the time we recognize the error, it is too late to remove the node: there may already be additional references to it in the AST.
This is aligned with the overall `design principle of the Clang AST <InternalsManual.html#immutability>`_: Clang AST nodes (types, declarations, statements, expressions, and so on) are generally designed to be **immutable once created**.
Thus, clients of the ASTImporter library should always check if there is any associated error for the node which they inspect in the destination context.
We recommend skipping the processing of those nodes which have an error associated with them.
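
To make the behavior described in this section concrete, here is a minimal, self-contained sketch that models it outside of Clang.
Everything in it (``ToyDecl``, ``ToyContext``, ``ToySharedState``, ``ToyImporter``, ``usableDeclNames``, ``demoPollutedImport``) is a hypothetical stand-in invented for illustration and is not part of the Clang API.
It assumes an acyclic dependency graph and reduces structural equivalence to comparing field lists.
The sketch replays the ``X<int>``/``Y`` scenario: the conflict on ``X<int>`` propagates to the dependant ``Y``, the already created ``Y`` node stays in the "to" context with an error recorded in the shared state, and a client skips that node.

.. code-block:: cpp

    #include <cassert>
    #include <map>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>

    // Toy model only: none of these types exist in Clang.
    using FieldList = std::vector<std::pair<std::string, std::string>>;

    struct ToyDecl {
      std::string Name;
      FieldList Fields;                  // (name, type) pairs in declaration order
      std::vector<const ToyDecl *> Deps; // dependencies of this declaration
    };

    struct ToyContext {
      std::vector<std::unique_ptr<ToyDecl>> Decls;
      ToyDecl *create(const std::string &Name, FieldList Fields = {}) {
        Decls.push_back(std::make_unique<ToyDecl>());
        Decls.back()->Name = Name;
        Decls.back()->Fields = std::move(Fields);
        return Decls.back().get();
      }
      std::vector<ToyDecl *> lookup(const std::string &Name) const {
        std::vector<ToyDecl *> Found;
        for (const auto &D : Decls)
          if (D->Name == Name)
            Found.push_back(D.get());
        return Found;
      }
    };

    // Errors attached to nodes of the "to" context, shared between importers.
    struct ToySharedState {
      std::map<const ToyDecl *, std::string> ToErrors;
      bool hasError(const ToyDecl *ToD) const { return ToErrors.count(ToD) != 0; }
    };

    class ToyImporter {
      ToyContext &To;
      std::shared_ptr<ToySharedState> State;
      std::map<const ToyDecl *, ToyDecl *> Imported;
      std::map<const ToyDecl *, std::string> FromErrors;

    public:
      ToyImporter(ToyContext &ToCtx, std::shared_ptr<ToySharedState> Shared)
          : To(ToCtx), State(std::move(Shared)) {
        assert(State != nullptr && "a shared state is required");
      }
      bool hasFromError(const ToyDecl *D) const { return FromErrors.count(D) != 0; }

      // Returns the node in the "to" context, or nullptr if the import failed.
      ToyDecl *import(const ToyDecl &FromD) {
        if (hasFromError(&FromD))
          return nullptr;
        auto It = Imported.find(&FromD);
        if (It != Imported.end())
          return It->second;
        for (ToyDecl *Found : To.lookup(FromD.Name)) {
          if (Found->Fields == FromD.Fields) { // toy structural equivalence
            Imported[&FromD] = Found;
            return Found;
          }
          FromErrors[&FromD] = "NameConflict"; // ODR-like conflict
          return nullptr;
        }
        // No candidate found: create the node, then import its dependencies.
        ToyDecl *ToD = To.create(FromD.Name, FromD.Fields);
        for (const ToyDecl *Dep : FromD.Deps) {
          if (!import(*Dep)) {
            // Propagate the error to the dependant; the new node is kept.
            FromErrors[&FromD] = "NameConflict";
            State->ToErrors[ToD] = "NameConflict";
            return nullptr;
          }
        }
        Imported[&FromD] = ToD;
        return ToD;
      }
    };

    // What a client should do: skip "to" nodes that carry an import error.
    std::vector<std::string> usableDeclNames(const ToyContext &To,
                                             const ToySharedState &State) {
      std::vector<std::string> Names;
      for (const auto &D : To.Decls)
        if (!State.hasError(D.get()))
          Names.push_back(D->Name);
      return Names;
    }

    // Replays the X<int>/Y scenario; returns true if the model behaves as described.
    bool demoPollutedImport() {
      ToyContext To, From;
      To.create("X<int>", {{"i", "int"}});
      ToyDecl *FromX = From.create("X<int>", {{"i2", "int"}}); // field mismatch
      ToyDecl *FromY = From.create("Y");
      FromY->Deps.push_back(FromX);
      auto State = std::make_shared<ToySharedState>();
      ToyImporter Importer(To, State);
      bool Failed = Importer.import(*FromY) == nullptr;
      std::vector<std::string> Usable = usableDeclNames(To, *State);
      return Failed && Importer.hasFromError(FromX) && To.lookup("Y").size() == 1 &&
             Usable.size() == 1 && Usable[0] == "X<int>";
    }

Calling ``demoPollutedImport()`` from a ``main`` function is enough to exercise the sketch.
In real code, the corresponding check is the ``ImporterState->getImportDeclErrorIfAny(ToYDef)`` call shown earlier in this section.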

Using the ``-ast-merge`` Clang front-end action
-----------------------------------------------

The ``-ast-merge <pch-file>`` command-line switch can be used to merge from the given serialized AST file.
This file represents the source context.
When this switch is present, each top-level AST node of the source context is merged into the destination context.
If the merge was successful then ``ASTConsumer::HandleTopLevelDecl`` is called for the Decl.
As a result, we can execute the original front-end action on the extended AST.

Example for C
^^^^^^^^^^^^^

Let's consider the following three files:

.. code-block:: c

    // bar.h
    #ifndef BAR_H
    #define BAR_H
    int bar();
    #endif /* BAR_H */

    // bar.c
    #include "bar.h"
    int bar() {
      return 41;
    }

    // main.c
    #include "bar.h"
    int main() {
        return bar();
    }

Let's generate the AST files for the two source files:

.. code-block:: bash

    $ clang -cc1 -emit-pch -o bar.ast bar.c
    $ clang -cc1 -emit-pch -o main.ast main.c

Then, let's check what the merged AST would look like if we consider only the ``bar()`` function:

.. code-block:: bash

    $ clang -cc1 -ast-merge bar.ast -ast-merge main.ast /dev/null -ast-dump
    TranslationUnitDecl 0x12b0738 <<invalid sloc>> <invalid sloc>
    |-FunctionDecl 0x12b1470 </path/bar.h:4:1, col:9> col:5 used bar 'int ()'
    |-FunctionDecl 0x12b1538 prev 0x12b1470 </path/bar.c:3:1, line:5:1> line:3:5 used bar 'int ()'
    | `-CompoundStmt 0x12b1608 <col:11, line:5:1>
    |   `-ReturnStmt 0x12b15f8 <line:4:3, col:10>
    |     `-IntegerLiteral 0x12b15d8 <col:10> 'int' 41
    |-FunctionDecl 0x12b1648 prev 0x12b1538 </path/bar.h:4:1, col:9> col:5 used bar 'int ()'

We can see that the prototype of the function and its definition are merged into the same redeclaration chain.
What's more, there is a third prototype declaration merged into the chain.
The functions are merged in a way that prototypes are added to the redecl chain if they refer to the same type, but we can have only one definition.
The first two declarations are from ``bar.ast``, the third is from ``main.ast``.

Now, let's create an object file from the merged AST:

.. code-block:: bash

    $ clang -cc1 -ast-merge bar.ast -ast-merge main.ast /dev/null -emit-obj -o main.o

Next, we may call the linker and execute the created binary file.

.. code-block:: bash

    $ clang -o a.out main.o
    $ ./a.out
    $ echo $?
    41
    $

Example for C++
^^^^^^^^^^^^^^^

In the case of C++, the generation of the AST files and the way we invoke the front-end are a bit different.
Assuming we have these three files:

.. code-block:: cpp

    // foo.h
    #ifndef FOO_H
    #define FOO_H
    struct foo {
        virtual int fun();
    };
    #endif /* FOO_H */

    // foo.cpp
    #include "foo.h"
    int foo::fun() {
      return 42;
    }

    // main.cpp
    #include "foo.h"
    int main() {
        return foo().fun();
    }

We shall generate the AST files, merge them, create the executable and then run it:

.. code-block:: bash

    $ clang++ -x c++-header -o foo.ast foo.cpp
    $ clang++ -x c++-header -o main.ast main.cpp
    $ clang++ -cc1 -x c++ -ast-merge foo.ast -ast-merge main.ast /dev/null -ast-dump
    $ clang++ -cc1 -x c++ -ast-merge foo.ast -ast-merge main.ast /dev/null -emit-obj -o main.o
    $ clang++ -o a.out main.o
    $ ./a.out
    $ echo $?
    42
    $