===============================
MCJIT Design and Implementation
===============================

Introduction
============

This document describes the internal workings of the MCJIT execution
engine and the RuntimeDyld component. It is intended as a high-level
overview of the implementation, showing the flow and interactions of
objects throughout the code generation and dynamic loading process.

Engine Creation
===============

In most cases, an EngineBuilder object is used to create an instance of
the MCJIT execution engine. The EngineBuilder takes an llvm::Module
object as an argument to its constructor. The client may then set various
options that will later be passed along to the MCJIT engine, including
the selection of MCJIT as the engine type to be created. Of particular
interest is the EngineBuilder::setMCJITMemoryManager function. If the
client does not explicitly create a memory manager at this time, a
default memory manager (specifically SectionMemoryManager) will be
created when the MCJIT engine is instantiated.

Once the options have been set, a client calls EngineBuilder::create to
create an instance of the MCJIT engine. If the client does not use the
form of this function that takes a TargetMachine as a parameter, a new
TargetMachine will be created based on the target triple associated with
the Module that was used to create the EngineBuilder.

.. image:: MCJIT-engine-builder.png

EngineBuilder::create will call the static MCJIT::createJIT function,
passing in its pointers to the module, memory manager and target machine
objects, all of which will subsequently be owned by the MCJIT object.

The MCJIT class has a member variable, Dyld, which contains an instance of
the RuntimeDyld wrapper class. This member will be used for
communications between MCJIT and the actual RuntimeDyldImpl object that
gets created when an object is loaded.

.. image:: MCJIT-creation.png

Upon creation, MCJIT holds a pointer to the Module object that it received
from EngineBuilder, but it does not immediately generate code for this
module. Code generation is deferred until either the
MCJIT::finalizeObject method is called explicitly or a function that
requires the code to have been generated, such as
MCJIT::getPointerToFunction, is called.

Code Generation
===============

When code generation is triggered, as described above, MCJIT will first
attempt to retrieve an object image from its ObjectCache member, if one
has been set. If a cached object image cannot be retrieved, MCJIT will
call its emitObject method. MCJIT::emitObject uses a local PassManager
instance and creates a new ObjectBufferStream instance, both of which it
passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
on the Module with which it was created.

.. image:: MCJIT-load.png

The PassManager::run call causes the MC code generation mechanisms to emit
a complete relocatable binary object image (in either ELF or MachO
format, depending on the target) into the ObjectBufferStream object, which
is flushed to complete the process. If an ObjectCache is being used, the
image will be passed to the ObjectCache here.

At this point, the ObjectBufferStream contains the raw object image.
Before the code can be executed, the code and data sections from this
image must be loaded into suitable memory, relocations must be applied,
memory permissions must be set and the code cache must be invalidated (if
required).

Object Loading
==============

Once an object image has been obtained, either through code generation or
having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
be loaded. The RuntimeDyld wrapper class examines the object to determine
its file format and creates an instance of either RuntimeDyldELF or
RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
class) and calls the RuntimeDyldImpl::loadObject method to perform the
actual loading.

.. image:: MCJIT-dyld-load.png

RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
from the ObjectBuffer it received. ObjectImage, which wraps the
ObjectFile class, is a helper class which parses the binary object image
and provides access to the information contained in the format-specific
headers, including section, symbol and relocation information.

RuntimeDyldImpl::loadObject then iterates through the symbols in the
image. Information about common symbols is collected for later use. For
each function or data symbol, the associated section is loaded into memory
and the symbol is stored in a symbol table map data structure. When the
iteration is complete, a section is emitted for the common symbols.

Next, RuntimeDyldImpl::loadObject iterates through the sections in the
object image and for each section iterates through the relocations for
that section. For each relocation, it calls the format-specific
processRelocationRef method, which will examine the relocation and store
it in one of two data structures, a section-based relocation list map and
an external symbol relocation map.

.. image:: MCJIT-load-object.png

When RuntimeDyldImpl::loadObject returns, all of the code and data
sections for the object will have been loaded into memory allocated by the
memory manager and relocation information will have been prepared, but the
relocations have not yet been applied and the generated code is still not
ready to be executed.

[Currently (as of August 2013) the MCJIT engine will immediately apply
relocations when loadObject completes. However, this shouldn't be
happening. Because the code may have been generated for a remote target,
the client should be given a chance to re-map the section addresses before
relocations are applied. It is possible to apply relocations multiple
times, but in the case where addresses are to be re-mapped, this first
application is wasted effort.]

Address Remapping
=================

At any time after initial code has been generated and before
finalizeObject is called, the client can remap the address of sections in
the object. Typically this is done because the code was generated for an
external process and is being mapped into that process' address space.
The client remaps the section address by calling MCJIT::mapSectionAddress.
This should happen before the section memory is copied to its new
location.

When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
RuntimeDyldImpl (via its Dyld member). RuntimeDyldImpl stores the new
address in an internal data structure but does not update the code at this
time, since other sections are likely to change.

When the client is finished remapping section addresses, it will call
MCJIT::finalizeObject to complete the remapping process.

Final Preparations
==================

When MCJIT::finalizeObject is called, MCJIT calls
RuntimeDyld::resolveRelocations. This function will attempt to locate any
external symbols and then apply all relocations for the object.

External symbols are resolved by calling the memory manager's
getPointerToNamedFunction method. The memory manager will return the
address of the requested symbol in the target address space. (Note that
this may not be a valid pointer in the host process.) RuntimeDyld will
then iterate through the list of relocations it has stored which are
associated with this symbol and invoke the resolveRelocation method which,
through a format-specific implementation, will apply the relocation to the
loaded section memory.

Next, RuntimeDyld::resolveRelocations iterates through the list of
sections and for each section iterates through a list of relocations that
have been saved which reference that section and calls resolveRelocation
for each entry in this list. The relocation list here is a list of
relocations for which the symbol associated with the relocation is located
in the section associated with the list. Each of these relocations will
have a target location at which the relocation will be applied that is
likely located in a different section.

.. image:: MCJIT-resolve-relocations.png

Once relocations have been applied as described above, MCJIT calls
RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
passes the section data to the memory manager's registerEHFrames method.
This allows the memory manager to call any desired target-specific
functions, such as registering the EH frame information with a debugger.

Finally, MCJIT calls the memory manager's finalizeMemory method. In this
method, the memory manager will invalidate the target code cache, if
necessary, and apply final permissions to the memory pages it has
allocated for code and data memory.
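
The deferred relocation scheme described in this document (record
relocations at load time, optionally remap section addresses, then apply
relocations at finalization) can be illustrated with a small toy model.
The sketch below is not MCJIT or RuntimeDyld code. ToyLoader, Section,
Relocation and demo are hypothetical names introduced for illustration,
only a 64-bit absolute relocation is modeled, the addend is assumed to
already include the symbol's offset within its section, and all addresses
are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Toy model only. None of these types are LLVM classes.
struct Section {
  std::vector<std::uint8_t> Bytes; // host-side copy of the section contents
  std::uint64_t LoadAddress;       // address the section has in the target
};

struct Relocation {
  unsigned SectionID;   // section containing the location to patch
  std::uint64_t Offset; // offset of that location within the section
  std::int64_t Addend;  // constant added to the resolved address
};

class ToyLoader {
public:
  unsigned addSection(std::size_t Size, std::uint64_t LoadAddress) {
    Sections.push_back(Section{std::vector<std::uint8_t>(Size, 0), LoadAddress});
    return static_cast<unsigned>(Sections.size() - 1);
  }

  // Load time: record the relocation, keyed by the section that holds the
  // referenced symbol. Nothing is patched yet.
  void addSectionRelocation(unsigned SymbolSectionID, const Relocation &R) {
    SectionRelocs[SymbolSectionID].push_back(R);
  }

  // Load time: record a relocation against a symbol defined elsewhere.
  void addExternalRelocation(const std::string &Symbol, const Relocation &R) {
    ExternalRelocs[Symbol].push_back(R);
  }

  // Remapping only stores the new address; section bytes are untouched.
  void mapSectionAddress(unsigned ID, std::uint64_t NewAddress) {
    Sections.at(ID).LoadAddress = NewAddress;
  }

  // Finalization: resolve external symbols first, then walk the
  // per-section lists, patching with the final (possibly remapped) addresses.
  template <typename Resolver> void resolveRelocations(Resolver Lookup) {
    for (const auto &Entry : ExternalRelocs) {
      std::uint64_t Address = Lookup(Entry.first);
      for (const Relocation &R : Entry.second)
        apply(R, Address);
    }
    for (const auto &Entry : SectionRelocs) {
      std::uint64_t Address = Sections.at(Entry.first).LoadAddress;
      for (const Relocation &R : Entry.second)
        apply(R, Address);
    }
  }

  std::uint64_t read64(unsigned ID, std::uint64_t Offset) const {
    const Section &S = Sections.at(ID);
    assert(Offset + sizeof(std::uint64_t) <= S.Bytes.size());
    std::uint64_t Value;
    std::memcpy(&Value, S.Bytes.data() + Offset, sizeof(Value));
    return Value;
  }

private:
  // Writes (SymbolBase + Addend) as a 64-bit value at the patch location.
  void apply(const Relocation &R, std::uint64_t SymbolBase) {
    Section &S = Sections.at(R.SectionID);
    assert(R.Offset + sizeof(std::uint64_t) <= S.Bytes.size());
    std::uint64_t Value = SymbolBase + static_cast<std::uint64_t>(R.Addend);
    std::memcpy(S.Bytes.data() + R.Offset, &Value, sizeof(Value));
  }

  std::vector<Section> Sections;
  std::map<unsigned, std::vector<Relocation>> SectionRelocs;
  std::map<std::string, std::vector<Relocation>> ExternalRelocs;
};

// Walks through load, remap and finalize; returns the two patched words.
std::vector<std::uint64_t> demo() {
  ToyLoader L;
  unsigned Text = L.addSection(16, 0x1000);
  unsigned Data = L.addSection(16, 0x2000);
  L.addSectionRelocation(Data, {Text, 0, 8});    // text+0 -> data symbol at +8
  L.addExternalRelocation("puts", {Text, 8, 0}); // text+8 -> external symbol
  L.mapSectionAddress(Data, 0x7000);             // client remaps before finalizing
  L.resolveRelocations([](const std::string &Name) -> std::uint64_t {
    return Name == "puts" ? 0xABCD00 : 0;        // stand-in symbol resolver
  });
  return {L.read64(Text, 0), L.read64(Text, 8)};
}
```

In this sketch, demo() returns 0x7008 and 0xABCD00. The section-based
relocation picks up the remapped data section address rather than the
original 0x2000, which is why any remapping has to be complete before
relocations are applied for the last time.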