===============================
MCJIT Design and Implementation
===============================

Introduction
============

This document describes the internal workings of the MCJIT execution
engine and the RuntimeDyld component.  It is intended as a high-level
overview of the implementation, showing the flow and interactions of
objects throughout the code generation and dynamic loading process.

Engine Creation
===============

In most cases, an EngineBuilder object is used to create an instance of
the MCJIT execution engine.  The EngineBuilder takes an llvm::Module
object as an argument to its constructor.  The client may then set various
options that will later be passed along to the MCJIT engine,
including the selection of MCJIT as the engine type to be created.
Of particular interest is the EngineBuilder::setMCJITMemoryManager
function.  If the client does not explicitly create a memory manager at
this time, a default memory manager (specifically SectionMemoryManager)
will be created when the MCJIT engine is instantiated.

Once the options have been set, a client calls EngineBuilder::create to
create an instance of the MCJIT engine.  If the client does not use the
form of this function that takes a TargetMachine as a parameter, a new
TargetMachine will be created based on the target triple associated with
the Module that was used to create the EngineBuilder.

.. image:: MCJIT-engine-builder.png

EngineBuilder::create will call the static MCJIT::createJIT function,
passing in its pointers to the module, memory manager and target machine
objects, all of which will subsequently be owned by the MCJIT object.
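
The creation flow described above can be modeled with a small,
self-contained sketch.  The types below (``ToyModule``, ``ToyMemMgr``,
``ToyTargetMachine``, ``ToyEngineBuilder`` and ``ToyEngine``) are
hypothetical stand-ins, not LLVM classes.  They only illustrate how missing
components are defaulted and how ownership passes to the engine.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for llvm::Module, the memory manager and
// TargetMachine.  None of these are LLVM types.
struct ToyModule { std::string TargetTriple; };
struct ToyMemMgr { std::string Kind; };
struct ToyTargetMachine { std::string Triple; };

// Stand-in for the MCJIT object: it owns everything it is handed.
struct ToyEngine {
  std::unique_ptr<ToyModule> M;
  std::unique_ptr<ToyMemMgr> MM;
  std::unique_ptr<ToyTargetMachine> TM;
};

class ToyEngineBuilder {
  std::unique_ptr<ToyModule> M;
  std::unique_ptr<ToyMemMgr> MM;  // optional, set by the client

public:
  explicit ToyEngineBuilder(std::unique_ptr<ToyModule> Mod)
      : M(std::move(Mod)) {}

  ToyEngineBuilder &setMemoryManager(std::unique_ptr<ToyMemMgr> Mgr) {
    MM = std::move(Mgr);
    return *this;
  }

  // TM may be null; in that case one is derived from the module's triple.
  // A missing memory manager is defaulted at engine creation time.
  std::unique_ptr<ToyEngine> create(
      std::unique_ptr<ToyTargetMachine> TM = nullptr) {
    if (!TM)
      TM.reset(new ToyTargetMachine{M->TargetTriple});
    if (!MM)
      MM.reset(new ToyMemMgr{"section-default"});
    return std::unique_ptr<ToyEngine>(
        new ToyEngine{std::move(M), std::move(MM), std::move(TM)});
  }
};
```

After ``create`` returns, the builder no longer holds any of the three
objects; the engine is their sole owner.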

The MCJIT class has a member variable, Dyld, which contains an instance of
the RuntimeDyld wrapper class.  This member will be used for
communications between MCJIT and the actual RuntimeDyldImpl object that
gets created when an object is loaded.

.. image:: MCJIT-creation.png

Upon creation, MCJIT holds a pointer to the Module object that it received
from EngineBuilder, but it does not immediately generate code for this
module.  Code generation is deferred until either the
MCJIT::finalizeObject method is called explicitly or a function that
requires the code to have been generated, such as
MCJIT::getPointerToFunction, is called.
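
The deferred behavior can be sketched as follows.  ``LazyEngine`` is a
hypothetical model, not the MCJIT class; the symbol name and the address
``0x1000`` are made up for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical model of lazy code generation; not the MCJIT class itself.
class LazyEngine {
  bool Generated = false;
  int GenerateCount = 0;
  std::map<std::string, int> Addresses;  // symbol name -> fake address

  void generateIfNeeded() {
    if (Generated)
      return;
    // Stand-in for emitting and loading the object image.
    Addresses["main"] = 0x1000;
    Generated = true;
    ++GenerateCount;
  }

public:
  // Explicit trigger, analogous to finalizeObject.
  void finalizeObject() { generateIfNeeded(); }

  // Implicit trigger, analogous to getPointerToFunction.
  // Returns 0 for unknown names.
  int getAddress(const std::string &Name) {
    generateIfNeeded();
    auto It = Addresses.find(Name);
    return It == Addresses.end() ? 0 : It->second;
  }

  bool isGenerated() const { return Generated; }
  int generateCount() const { return GenerateCount; }
};
```

Whichever trigger comes first does the work; later calls find the code
already generated and do not repeat it.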

Code Generation
===============

When code generation is triggered, as described above, MCJIT will first
attempt to retrieve an object image from its ObjectCache member, if one
has been set.  If a cached object image cannot be retrieved, MCJIT will
call its emitObject method.  MCJIT::emitObject uses a local PassManager
instance and creates a new ObjectBufferStream instance, both of which it
passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
on the Module with which it was created.

.. image:: MCJIT-load.png

The PassManager::run call causes the MC code generation mechanisms to emit
a complete relocatable binary object image (in either ELF or MachO
format, depending on the target) into the ObjectBufferStream object, which
is flushed to complete the process.  If an ObjectCache is being used, the
image will be passed to the ObjectCache here.
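
The cache-first flow above can be sketched with standard-library types
only.  ``ToyObjectCache``, ``ToyCodeGen`` and ``obtainObject`` are
hypothetical names chosen for this illustration and do not reproduce the
ObjectCache interface.

```cpp
#include <map>
#include <string>

// Hypothetical object cache keyed by module name; not LLVM's ObjectCache.
struct ToyObjectCache {
  std::map<std::string, std::string> Images;

  // Returns true and fills Out when an image is cached for Name.
  bool get(const std::string &Name, std::string &Out) const {
    auto It = Images.find(Name);
    if (It == Images.end())
      return false;
    Out = It->second;
    return true;
  }

  void notifyCompiled(const std::string &Name, const std::string &Image) {
    Images[Name] = Image;
  }
};

// Stand-in for the code generator; counts how often it runs.
struct ToyCodeGen {
  int Runs = 0;
  std::string emit(const std::string &Name) {
    ++Runs;
    return "object-image-for-" + Name;
  }
};

// Try the cache first; on a miss, emit the object and hand it to the cache.
// Cache may be null, which models an engine with no cache set.
std::string obtainObject(const std::string &Name, ToyObjectCache *Cache,
                         ToyCodeGen &CG) {
  std::string Image;
  if (Cache && Cache->get(Name, Image))
    return Image;
  Image = CG.emit(Name);
  if (Cache)
    Cache->notifyCompiled(Name, Image);
  return Image;
}
```

With a cache attached, a second request for the same module returns the
stored image without running the code generator again.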

At this point, the ObjectBufferStream contains the raw object image.
Before the code can be executed, the code and data sections from this
image must be loaded into suitable memory, relocations must be applied,
memory permissions must be set and code cache invalidation (if required)
must be completed.

Object Loading
==============

Once an object image has been obtained, either through code generation or
having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
be loaded.  The RuntimeDyld wrapper class examines the object to determine
its file format and creates an instance of either RuntimeDyldELF or
RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
class) and calls the RuntimeDyldImpl::loadObject method to perform the
actual loading.
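
As a simplified illustration of format-based dispatch, the sketch below
classifies an image by its leading magic bytes and creates a matching
loader.  The ``ToyLoader`` hierarchy and both functions are hypothetical;
this is not how RuntimeDyld itself is written, and universal (fat) Mach-O
files are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

enum class ObjectFormat { ELF, MachO, Unknown };

// Classify an object image by its leading magic bytes.
ObjectFormat detectFormat(const std::uint8_t *Data, std::size_t Size) {
  if (Size < 4)
    return ObjectFormat::Unknown;
  // ELF images start with 0x7F 'E' 'L' 'F'.
  if (Data[0] == 0x7F && Data[1] == 'E' && Data[2] == 'L' && Data[3] == 'F')
    return ObjectFormat::ELF;
  // Mach-O magic is 0xFEEDFACE (32-bit) or 0xFEEDFACF (64-bit), stored in
  // either byte order, so read the first word both ways.
  std::uint32_t BE = (std::uint32_t(Data[0]) << 24) |
                     (std::uint32_t(Data[1]) << 16) |
                     (std::uint32_t(Data[2]) << 8) | std::uint32_t(Data[3]);
  std::uint32_t LE = (std::uint32_t(Data[3]) << 24) |
                     (std::uint32_t(Data[2]) << 16) |
                     (std::uint32_t(Data[1]) << 8) | std::uint32_t(Data[0]);
  if (BE == 0xFEEDFACE || BE == 0xFEEDFACF ||
      LE == 0xFEEDFACE || LE == 0xFEEDFACF)
    return ObjectFormat::MachO;
  return ObjectFormat::Unknown;
}

// Hypothetical loader hierarchy mirroring RuntimeDyldImpl and its subclasses.
struct ToyLoader {
  virtual ~ToyLoader() {}
  virtual std::string name() const = 0;
};
struct ToyELFLoader : ToyLoader {
  std::string name() const override { return "ELF"; }
};
struct ToyMachOLoader : ToyLoader {
  std::string name() const override { return "MachO"; }
};

// Returns null when the format is not recognized.
std::unique_ptr<ToyLoader> createLoader(const std::uint8_t *Data,
                                        std::size_t Size) {
  switch (detectFormat(Data, Size)) {
  case ObjectFormat::ELF:
    return std::unique_ptr<ToyLoader>(new ToyELFLoader());
  case ObjectFormat::MachO:
    return std::unique_ptr<ToyLoader>(new ToyMachOLoader());
  case ObjectFormat::Unknown:
    break;
  }
  return nullptr;
}
```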

.. image:: MCJIT-dyld-load.png

RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
from the ObjectBuffer it received.  ObjectImage, which wraps the
ObjectFile class, is a helper class which parses the binary object image
and provides access to the information contained in the format-specific
headers, including section, symbol and relocation information.

RuntimeDyldImpl::loadObject then iterates through the symbols in the
image.  Information about common symbols is collected for later use.  For
each function or data symbol, the associated section is loaded into memory
and the symbol is stored in a symbol table map data structure.  When the
iteration is complete, a section is emitted for the common symbols.

Next, RuntimeDyldImpl::loadObject iterates through the sections in the
object image and for each section iterates through the relocations for
that section.  For each relocation, it calls the format-specific
processRelocationRef method, which will examine the relocation and store
it in one of two data structures, a section-based relocation list map and
an external symbol relocation map.
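
The two relocation stores can be pictured with the following sketch.  All
type and function names here are hypothetical, and folding the symbol's
offset into the addend is a simplifying assumption made for this example.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical relocation record: where to patch and what to add.
struct RelocEntry {
  unsigned SectionID;    // section containing the location to patch
  std::uint64_t Offset;  // offset of that location within the section
  std::int64_t Addend;   // constant added to the resolved address
};

struct RelocTables {
  // Keyed by the ID of the section that holds the referenced symbol.
  std::map<unsigned, std::vector<RelocEntry>> BySection;
  // Keyed by the name of a symbol not defined in the loaded object.
  std::map<std::string, std::vector<RelocEntry>> ByExternalSymbol;
};

// Local symbol table entry: which section a symbol lives in, and where.
struct SymbolLoc {
  unsigned SectionID;
  std::uint64_t Offset;
};
typedef std::map<std::string, SymbolLoc> SymbolTable;

// File a relocation against Target in the appropriate table.
void recordRelocation(const SymbolTable &Symbols, const std::string &Target,
                      RelocEntry RE, RelocTables &Tables) {
  auto It = Symbols.find(Target);
  if (It == Symbols.end()) {
    // Unknown locally: defer until external symbols are resolved.
    Tables.ByExternalSymbol[Target].push_back(RE);
    return;
  }
  // Fold the symbol's offset into the addend so the entry becomes relative
  // to the start of the symbol's section.
  RE.Addend += static_cast<std::int64_t>(It->second.Offset);
  Tables.BySection[It->second.SectionID].push_back(RE);
}
```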

.. image:: MCJIT-load-object.png

When RuntimeDyldImpl::loadObject returns, all of the code and data
sections for the object will have been loaded into memory allocated by the
memory manager and relocation information will have been prepared, but the
relocations have not yet been applied and the generated code is still not
ready to be executed.

[Currently (as of August 2013) the MCJIT engine will immediately apply
relocations when loadObject completes.  However, this shouldn't be
happening.  Because the code may have been generated for a remote target,
the client should be given a chance to re-map the section addresses before
relocations are applied.  It is possible to apply relocations multiple
times, but in the case where addresses are to be re-mapped, this first
application is wasted effort.]

Address Remapping
=================

At any time after initial code has been generated and before
finalizeObject is called, the client can remap the address of sections in
the object.  Typically this is done because the code was generated for an
external process and is being mapped into that process's address space.
The client remaps the section address by calling MCJIT::mapSectionAddress.
This should happen before the section memory is copied to its new
location.

When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
RuntimeDyldImpl (via its Dyld member).  RuntimeDyldImpl stores the new
address in an internal data structure but does not update the code at this
time, since other sections are likely to change.

When the client is finished remapping section addresses, it will call
MCJIT::finalizeObject to complete the remapping process.
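
The record-now, patch-later behavior can be sketched as below.
``ToyDyld`` is a hypothetical model that identifies sections by index,
which is an assumption made for brevity, and all addresses are made up.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of deferred address remapping; not RuntimeDyldImpl.
class ToyDyld {
  std::vector<std::uint64_t> LoadAddress;  // indexed by section ID
  int PatchPasses = 0;                     // how often code was rewritten

public:
  // Registers a section at its local address and returns its ID.
  unsigned addSection(std::uint64_t LocalAddress) {
    LoadAddress.push_back(LocalAddress);
    return static_cast<unsigned>(LoadAddress.size() - 1);
  }

  // Record the new address only; no section contents are touched here.
  void mapSectionAddress(unsigned ID, std::uint64_t TargetAddress) {
    LoadAddress.at(ID) = TargetAddress;
  }

  // Stand-in for the relocation pass that runs at finalization.
  void resolveRelocations() { ++PatchPasses; }

  std::uint64_t loadAddress(unsigned ID) const { return LoadAddress.at(ID); }
  int patchPasses() const { return PatchPasses; }
};
```

Any number of sections can be remapped before finalization, and the code is
rewritten once against the final set of addresses.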

Final Preparations
==================

When MCJIT::finalizeObject is called, MCJIT calls
RuntimeDyld::resolveRelocations.  This function will attempt to locate any
external symbols and then apply all relocations for the object.

External symbols are resolved by calling the memory manager's
getPointerToNamedFunction method.  The memory manager will return the
address of the requested symbol in the target address space.  (Note, this
may not be a valid pointer in the host process.)  RuntimeDyld will then
iterate through the list of relocations it has stored which are associated
with this symbol and invoke the resolveRelocation method which, through a
format-specific implementation, will apply the relocation to the loaded
section memory.

Next, RuntimeDyld::resolveRelocations iterates through the list of
sections and, for each section, iterates through the saved list of
relocations that reference that section, calling resolveRelocation for
each entry in this list.  The relocation list here is a list of
relocations for which the symbol associated with the relocation is located
in the section associated with the list.  Each of these relocations will
have a target location at which the relocation will be applied that is
likely located in a different section.
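
The two passes described above can be sketched with a single relocation
kind.  This is a minimal illustration under stated assumptions: every
relocation is an absolute 64-bit address written in host byte order, and
all types and functions below are hypothetical rather than RuntimeDyld's
own.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct ToySection {
  std::vector<std::uint8_t> Bytes;  // local copy of the section contents
  std::uint64_t LoadAddress = 0;    // address of the section in the target
};

struct RelocEntry {
  unsigned SectionID;    // section containing the location to patch
  std::uint64_t Offset;  // offset of that location within the section
  std::int64_t Addend;   // constant added to the resolved address
};

// Patch one location with an absolute 64-bit value (host byte order).
void resolveRelocation(std::vector<ToySection> &Sections,
                       const RelocEntry &RE, std::uint64_t Value) {
  std::uint64_t Final = Value + static_cast<std::uint64_t>(RE.Addend);
  ToySection &S = Sections.at(RE.SectionID);
  if (RE.Offset + sizeof(Final) > S.Bytes.size())
    throw std::out_of_range("relocation outside section");
  std::memcpy(&S.Bytes[static_cast<std::size_t>(RE.Offset)], &Final,
              sizeof(Final));
}

typedef std::function<std::uint64_t(const std::string &)> SymbolResolver;

void resolveRelocations(
    std::vector<ToySection> &Sections,
    const std::map<std::string, std::vector<RelocEntry>> &External,
    const std::map<unsigned, std::vector<RelocEntry>> &BySection,
    const SymbolResolver &Resolve) {
  // External symbols first: ask the resolver for each target address.
  for (const auto &KV : External) {
    std::uint64_t Addr = Resolve(KV.first);
    for (const RelocEntry &RE : KV.second)
      resolveRelocation(Sections, RE, Addr);
  }
  // Then relocations whose symbol lives in a loaded section.
  for (const auto &KV : BySection) {
    std::uint64_t Addr = Sections.at(KV.first).LoadAddress;
    for (const RelocEntry &RE : KV.second)
      resolveRelocation(Sections, RE, Addr);
  }
}
```

Note that the patched location and the referenced section are independent:
an entry filed under one section typically writes into another.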

.. image:: MCJIT-resolve-relocations.png

Once relocations have been applied as described above, MCJIT calls
RuntimeDyld::getEHFrameSection and, if a non-zero result is returned,
passes the section data to the memory manager's registerEHFrames method.
This allows the memory manager to call any desired target-specific
functions, such as registering the EH frame information with a debugger.

Finally, MCJIT calls the memory manager's finalizeMemory method.  In this
method, the memory manager will invalidate the target code cache, if
necessary, and apply final permissions to the memory pages it has
allocated for code and data memory.
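
The permission change at finalization can be modeled as follows.
``PermissionTrackingManager`` is a hypothetical class that only records
the intended permissions; it performs no operating system calls and does
not invalidate any cache.

```cpp
#include <cstddef>
#include <vector>

enum class Perm { ReadWrite, ReadExecute, ReadOnly };

// Hypothetical memory manager that tracks permissions per allocation.
class PermissionTrackingManager {
  struct Block {
    bool IsCode;
    bool IsReadOnly;
    Perm P;
  };
  std::vector<Block> Blocks;
  bool Finalized = false;

public:
  // Sections stay writable while they are loaded and relocated.
  std::size_t allocateCodeSection() {
    Blocks.push_back(Block{true, false, Perm::ReadWrite});
    return Blocks.size() - 1;
  }

  std::size_t allocateDataSection(bool IsReadOnly) {
    Blocks.push_back(Block{false, IsReadOnly, Perm::ReadWrite});
    return Blocks.size() - 1;
  }

  // Apply final permissions.  A real implementation would also invalidate
  // the instruction cache for code blocks where the target requires it.
  void finalizeMemory() {
    for (Block &B : Blocks) {
      if (B.IsCode)
        B.P = Perm::ReadExecute;
      else if (B.IsReadOnly)
        B.P = Perm::ReadOnly;
    }
    Finalized = true;
  }

  Perm permission(std::size_t ID) const { return Blocks.at(ID).P; }
  bool finalized() const { return Finalized; }
};
```

Deferring the permission change until this point is what allows
relocations to be written into code sections beforehand.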