	-------------------------------------------------
	Building EFI Applications Using the GNU Toolchain
	-------------------------------------------------

		David Mosberger <davidm@hpl.hp.com>

			23 September 1999


		Copyright (c) 1999-2007 Hewlett-Packard Co.
		Copyright (c) 2006-2010 Intel Co.

Last update: 04/09/2007
 
* Introduction

This document has two parts: the first part describes how to develop
EFI applications for IA-64, x86, and x86_64 using the GNU toolchain and
the EFI development environment contained in this directory.  The
second part describes some of the more subtle aspects of how this
development environment works.
 
 
 
* Part 1: Developing EFI Applications


** Prerequisites:

 To develop x86 and x86_64 EFI applications, the following tools are needed:

	- gcc-3.0 or newer (gcc 2.7.2 is NOT sufficient!)
	  As of gnu-efi-3.0b, the Red Hat 8.0 toolchain is known to work,
	  but the Red Hat 9.0 toolchain is not currently supported.

	- A version of "objcopy" that supports EFI applications.  To
	  check if your version includes EFI support, issue the
	  command:

		objcopy --help

	  and verify that the line "supported targets" contains the
	  strings "efi-app-ia32" and "efi-app-x86_64".  The binutils
	  release binutils-2.17.50.0.14 supports Intel64 EFI.

	- For debugging purposes, it's useful to have a version of
	  "objdump" that supports EFI applications as well.  This
	  allows you to inspect and disassemble EFI binaries.
 
 To develop IA-64 EFI applications, the following tools are needed:

	- A version of gcc newer than July 30th 1999 (older versions
	  had problems with generating position independent code).
	  As of gnu-efi-3.0b, gcc-3.1 is known to work well.

	- A version of "objcopy" that supports EFI applications.  To
	  check if your version includes EFI support, issue the
	  command:

		objcopy --help

	  and verify that the line "supported targets" contains the
	  string "efi-app-ia64".

	- For debugging purposes, it's useful to have a version of
	  "objdump" that supports EFI applications as well.  This
	  allows you to inspect and disassemble EFI binaries.
 
 
** Directory Structure

This EFI development environment contains the following
subdirectories:

 inc:   This directory contains the EFI-related include files.  The
	files are taken from Intel's EFI source distribution, except
	that various fixes were applied to make them compile with the
	GNU toolchain.

 lib:   This directory contains the source code for Intel's EFI library.
	Again, the files are taken from Intel's EFI source
	distribution, with changes to make them compile with the GNU
	toolchain.

 gnuefi: This directory contains the glue necessary to convert ELF64
	binaries to EFI binaries.  Various runtime code bits, such as
	a self-relocator, are included as well.  This code has been
	contributed by the Hewlett-Packard Company and is distributed
	under the GNU GPL.

 apps:	This directory contains a few simple EFI test apps.
 
** Setup

It is necessary to edit the Makefile in the directory containing this
README file before EFI applications can be built.  Specifically, you
should verify that macros CC, AS, LD, AR, RANLIB, and OBJCOPY point to
the appropriate compiler, assembler, linker, ar, ranlib, and objcopy
binaries, respectively.

If you're working in a cross-development environment, be sure to set
macro ARCH to the desired target architecture ("ia32" for x86, "x86_64"
for x86_64, and "ia64" for IA-64).  For convenience, this can also be
done from the make command line (e.g., "make ARCH=ia64").
 
 
** Building

To build the sample EFI applications provided in subdirectory "apps",
simply invoke "make" in the top-level directory (the directory
containing this README file).  This should build lib/libefi.a and
gnuefi/libgnuefi.a first and then all the EFI applications, such as
apps/t6.efi.
 
 
** Running

Just copy the EFI application (e.g., apps/t6.efi) to the EFI
filesystem, boot EFI, and then select "Invoke EFI application" to run
the application you want to test.  Alternatively, you can invoke the
Intel-provided "nshell" application and then invoke your test binary
via the command line interface that "nshell" provides.
 
 
** Writing Your Own EFI Application

Suppose you have your own EFI application in a file called
"apps/myefiapp.c".  To get this application built by the GNU EFI build
environment, simply add "myefiapp.efi" to macro TARGETS in
apps/Makefile.  Once this is done, invoke "make" in the top-level
directory.  This should result in EFI application apps/myefiapp.efi,
ready for execution.

The GNU EFI build environment lets you write EFI applications as
described in Intel's EFI documentation, except for the following
differences:

 - The EFI application's entry point is always called "efi_main".  The
   declaration of this routine is:

    EFI_STATUS efi_main (EFI_HANDLE image, EFI_SYSTEM_TABLE *systab);

 - UNICODE string literals must be written as W2U(L"Sample String")
   instead of just L"Sample String".  The W2U() macro is defined in
   <efilib.h>.  This header file also declares the function W2UCpy(),
   which converts a wide string into a UNICODE string and stores the
   result in a programmer-supplied buffer.

 - Calls to EFI services should be made via uefi_call_wrapper(). This
   ensures appropriate parameter passing for the architecture.
 
 
* Part 2: Inner Workings

WARNING: This part contains all the gory detail of how the GNU EFI
toolchain works.  Normal users do not have to worry about such
details.  Reading this part incurs a definite risk of inducing severe
headaches or other maladies.

The basic idea behind the GNU EFI build environment is to use the GNU
toolchain to build a normal ELF binary that, at the end, is converted
to an EFI binary.  EFI binaries are really just PE32+ binaries.  PE
stands for "Portable Executable" and is the object file format
Microsoft is using on its Windows platforms.  PE is basically the COFF
object file format with an MS-DOS 2.0-compatible header slapped on in
front of it.  The "32" in PE32+ stands for 32 bits, meaning that PE32
is a 32-bit object file format.  The plus in "PE32+" indicates that
this format has been hacked to allow loading a 4GB binary anywhere in
a 64-bit address space (unlike ELF64, however, this is not a full
64-bit object file format because the entire binary cannot span more
than 4GB of address space).  EFI binaries are plain PE32+ binaries
except that the "subsystem id" differs from normal Windows binaries.
There are two flavors of EFI binaries, "applications" and "drivers";
each has its own subsystem id, and they are otherwise identical.  At
present, the GNU EFI build environment supports the building of EFI
applications only, though it would be trivial to generate drivers, as
the only difference is the subsystem id.  For more details on PE32+,
see the spec at

	http://msdn.microsoft.com/library/specs/msdn_pecoff.htm.
 
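The header layout described above can be sketched in C.  This is a
minimal, host-side illustration and not part of the GNU EFI build
environment: the names pe_subsystem(), rd16(), rd32(), and the PE_*
macros are made up for this example.  The field offsets and the
subsystem value 10 for EFI applications are taken from the PE/COFF
spec; the sketch checks only the two signatures and validates nothing
else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the PE/COFF spec; the macro names are hypothetical. */
#define PE_OFF_LFANEW         0x3c  /* MS-DOS header: offset of "PE\0\0" */
#define PE_COFF_HDR_SIZE      20    /* size of the COFF file header */
#define PE_OFF_SUBSYSTEM      68    /* Subsystem field in optional header */
#define PE_SUBSYSTEM_EFI_APP  10    /* subsystem id of an EFI application */

/* Read little-endian 16-bit and 32-bit values from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the subsystem id of the PE image held in buf[0..len), or -1
 * if the buffer does not start with an MS-DOS header followed by a
 * PE signature.
 */
int pe_subsystem(const uint8_t *buf, size_t len)
{
    uint32_t pe_off;
    size_t opt_off;

    assert(buf != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                          /* no MS-DOS header */

    pe_off = rd32(buf + PE_OFF_LFANEW);
    if (pe_off > len ||
        len - pe_off < 4 + PE_COFF_HDR_SIZE + PE_OFF_SUBSYSTEM + 2)
        return -1;                          /* headers do not fit */
    if (buf[pe_off] != 'P' || buf[pe_off + 1] != 'E' ||
        buf[pe_off + 2] != 0 || buf[pe_off + 3] != 0)
        return -1;                          /* no PE signature */

    /* The optional header follows the signature and the COFF header. */
    opt_off = (size_t)pe_off + 4 + PE_COFF_HDR_SIZE;
    return rd16(buf + opt_off + PE_OFF_SUBSYSTEM);
}
```

A tool would read the .efi file into memory and call
pe_subsystem(buf, len); a result equal to PE_SUBSYSTEM_EFI_APP means
the image is marked as an EFI application.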
 
In theory, converting a suitable ELF64 binary to PE32+ is easy and
could be accomplished with the "objcopy" utility by specifying option
--target=efi-app-ia32 (x86), --target=efi-app-x86_64 (x86_64), or
--target=efi-app-ia64 (IA-64).  But life never is that easy, so here
are some complicating factors:

 (1) COFF sections are very different from ELF sections.

	ELF binaries distinguish between program headers and sections.
	The program headers describe the memory segments that need to
	be loaded/initialized, whereas the sections describe what
	constitutes those segments.  In COFF (and therefore PE32+) no
	such distinction is made.  Thus, COFF sections need to be page
	aligned and have a size that is a multiple of the page size
	(4KB for EFI), whereas ELF allows sections at arbitrary
	addresses and with arbitrary sizes.

 (2) EFI binaries should be relocatable.

	Since EFI binaries are executed in physical mode, EFI cannot
	guarantee that a given binary can be loaded at its preferred
	address.  EFI does _try_ to load a binary at its preferred
	address, but if it can't do so, it will load it at another
	address and then relocate the binary using the contents of the
	.reloc section.

 (3) On IA-64, the EFI entry point needs to point to a function
     descriptor, not to the code address of the entry point.

 (4) The EFI specification assumes that wide characters use UNICODE
     encoding.

	ANSI C does not specify the size or encoding that a wide
	character uses.  These choices are "implementation defined".
	On most UNIX systems, the GNU toolchain uses a wchar_t that is
	4 bytes in size.  The encoding used for such characters is
	(mostly) UCS4.

In the following sections, we describe how the GNU EFI build
environment addresses each of these issues.
 
 
** (1) Accommodating COFF Sections

In order to satisfy the COFF constraint of page-sized and page-aligned
sections, the GNU EFI build environment uses the special linker script
in gnuefi/elf_$(ARCH)_efi.lds where $(ARCH) is the target architecture
("ia32" for x86, "x86_64" for x86_64, and "ia64" for IA-64).  This
script is set up to create only eight COFF sections, each page aligned
and page sized.  These eight sections are used to group together the
much greater number of sections that are typically present in ELF
object files.  Specifically:

 .hash
	Collects the ELF .hash info (this section _must_ be the first
	section in order to build a shared object file; the section is
	not actually loaded or used at runtime).

 .text
	Collects all sections containing executable code.

 .data
	Collects read-only and read-write data, literal string data,
	global offset tables, the uninitialized data segment (bss) and
	various other sections containing data.

	The reason read-only data is placed here instead of in .text
	is to make it possible to disassemble the .text section
	without getting garbage due to read-only data.  Besides, since
	EFI binaries execute in physical mode, differences in page
	protection do not matter.

	The reason the uninitialized data is placed in this section is
	that the EFI loader appears to be unable to handle sections
	that are allocated but not loaded from the binary.

 .dynamic, .dynsym, .rela, .rel, .reloc
	These sections contain the dynamic information necessary to
	self-relocate the binary (see below).
 
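The page-size constraint boils down to rounding every section size
(and start address) up to a multiple of 4KB.  A minimal sketch in C,
assuming the 4KB page size mentioned earlier; the names align_up(),
section_padding(), and EFI_PAGE_BYTES are hypothetical and do not
appear in the build environment, where the linker script is what
enforces the alignment:

```c
#include <assert.h>
#include <stdint.h>

#define EFI_PAGE_BYTES 4096u    /* 4KB page size, as stated in the text */

/* Round value up to the next multiple of align (a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed to make a section page sized. */
uint64_t section_padding(uint64_t size)
{
    return align_up(size, EFI_PAGE_BYTES) - size;
}
```

For example, a section holding 0x1234 bytes of data would occupy two
pages (0x2000 bytes) in the resulting image.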
 
A few more points worth noting about the linker script:

 o On IA-64, the global pointer symbol (__gp) needs to be placed such
   that the _entire_ EFI binary can be addressed using the signed
   22-bit offset that the "addl" instruction affords.  Specifically,
   this means that __gp should be placed at ImageBase + 0x200000.
   Strictly speaking, only a couple of symbols need to be addressable
   in this fashion, so with some care it should be possible to build
   binaries much larger than 4MB.  To get a list of symbols that need
   to be addressable in this fashion, grep the assembly files in
   directory gnuefi for the string "@gprel".

 o The link address (ImageBase) of the binary is (arbitrarily) set to
   zero.  This could be set to something larger to increase the chance
   of EFI being able to load the binary without requiring relocation.
   However, a start address of 0 makes debugging a wee bit easier
   (great for those of us who can add, but not subtract... ;-).

 o The relocation-related sections (.dynamic, .rel, .rela, .reloc)
   cannot be placed inside .data because some tools in the GNU
   toolchain rely on the existence of these sections.

 o Some sections in the ELF binary intentionally get dropped when
   building the EFI binary.  Particularly noteworthy are the dynamic
   relocation sections for the .plabel and .reloc sections.  It would
   be _wrong_ to include these sections in the EFI binary because it
   would result in .reloc and .plabel being relocated twice (once by
   the EFI loader and once by the self-relocator; see below for a
   description of the latter).  Specifically, only the sections
   mentioned with the -j option in the final "objcopy" command are
   retained in the EFI binary (see apps/Makefile).
 
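The arithmetic behind the __gp placement in the first point above can
be checked with a few lines of C.  This is only a sketch of the
address calculation: a signed 22-bit offset covers 2MB on either side
of __gp, so placing __gp at ImageBase + 0x200000 covers the first 4MB
of the image.  The names gp_reachable(), gp_for_image(), and
ADDL_IMM_BITS are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ADDL_IMM_BITS 22    /* width of the signed "addl" immediate */

/* Return 1 if addr can be reached from gp with a signed 22-bit offset. */
int gp_reachable(uint64_t gp, uint64_t addr)
{
    const int64_t half = (int64_t)1 << (ADDL_IMM_BITS - 1);   /* 2MB */
    int64_t off = (int64_t)(addr - gp);

    return off >= -half && off < half;
}

/* The placement described in the text: the middle of a 4MB window. */
uint64_t gp_for_image(uint64_t image_base)
{
    assert(image_base <= UINT64_MAX - 0x200000);
    return image_base + 0x200000;
}
```

With ImageBase set to zero, addresses 0 through 0x3fffff are reachable
and address 0x400000 is the first one that is not.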
 
** (2) Building Relocatable Binaries

ELF binaries are normally linked for a fixed load address and are thus
not relocatable.  The only kind of ELF object that is relocatable is
the shared object ("shared library").  However, even those objects are
usually not completely position independent and therefore require
runtime relocation by the dynamic loader.  For example, IA-64 binaries
normally require relocation of the global offset table.

The approach to building relocatable binaries in the GNU EFI build
environment is to:

 (a) build an ELF shared object

 (b) link it together with a self-relocator that takes care of
     applying the dynamic relocations that may be present in the
     ELF shared object

 (c) convert the resulting image to an EFI binary

The self-relocator is of course architecture dependent.  The x86
version can be found in gnuefi/reloc_ia32.c, the x86_64 version
can be found in gnuefi/reloc_x86_64.c and the IA-64 version can be
found in gnuefi/reloc_ia64.S.
 
The self-relocator operates as follows: the startup code invokes it
right after EFI has handed off control to the EFI binary at symbol
"_start".  Upon activation, the self-relocator searches the .dynamic
section (whose starting address is given by symbol _DYNAMIC) for the
dynamic relocation information, which can be found in the DT_REL,
DT_RELSZ, and DT_RELENT entries of the dynamic table (DT_RELA,
DT_RELASZ, and DT_RELAENT in the case of rela relocations, as is the
case for IA-64).  The dynamic relocation information points to the ELF
relocation table.  Once this table is found, the self-relocator walks
through it, applying each relocation one by one.  Since the EFI
binaries are fully resolved shared objects, only a subset of all
possible relocations needs to be supported.  Specifically, on x86 only
the R_386_RELATIVE relocation is needed.  On IA-64, the relocations
R_IA64_DIR64LSB, R_IA64_REL64LSB, and R_IA64_FPTR64LSB are needed.
Note that the R_IA64_FPTR64LSB relocation requires access to the
dynamic symbol table.  This is why the .dynsym section is included in
the EFI binary.  Another complication is that this relocation requires
memory to hold the function descriptors (aka "procedure labels" or
"plabels").  Each function descriptor uses 16 bytes of memory.  The
IA-64 self-relocator currently reserves a static memory area that can
hold 100 of these descriptors.  If the self-relocator runs out of
space, it causes the EFI binary to fail with error code 5
(EFI_BUFFER_TOO_SMALL).  When this happens, the manifest constant
MAX_FUNCTION_DESCRIPTORS in gnuefi/reloc_ia64.S should be increased
and the application recompiled.  An easy way to count the number of
function descriptors required by an EFI application is to run the
command:

  objdump --dynamic-reloc example.so | fgrep FPTR64 | wc -l

assuming "example" is the name of the desired EFI application.
 
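The walk over the relocation table can be sketched as follows for the
x86 case.  This is a simplified, host-side model and not the code in
gnuefi/reloc_ia32.c: it operates on a byte buffer instead of live
memory, it assumes the image was linked at address 0, and the struct
and function names (dyn32, rel32, relocate_image) are invented for the
example.  The numeric constants are the standard ELF values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard ELF dynamic tags and i386 relocation types. */
#define DT_NULL         0
#define DT_REL          17
#define DT_RELSZ        18
#define DT_RELENT       19
#define R_386_NONE      0
#define R_386_RELATIVE  8

/* Minimal stand-ins for Elf32_Dyn and Elf32_Rel (hypothetical names). */
struct dyn32 { int32_t d_tag; uint32_t d_val; };
struct rel32 { uint32_t r_offset; uint32_t r_info; };

/*
 * Apply the relocations described by dyn[] to the image held in
 * image[0..size).  load_addr is the address the image is assumed to
 * have been loaded at.  Returns 0 on success and -1 on a malformed
 * table or an unsupported relocation type.
 */
int relocate_image(uint8_t *image, size_t size, uint32_t load_addr,
                   const struct dyn32 *dyn)
{
    uint32_t rel_off = 0, relsz = 0, relent = 0;
    size_t i;

    assert(image != NULL && dyn != NULL);

    /* Step 1: find the relocation table via the dynamic table. */
    for (i = 0; dyn[i].d_tag != DT_NULL; i++) {
        switch (dyn[i].d_tag) {
        case DT_REL:    rel_off = dyn[i].d_val; break;
        case DT_RELSZ:  relsz   = dyn[i].d_val; break;
        case DT_RELENT: relent  = dyn[i].d_val; break;
        default:        break;
        }
    }
    if (relsz == 0)
        return 0;                       /* nothing to relocate */
    if (relent < sizeof(struct rel32) || rel_off > size ||
        relsz > size - rel_off)
        return -1;

    /* Step 2: walk the table and apply each entry. */
    for (i = 0; i + relent <= relsz; i += relent) {
        struct rel32 rel;
        uint32_t word;

        memcpy(&rel, image + rel_off + i, sizeof rel);
        switch (rel.r_info & 0xff) {    /* relocation type */
        case R_386_NONE:
            break;
        case R_386_RELATIVE:            /* word += load address */
            if (size < sizeof word || rel.r_offset > size - sizeof word)
                return -1;
            memcpy(&word, image + rel.r_offset, sizeof word);
            word += load_addr;
            memcpy(image + rel.r_offset, &word, sizeof word);
            break;
        default:
            return -1;                  /* not needed on x86 */
        }
    }
    return 0;
}
```

The IA-64 relocator follows the same outline but reads the DT_RELA
entries and handles the three R_IA64_* types listed above.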
 
** (3) Creating the Function Descriptor for the IA-64 EFI Binaries

As mentioned above, the IA-64 PE32+ format assumes that the entry
point of the binary is a function descriptor.  A function descriptor
consists of two double words: the first one is the code entry point
and the second is the global pointer that should be loaded before
calling the entry point.  Since the ELF toolchain doesn't know how to
generate a function descriptor for the entry point, the startup code
in gnuefi/crt0-efi-ia64.S crafts one manually with the code:

	        .section .plabel, "a"
	_start_plabel:
	        data8   _start
	        data8   __gp

This places the procedure label for entry point _start in a section
called ".plabel".  Now, the only problem is that _start and __gp need
to be relocated _before_ EFI hands control over to the EFI binary.
Fortunately, PE32+ defines a section called ".reloc" that can achieve
this.  Thus, in addition to manually crafting the function descriptor,
the startup code also crafts a ".reloc" section that will cause
the EFI loader to relocate the function descriptor before handing over
control to the EFI binary (again, see the PECOFF spec mentioned above
for details).
 
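To illustrate what such a hand-crafted ".reloc" entry amounts to, here
is a small model in C.  It is a sketch under assumptions, not a copy
of the startup code: the struct and function names are hypothetical,
the block uses the descriptor's own address as its base, and the
relocation type value 10 (a 64-bit address fixup) comes from the
PE/COFF spec.  The second function mimics what a loader does with the
block when the image ends up "delta" bytes away from its link address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_REL_BASED_DIR64 10    /* PE/COFF: 64-bit address fixup */

/* An IA-64 function descriptor ("plabel"): two 8-byte words. */
struct fdesc {
    uint64_t entry;     /* code address of the function */
    uint64_t gp;        /* global pointer to load before the call */
};

/* One PE base relocation block with two fixup entries. */
struct reloc_block {
    uint32_t page_rva;      /* address the entry offsets are relative to */
    uint32_t block_size;    /* size of the block in bytes, header included */
    uint16_t entry[2];      /* (type << 12) + offset */
};

/* Build a block that fixes up both words of the descriptor at plabel_rva. */
struct reloc_block make_plabel_reloc(uint32_t plabel_rva)
{
    struct reloc_block b;

    assert(sizeof(struct fdesc) == 16);
    b.page_rva   = plabel_rva;
    b.block_size = (uint32_t)sizeof b;      /* 2*4 + 2*2 = 12 bytes */
    b.entry[0] = (uint16_t)((IMAGE_REL_BASED_DIR64 << 12) +
                            offsetof(struct fdesc, entry));
    b.entry[1] = (uint16_t)((IMAGE_REL_BASED_DIR64 << 12) +
                            offsetof(struct fdesc, gp));
    return b;
}

/* Loader side: add delta to every 64-bit word the block points at. */
void apply_reloc_block(uint8_t *image, const struct reloc_block *b,
                       uint64_t delta)
{
    size_t n = (b->block_size - 8) / 2;     /* number of fixup entries */
    size_t i;

    assert(n <= 2);
    for (i = 0; i < n; i++) {
        uint16_t e = b->entry[i];
        uint8_t *p = image + b->page_rva + (e & 0xfff);
        uint64_t v;

        if ((e >> 12) != IMAGE_REL_BASED_DIR64)
            continue;                       /* other types are ignored */
        memcpy(&v, p, sizeof v);
        v += delta;
        memcpy(p, &v, sizeof v);
    }
}
```

After apply_reloc_block() has run, both the entry address and the
global pointer stored in the descriptor reflect the actual load
address, which is the state the IA-64 entry point requires.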
 
A final question may be why .plabel and .reloc need to go in their own
COFF sections.  The answer is simply: we need to be able to discard
the relocation entries that are generated for these sections.  By
placing them in these sections, the relocations end up in sections
".rela.plabel" and ".rela.reloc" which makes it easy to filter them
out in the filter script.  Also, the ".reloc" section needs to be in
its own section so that the objcopy program can recognize it and can
create the correct directory entries in the PE32+ binary.
 
 
** (4) Convenient and Portable Generation of UNICODE String Literals

As of gnu-efi-3.0, we make use of (and somewhat abuse) the gcc option
that forces wide characters (wchar_t) to use short integers (2 bytes)
instead of integers (4 bytes). This way we match the Unicode character
size. By abuse, we mean that we rely on the fact that the regular ASCII
characters are encoded the same way between (short) wide characters
and Unicode and basically only use the first byte. This allows us
to just use them interchangeably.

The gcc option to force short wide characters is -fshort-wchar.
 
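Without that option, a wide string has to be narrowed to 16-bit units
before it can be handed to EFI.  The following sketch in C shows the
idea; wide_to_ucs2() is a made-up name for this example and is not
the library's own conversion routine.  It works whether wchar_t is 2
or 4 bytes wide and rejects characters that do not fit in 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/*
 * Copy the wide string src into dst, which has room for dst_units
 * 16-bit units including the terminator.  Returns the number of
 * units written (not counting the terminator), or (size_t)-1 if the
 * string does not fit or contains a character above 0xffff.
 */
size_t wide_to_ucs2(uint16_t *dst, size_t dst_units, const wchar_t *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (dst_units == 0)
        return (size_t)-1;
    for (i = 0; src[i] != L'\0'; i++) {
        if (i + 1 >= dst_units || (unsigned long)src[i] > 0xffff)
            return (size_t)-1;
        dst[i] = (uint16_t)src[i];
    }
    dst[i] = 0;
    return i;
}
```

When the code is compiled with -fshort-wchar, the copy becomes a
plain unit-for-unit duplicate, which is why wide literals can then be
passed to EFI directly.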
 
			* * * The End * * *