\input texinfo
@c Copyright (C) 1988-2021 Free Software Foundation, Inc.
@setfilename bfdint.info

@settitle BFD Internals
@iftex
@titlepage
@title{BFD Internals}
@author{Ian Lance Taylor}
@author{Cygnus Solutions}
@page
@end iftex

@copying
This file documents the internals of the BFD library.

Copyright @copyright{} 1988-2021 Free Software Foundation, Inc.
Contributed by Cygnus Support.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``GNU General Public License'' and ``Funding
Free Software'', the Front-Cover texts being (a) (see below), and with
the Back-Cover Texts being (b) (see below). A copy of the license is
included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software. Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@node Top
@top BFD Internals
@raisesections
@cindex bfd internals

This document describes some BFD internal information which may be
helpful when working on BFD. It is very incomplete.

This document is not updated regularly, and may be out of date.

The initial version of this document was written by Ian Lance Taylor
@email{ian@@cygnus.com}.

@menu
* BFD overview::                BFD overview
* BFD guidelines::              BFD programming guidelines
* BFD target vector::           BFD target vector
* BFD generated files::         BFD generated files
* BFD multiple compilations::   Files compiled multiple times in BFD
* BFD relocation handling::     BFD relocation handling
* BFD ELF support::             BFD ELF support
* BFD glossary::                Glossary
* Index::                       Index
@end menu

@node BFD overview
@section BFD overview

BFD is a library which provides a single interface to read and write
object files, executables, archive files, and core files in any format.

@menu
* BFD library interfaces::      BFD library interfaces
* BFD library users::           BFD library users
* BFD view::                    The BFD view of a file
* BFD blindness::               BFD loses information
@end menu

@node BFD library interfaces
@subsection BFD library interfaces

One way to look at the BFD library is to divide it into four parts by
type of interface.

The first interface is the set of generic functions which programs using
the BFD library will call. These generic functions normally translate
directly or indirectly into calls to routines which are specific to a
particular object file format. Many of these generic functions are
actually defined as macros in @file{bfd.h}. These functions comprise
the official BFD interface.

The second interface is the set of functions which appear in the target
vectors. This is the bulk of the code in BFD. A target vector is a set
of function pointers specific to a particular object file format. The
target vector is used to implement the generic BFD functions. These
functions are always called through the target vector, and are never
called directly. The target vector is described in detail in @ref{BFD
target vector}. The set of functions which appear in a particular
target vector is often referred to as a BFD backend.
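
The relationship between a generic function and the target vector can be
sketched roughly as follows. This is only an illustration of the
general shape: the names @samp{toy_bfd}, @samp{toy_target}, and
@samp{toy_get_symtab_upper_bound} are hypothetical, and the real
declarations in @file{bfd.h} differ in detail.

@example
/* Hypothetical sketch; these are not the real BFD declarations.  */
struct toy_bfd;

/* A target vector: function pointers for one object file format.  */
struct toy_target
@{
  const char *name;
  long (*get_symtab_upper_bound) (struct toy_bfd *);
@};

/* Each open file points at the target vector for its format.  */
struct toy_bfd
@{
  const struct toy_target *xvec;
@};

/* A generic function, defined as a macro, which calls through
   the target vector rather than calling a backend directly.  */
#define toy_get_symtab_upper_bound(abfd) \
  ((abfd)->xvec->get_symtab_upper_bound (abfd))
@end example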

The third interface is a set of oddball functions which are typically
specific to a particular object file format, are not generic functions,
and are called from outside of the BFD library. These are used as hooks
by the linker and the assembler when a particular object file format
requires some action which the BFD generic interface does not provide.
These functions are typically declared in @file{bfd.h}, but in many
cases they are only provided when BFD is configured with support for a
particular object file format. These functions live in a grey area, and
are not really part of the official BFD interface.

The fourth interface is the set of BFD support functions which are
called by the other BFD functions. These manage issues like memory
allocation, error handling, file access, hash tables, swapping, and the
like. These functions are never called from outside of the BFD library.

@node BFD library users
@subsection BFD library users

Another way to look at the BFD library is to divide it into three parts
by the manner in which it is used.

The first use is to read an object file. The object file readers are
programs like @samp{gdb}, @samp{nm}, @samp{objdump}, and @samp{objcopy}.
These programs use BFD to view an object file in a generic form. The
official BFD interface is normally fully adequate for these programs.

The second use is to write an object file. The object file writers are
programs like @samp{gas} and @samp{objcopy}. These programs use BFD to
create an object file. The official BFD interface is normally adequate
for these programs, but for some object file formats the assembler needs
some additional hooks in order to set particular flags or other
information. The official BFD interface includes functions to copy
private information from one object file to another, and these functions
are used by @samp{objcopy} to avoid information loss.

The third use is to link object files. There is only one object file
linker, @samp{ld}. Originally, @samp{ld} was an object file reader and
an object file writer, and it did the link operation using the generic
BFD structures. However, this turned out to be too slow and too memory
intensive.

The official BFD linker functions were written to permit specific BFD
backends to perform the link without translating through the generic
structures, in the normal case where all the input files and output file
have the same object file format. Not all of the backends currently
implement the new interface, and there are default linking functions
within BFD which use the generic structures and which work with all
backends.

For several object file formats the linker needs additional hooks which
are not provided by the official BFD interface, particularly for dynamic
linking support. These functions are typically called from the linker
emulation template.

@node BFD view
@subsection The BFD view of a file

BFD uses generic structures to manage information. It translates data
into the generic form when reading files, and out of the generic form
when writing files.

BFD describes a file as a pointer to the @samp{bfd} type. A @samp{bfd}
is composed of the following elements. The BFD information can be
displayed using the @samp{objdump} program with various options.

@table @asis
@item general information
The object file format, a few general flags, the start address.
@item architecture
The architecture, including both a general processor type (m68k, MIPS,
etc.) and a specific machine number (m68000, R4000, etc.).
@item sections
A list of sections.
@item symbols
A symbol table.
@end table

BFD represents a section as a pointer to the @samp{asection} type. Each
section has a name and a size.
Most sections also have an associated
block of data, known as the section contents. Sections also have
associated flags, a virtual memory address, a load memory address, a
required alignment, a list of relocations, and other miscellaneous
information.

BFD represents a relocation as a pointer to the @samp{arelent} type. A
relocation describes an action which the linker must take to modify the
section contents. Relocations have a symbol, an address, an addend, and
a pointer to a howto structure which describes how to perform the
relocation. For more information, see @ref{BFD relocation handling}.

BFD represents a symbol as a pointer to the @samp{asymbol} type. A
symbol has a name, a pointer to a section, an offset within that
section, and some flags.

Archive files do not have any sections or symbols. Instead, BFD
represents an archive file as a file which contains a list of
@samp{bfd}s. BFD also provides access to the archive symbol map, as a
list of symbol names. BFD provides a function to return the @samp{bfd}
within the archive which corresponds to a particular entry in the
archive symbol map.

@node BFD blindness
@subsection BFD loses information

Most object file formats have information which BFD can not represent in
its generic form, at least as currently defined.

There is often explicit information which BFD can not represent. For
example, the COFF version stamp, or the ELF program segments. BFD
provides special hooks to handle this information when copying,
printing, or linking an object file. The BFD support for a particular
object file format will normally store this information in private data
and handle it using the special hooks.

In some cases there is also implicit information which BFD can not
represent.
For example, the MIPS processor distinguishes small and
large symbols, and requires that all small symbols be within 32K of the
GP register. This means that the MIPS assembler must be able to mark
variables as either small or large, and the MIPS linker must know to put
small symbols within range of the GP register. Since BFD can not
represent this information, this means that the assembler and linker
must have information that is specific to a particular object file
format which is outside of the BFD library.

This loss of information indicates areas where the BFD paradigm breaks
down. It is not actually possible to represent the myriad differences
among object file formats using a single generic interface, at least not
in the manner in which BFD does it today.

Nevertheless, the BFD library does greatly simplify the task of dealing
with object files, and particular problems caused by information loss
can normally be solved using some sort of relatively constrained hook
into the library.

@node BFD guidelines
@section BFD programming guidelines
@cindex bfd programming guidelines
@cindex programming guidelines for bfd
@cindex guidelines, bfd programming

There is a lot of poorly written and confusing code in BFD. New BFD
code should be written to a higher standard. Merely because some BFD
code is written in a particular manner does not mean that you should
emulate it.

Here are some general BFD programming guidelines:

@itemize @bullet
@item
Follow the GNU coding standards.

@item
Avoid global variables. We ideally want BFD to be fully reentrant, so
that it can be used in multiple threads. All uses of global or static
variables interfere with that. Initialized constant variables are OK,
and they should be explicitly marked with @samp{const}. Instead of global
variables, use data attached to a BFD or to a linker hash table.

@item
All externally visible functions should have names which start with
@samp{bfd_}. All such functions should be declared in some header file,
typically @file{bfd.h}. See, for example, the various declarations near
the end of @file{bfd-in.h}, which mostly declare functions required by
specific linker emulations.

@item
All functions which need to be visible from one file to another within
BFD, but should not be visible outside of BFD, should start with
@samp{_bfd_}. Although external names beginning with @samp{_} are
prohibited by the ANSI standard, in practice this usage will always
work, and it is required by the GNU coding standards.

@item
Always remember that people can compile using @samp{--enable-targets} to
build several, or all, targets at once. It must be possible to link
together the files for all targets.

@item
BFD code should compile with few or no warnings using @samp{gcc -Wall}.
Some warnings are OK, like the absence of certain function declarations
which may or may not be declared in system header files. Warnings about
ambiguous expressions and the like should always be fixed.
@end itemize

@node BFD target vector
@section BFD target vector
@cindex bfd target vector
@cindex target vector in bfd

BFD supports multiple object file formats by using the @dfn{target
vector}. This is simply a set of function pointers which implement
behaviour that is specific to a particular object file format.

In this section I list all of the entries in the target vector and
describe what they do.

@menu
* BFD target vector miscellaneous::     Miscellaneous constants
* BFD target vector swap::              Swapping functions
* BFD target vector format::            Format type dependent functions
* BFD_JUMP_TABLE macros::               BFD_JUMP_TABLE macros
* BFD target vector generic::           Generic functions
* BFD target vector copy::              Copy functions
* BFD target vector core::              Core file support functions
* BFD target vector archive::           Archive functions
* BFD target vector symbols::           Symbol table functions
* BFD target vector relocs::            Relocation support
* BFD target vector write::             Output functions
* BFD target vector link::              Linker functions
* BFD target vector dynamic::           Dynamic linking information functions
@end menu

@node BFD target vector miscellaneous
@subsection Miscellaneous constants

The target vector starts with a set of constants.

@table @samp
@item name
The name of the target vector. This is an arbitrary string. This is
how the target vector is named in command-line options for tools which
use BFD, such as the @samp{--oformat} linker option.

@item flavour
A general description of the type of target. The following flavours are
currently defined:

@table @samp
@item bfd_target_unknown_flavour
Undefined or unknown.
@item bfd_target_aout_flavour
a.out.
@item bfd_target_coff_flavour
COFF.
@item bfd_target_ecoff_flavour
ECOFF.
@item bfd_target_elf_flavour
ELF.
@item bfd_target_tekhex_flavour
Tektronix hex format.
@item bfd_target_srec_flavour
Motorola S-record format.
@item bfd_target_ihex_flavour
Intel hex format.
@item bfd_target_som_flavour
SOM (used on HP/UX).
@item bfd_target_verilog_flavour
Verilog memory hex dump format.
@item bfd_target_os9k_flavour
os9000.
@item bfd_target_versados_flavour
VERSAdos.
@item bfd_target_msdos_flavour
MS-DOS.
@item bfd_target_evax_flavour
openVMS.
@item bfd_target_mmo_flavour
Donald Knuth's MMIXware object format.
@end table

@item byteorder
The byte order of data in the object file. One of
@samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
@samp{BFD_ENDIAN_UNKNOWN}. The last would be used for a format such
as S-records which does not record the architecture of the data.

@item header_byteorder
The byte order of header information in the object file. Normally the
same as the @samp{byteorder} field, but there are certain cases where it
may be different.

@item object_flags
Flags which may appear in the @samp{flags} field of a BFD with this
format.

@item section_flags
Flags which may appear in the @samp{flags} field of a section within a
BFD with this format.

@item symbol_leading_char
A character which the C compiler normally puts before a symbol. For
example, an a.out compiler will typically generate the symbol
@samp{_foo} for a function named @samp{foo} in the C source, in which
case this field would be @samp{_}. If there is no such character, this
field will be @samp{0}.

@item ar_pad_char
The padding character to use at the end of an archive name. Normally
@samp{/}.

@item ar_max_namelen
The maximum length of a short name in an archive. Normally @samp{14}.

@item backend_data
A pointer to constant backend data. This is used by backends to store
whatever additional information they need to distinguish similar target
vectors which use the same sets of functions.
@end table

@node BFD target vector swap
@subsection Swapping functions

Every target vector has function pointers used for swapping information
in and out of the target representation. There are two sets of
functions: one for data information, and one for header information.
Each set has three sizes: 64-bit, 32-bit, and 16-bit.
Each size has
three actual functions: put, get unsigned, and get signed.

These 18 functions are used to convert data between the host and target
representations.

@node BFD target vector format
@subsection Format type dependent functions

Every target vector has three arrays of function pointers which are
indexed by the BFD format type. The BFD format types are as follows:

@table @samp
@item bfd_unknown
Unknown format. Not used for anything useful.
@item bfd_object
Object file.
@item bfd_archive
Archive file.
@item bfd_core
Core file.
@end table

The three arrays of function pointers are as follows:

@table @samp
@item bfd_check_format
Check whether the BFD is of a particular format (object file, archive
file, or core file) corresponding to this target vector. This is called
by the @samp{bfd_check_format} function when examining an existing BFD.
If the BFD matches the desired format, this function will initialize any
format specific information such as the @samp{tdata} field of the BFD.
This function must be called before any other BFD target vector function
on a file opened for reading.

@item bfd_set_format
Set the format of a BFD which was created for output. This is called by
the @samp{bfd_set_format} function after creating the BFD with a
function such as @samp{bfd_openw}. This function will initialize format
specific information required to write out a file of the given format.
This function must be called before any other BFD
target vector function on a file opened for writing.

@item bfd_write_contents
Write out the contents of the BFD in the given format. This is called
by the @samp{bfd_close} function for a BFD opened for writing. This really
should not be an array selected by format type, as the
@samp{bfd_set_format} function provides all the required information.
In fact, BFD will fail if a different format is used when calling
through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
arrays; fortunately, since @samp{bfd_close} gets it right, this is a
difficult error to make.
@end table

@node BFD_JUMP_TABLE macros
@subsection @samp{BFD_JUMP_TABLE} macros
@cindex @samp{BFD_JUMP_TABLE}

Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
These macros take a single argument, which is a prefix applied to a set
of functions. The macros are then used to initialize the fields in the
target vector.

For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
and @samp{_bfd_reloc_type_lookup}. A reference like
@samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc. The
@samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
functions initialize the appropriate fields in the BFD target vector.

This is done because it turns out that many different target vectors can
share certain classes of functions. For example, archives are similar
on most platforms, so most target vectors can use the same archive
functions. Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
with the same argument, calling a set of functions which is defined in
@file{archive.c}.

Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
the description of the function pointers which it defines. The function
pointers will be described using the name without the prefix which the
@samp{BFD_JUMP_TABLE} macro defines. This name is normally the same as
the name of the field in the target vector structure. Any differences
will be noted.
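
The prefix technique can be illustrated with a small sketch. The macro
@samp{TOY_JUMP_TABLE_RELOCS}, the structure @samp{toy_reloc_target}, and
the simplified function signatures are all hypothetical; the real macros
are defined in @file{bfd.h} and may differ in detail. The sketch only
shows how token pasting turns one prefix into a run of initializers.

@example
/* Hypothetical sketch of the technique, not the real macro.  */
#define TOY_JUMP_TABLE_RELOCS(NAME) \
  NAME##_get_reloc_upper_bound, \
  NAME##_canonicalize_reloc, \
  NAME##_bfd_reloc_type_lookup

/* Fields appear in the same order as the macro's expansion.  */
struct toy_reloc_target
@{
  long (*get_reloc_upper_bound) (void);
  long (*canonicalize_reloc) (void);
  const void *(*reloc_type_lookup) (int);
@};

/* Backend functions sharing the prefix foo.  */
static long foo_get_reloc_upper_bound (void) @{ return 0; @}
static long foo_canonicalize_reloc (void) @{ return 0; @}
static const void *foo_bfd_reloc_type_lookup (int code)
@{ (void) code; return 0; @}

/* Expands to the three foo_ functions, in field order.  */
static const struct toy_reloc_target foo_vec =
@{
  TOY_JUMP_TABLE_RELOCS (foo)
@};
@end example

Because the expansion is positional, the order of names in such a macro
has to match the order of the fields it initializes.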

@node BFD target vector generic
@subsection Generic functions
@cindex @samp{BFD_JUMP_TABLE_GENERIC}

The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch-all
functions which don't easily fit into other categories.

@table @samp
@item _close_and_cleanup
Free any target specific information associated with the BFD. This is
called when any BFD is closed (the @samp{bfd_write_contents} function
mentioned earlier is only called for a BFD opened for writing). Most
targets use @samp{bfd_alloc} to allocate all target specific
information, and therefore don't have to do anything in this function.
This function pointer is typically set to
@samp{_bfd_generic_close_and_cleanup}, which simply returns true.

@item _bfd_free_cached_info
Free any cached information associated with the BFD which can be
recreated later if necessary. This is used to reduce the memory
consumption required by programs using BFD. This is normally called via
the @samp{bfd_free_cached_info} macro. It is used by the default
archive routines when computing the archive map. Most targets do not
do anything special for this entry point, and just set it to
@samp{_bfd_generic_free_cached_info}, which simply returns true.

@item _new_section_hook
This is called from @samp{bfd_make_section_anyway} whenever a new
section is created. Most targets use it to initialize section specific
information. This function is called whether or not the section
corresponds to an actual section in an actual BFD.

@item _get_section_contents
Get the contents of a section. This is called from
@samp{bfd_get_section_contents}. Most targets set this to
@samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
based on the section's @samp{filepos} field and a @samp{bfd_bread}. The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents}.

@item _get_section_contents_in_window
Set a @samp{bfd_window} to hold the contents of a section. This is
called from @samp{bfd_get_section_contents_in_window}. The
@samp{bfd_window} idea never really caught on, and I don't think this is
ever called. Pretty much all targets implement this as
@samp{bfd_generic_get_section_contents_in_window}, which uses
@samp{bfd_get_section_contents} to do the right thing. The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents_in_window}.
@end table

@node BFD target vector copy
@subsection Copy functions
@cindex @samp{BFD_JUMP_TABLE_COPY}

The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
called when copying BFDs, and for a couple of functions which deal with
internal BFD information.

@table @samp
@item _bfd_copy_private_bfd_data
This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
If the input and output BFDs have the same format, this will copy any
private information over. This is called after all the section contents
have been written to the output file. Only a few targets do anything in
this function.

@item _bfd_merge_private_bfd_data
This is called when linking, via @samp{bfd_merge_private_bfd_data}. It
gives the backend linker code a chance to set any special flags in the
output file based on the contents of the input file. Only a few targets
do anything in this function.

@item _bfd_copy_private_section_data
This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
for each section, via @samp{bfd_copy_private_section_data}. This
function is called before any section contents have been written. Only
a few targets do anything in this function.

@item _bfd_copy_private_symbol_data
This is called via @samp{bfd_copy_private_symbol_data}, but I don't
think anything actually calls it.
If it were defined, it could be used
to copy private symbol data from one BFD to another. However, most BFDs
store extra symbol information by allocating space which is larger than
the @samp{asymbol} structure and storing private information in the
extra space. Since @samp{objcopy} and other programs copy symbol
information by copying pointers to @samp{asymbol} structures, the
private symbol information is automatically copied as well. Most
targets do not do anything in this function.

@item _bfd_set_private_flags
This is called via @samp{bfd_set_private_flags}. It is basically a hook
for the assembler to set magic information. For example, the PowerPC
ELF assembler uses it to set flags which appear in the e_flags field of
the ELF header. Most targets do not do anything in this function.

@item _bfd_print_private_bfd_data
This is called by @samp{objdump} when the @samp{-p} option is used. It
is called via @samp{bfd_print_private_bfd_data}. It prints any interesting
information about the BFD which can not be otherwise represented by BFD
and thus can not be printed by @samp{objdump}. Most targets do not do
anything in this function.
@end table

@node BFD target vector core
@subsection Core file support functions
@cindex @samp{BFD_JUMP_TABLE_CORE}

The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
with core files. Obviously, these functions only do something
interesting for targets which have core file support.

@table @samp
@item _core_file_failing_command
Given a core file, this returns the command which was run to produce the
core file.

@item _core_file_failing_signal
Given a core file, this returns the signal number which produced the
core file.

@item _core_file_matches_executable_p
Given a core file and a BFD for an executable, this returns whether the
core file was generated by the executable.
@end table

@node BFD target vector archive
@subsection Archive functions
@cindex @samp{BFD_JUMP_TABLE_ARCHIVE}

The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
with archive files. Most targets use COFF style archive files
(including ELF targets), and these use @samp{_bfd_archive_coff} as the
argument to @samp{BFD_JUMP_TABLE_ARCHIVE}. Some targets use BSD/a.out
style archives, and these use @samp{_bfd_archive_bsd}. (The main
difference between BSD and COFF archives is the format of the archive
symbol table). Targets with no archive support use
@samp{_bfd_noarchive}. Finally, a few targets have unusual archive
handling.

@table @samp
@item _slurp_armap
Read in the archive symbol table, storing it in private BFD data. This
is normally called from the archive @samp{check_format} routine. The
corresponding field in the target vector is named
@samp{_bfd_slurp_armap}.

@item _slurp_extended_name_table
Read in the extended name table from the archive, if there is one,
storing it in private BFD data. This is normally called from the
archive @samp{check_format} routine. The corresponding field in the
target vector is named @samp{_bfd_slurp_extended_name_table}.

@item _construct_extended_name_table
Build and return an extended name table if one is needed to write out
the archive. This also adjusts the archive headers to refer to the
extended name table appropriately. This is normally called from the
archive @samp{write_contents} routine. The corresponding field in the
target vector is named @samp{_bfd_construct_extended_name_table}.

@item _truncate_arname
This copies a file name into an archive header, truncating it as
required. It is normally called from the archive @samp{write_contents}
routine.
This function is more interesting in targets which do not
support extended name tables, but I think the GNU @samp{ar} program
always uses extended name tables anyhow. The corresponding field in the
target vector is named @samp{_bfd_truncate_arname}.

@item _write_armap
Write out the archive symbol table using calls to @samp{bfd_bwrite}.
This is normally called from the archive @samp{write_contents} routine.
The corresponding field in the target vector is named @samp{write_armap}
(no leading underscore).

@item _read_ar_hdr
Read and parse an archive header. This handles expanding the archive
header name into the real file name using the extended name table. This
is called by routines which read the archive symbol table or the archive
itself. The corresponding field in the target vector is named
@samp{_bfd_read_ar_hdr_fn}.

@item _openr_next_archived_file
Given an archive and a BFD representing a file stored within the
archive, return a BFD for the next file in the archive. This is called
via @samp{bfd_openr_next_archived_file}. The corresponding field in the
target vector is named @samp{openr_next_archived_file} (no leading
underscore).

@item _get_elt_at_index
Given an archive and an index, return a BFD for the file in the archive
corresponding to that entry in the archive symbol table. This is called
via @samp{bfd_get_elt_at_index}. The corresponding field in the target
vector is named @samp{_bfd_get_elt_at_index}.

@item _generic_stat_arch_elt
Do a stat on an element of an archive, returning information read from
the archive header (modification time, uid, gid, file mode, size). This
is called via @samp{bfd_stat_arch_elt}. The corresponding field in the
target vector is named @samp{_bfd_stat_arch_elt}.

@item _update_armap_timestamp
After the entire contents of an archive have been written out, update
the timestamp of the archive symbol table to be newer than that of the
file. This is required for a.out style archives. This is normally
called by the archive @samp{write_contents} routine. The corresponding
field in the target vector is named @samp{_bfd_update_armap_timestamp}.
@end table

@node BFD target vector symbols
@subsection Symbol table functions
@cindex @samp{BFD_JUMP_TABLE_SYMBOLS}

The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
with symbols.

@table @samp
@item _get_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the symbol table. In practice most targets return the
amount of memory required to hold @samp{asymbol} pointers for all the
symbols plus a trailing @samp{NULL} entry, and store the actual symbol
information in BFD private data. This is called via
@samp{bfd_get_symtab_upper_bound}. The corresponding field in the
target vector is named @samp{_bfd_get_symtab_upper_bound}.

@item _canonicalize_symtab
Read in the symbol table. This is called via
@samp{bfd_canonicalize_symtab}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_symtab}.

@item _make_empty_symbol
Create an empty symbol for the BFD. This is needed because most targets
store extra information with each symbol by allocating a structure
larger than an @samp{asymbol} and storing the extra information at the
end. This function will allocate the right amount of memory, and return
what looks like a pointer to an empty @samp{asymbol}. This is called
via @samp{bfd_make_empty_symbol}. The corresponding field in the target
vector is named @samp{_bfd_make_empty_symbol}.

@item _print_symbol
Print information about the symbol.
This is called via
@samp{bfd_print_symbol}. One of the arguments indicates what sort of
information should be printed:

@table @samp
@item bfd_print_symbol_name
Just print the symbol name.
@item bfd_print_symbol_more
Print the symbol name and some interesting flags. I don't think
anything actually uses this.
@item bfd_print_symbol_all
Print all information about the symbol. This is used by @samp{objdump}
when run with the @samp{-t} option.
@end table
The corresponding field in the target vector is named
@samp{_bfd_print_symbol}.

@item _get_symbol_info
Return a standard set of information about the symbol. This is called
via @samp{bfd_symbol_info}. The corresponding field in the target
vector is named @samp{_bfd_get_symbol_info}.

@item _bfd_is_local_label_name
Return whether the given string would normally represent the name of a
local label. This is called via @samp{bfd_is_local_label} and
@samp{bfd_is_local_label_name}. Local labels are normally discarded by
the assembler. In the linker, this defines the difference between the
@samp{-x} and @samp{-X} options.

@item _get_lineno
Return line number information for a symbol. This is only meaningful
for a COFF target. This is called when writing out COFF line numbers.

@item _find_nearest_line
Given an address within a section, use the debugging information to find
the matching file name, function name, and line number, if any. This is
called via @samp{bfd_find_nearest_line}. The corresponding field in the
target vector is named @samp{_bfd_find_nearest_line}.

@item _bfd_make_debug_symbol
Make a debugging symbol. This is only meaningful for a COFF target,
where it simply returns a symbol which will be placed in the
@samp{N_DEBUG} section when it is written out. This is called via
@samp{bfd_make_debug_symbol}.

@item _read_minisymbols
Minisymbols are used to reduce the memory requirements of programs like
@samp{nm}. A minisymbol is a cookie pointing to internal symbol
information which the caller can use to extract complete symbol
information. This permits BFD to not convert all the symbols into
generic form, but to instead convert them one at a time. This is called
via @samp{bfd_read_minisymbols}. Most targets do not implement this,
and just use generic support which is based on using standard
@samp{asymbol} structures.

@item _minisymbol_to_symbol
Convert a minisymbol to a standard @samp{asymbol}. This is called via
@samp{bfd_minisymbol_to_symbol}.
@end table

@node BFD target vector relocs
@subsection Relocation support
@cindex @samp{BFD_JUMP_TABLE_RELOCS}

The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
with relocations.

@table @samp
@item _get_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the relocations for a section. In practice most
targets return the amount of memory required to hold @samp{arelent}
pointers for all the relocations plus a trailing @samp{NULL} entry, and
store the actual relocation information in BFD private data. This is
called via @samp{bfd_get_reloc_upper_bound}.

@item _canonicalize_reloc
Return the relocation information for a section. This is called via
@samp{bfd_canonicalize_reloc}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_reloc}.

@item _bfd_reloc_type_lookup
Given a relocation code, return the corresponding howto structure
(@pxref{BFD relocation codes}). This is called via
@samp{bfd_reloc_type_lookup}. The corresponding field in the target
vector is named @samp{reloc_type_lookup}.
@end table

@node BFD target vector write
@subsection Output functions
@cindex @samp{BFD_JUMP_TABLE_WRITE}

The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
with writing out a BFD.

@table @samp
@item _set_arch_mach
Set the architecture and machine number for a BFD. This is called via
@samp{bfd_set_arch_mach}. Most targets implement this by calling
@samp{bfd_default_set_arch_mach}. The corresponding field in the target
vector is named @samp{_bfd_set_arch_mach}.

@item _set_section_contents
Write out the contents of a section. This is called via
@samp{bfd_set_section_contents}. The corresponding field in the target
vector is named @samp{_bfd_set_section_contents}.
@end table

@node BFD target vector link
@subsection Linker functions
@cindex @samp{BFD_JUMP_TABLE_LINK}

The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
linker.

@table @samp
@item _sizeof_headers
Return the size of the header information required for a BFD. This is
used to implement the @samp{SIZEOF_HEADERS} linker script function. It
is normally used to align the first section at an efficient position on
the page. This is called via @samp{bfd_sizeof_headers}. The
corresponding field in the target vector is named
@samp{_bfd_sizeof_headers}.

@item _bfd_get_relocated_section_contents
Read the contents of a section and apply the relocation information.
This handles both a final link and a relocatable link; in the latter
case, it adjusts the relocation information as well. This is called via
@samp{bfd_get_relocated_section_contents}. Most targets implement it by
calling @samp{bfd_generic_get_relocated_section_contents}.

@item _bfd_relax_section
Try to use relaxation to shrink the size of a section. This is called
by the linker when the @samp{-relax} option is used. This is called via
@samp{bfd_relax_section}.
Most targets do not support any sort of
relaxation.

@item _bfd_link_hash_table_create
Create the symbol hash table to use for the linker. This linker hook
permits the backend to control the size and information of the elements
in the linker symbol hash table. This is called via
@samp{bfd_link_hash_table_create}.

@item _bfd_link_add_symbols
Given an object file or an archive, add all symbols into the linker
symbol hash table. Use callbacks to the linker to include archive
elements in the link. This is called via @samp{bfd_link_add_symbols}.

@item _bfd_final_link
Finish the linking process. The linker calls this hook after all of the
input files have been read, when it is ready to finish the link and
generate the output file. This is called via @samp{bfd_final_link}.

@item _bfd_link_split_section
I don't know what this is for. Nothing seems to call it. The only
non-trivial definition is in @file{som.c}.
@end table

@node BFD target vector dynamic
@subsection Dynamic linking information functions
@cindex @samp{BFD_JUMP_TABLE_DYNAMIC}

The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
dynamic linking information.

@table @samp
@item _get_dynamic_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic symbol table. In practice most targets
return the amount of memory required to hold @samp{asymbol} pointers for
all the symbols plus a trailing @samp{NULL} entry, and store the actual
symbol information in BFD private data. This is called via
@samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.

@item _canonicalize_dynamic_symtab
Read the dynamic symbol table. This is called via
@samp{bfd_canonicalize_dynamic_symtab}.
The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.

@item _get_dynamic_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic relocations. In practice most targets
return the amount of memory required to hold @samp{arelent} pointers for
all the relocations plus a trailing @samp{NULL} entry, and store the
actual relocation information in BFD private data. This is called via
@samp{bfd_get_dynamic_reloc_upper_bound}. The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.

@item _canonicalize_dynamic_reloc
Read the dynamic relocations. This is called via
@samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
@end table

@node BFD generated files
@section BFD generated files
@cindex generated files in bfd
@cindex bfd generated files

BFD contains several automatically generated files. This section
describes them. Some files are created at configure time, when you
configure BFD. Some files are created at make time, when you build
BFD. Some files are automatically rebuilt at make time, but only if
you configure with the @samp{--enable-maintainer-mode} option. Some
files live in the object directory---the directory from which you run
configure---and some live in the source directory. All files that live
in the source directory are checked into the git repository.

@table @file
@item bfd.h
@cindex @file{bfd.h}
@cindex @file{bfd-in3.h}
Lives in the object directory. Created at make time from
@file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
configure time from @file{bfd-in2.h}.
There are automatic dependencies
to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
changes, so you can normally ignore @file{bfd-in3.h}, and just think
about @file{bfd-in2.h} and @file{bfd.h}.

@file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
control whether BFD is built for a 32 bit target or a 64 bit target.

@item bfd-in2.h
@cindex @file{bfd-in2.h}
Lives in the source directory. Created from @file{bfd-in.h} and several
other BFD source files. If you configure with the
@samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
automatically when a source file changes.

@item elf32-target.h
@itemx elf64-target.h
@cindex @file{elf32-target.h}
@cindex @file{elf64-target.h}
Live in the object directory. Created from @file{elfxx-target.h}.
These files are versions of @file{elfxx-target.h} customized for either
a 32 bit ELF target or a 64 bit ELF target.

@item libbfd.h
@cindex @file{libbfd.h}
Lives in the source directory. Created from @file{libbfd-in.h} and
several other BFD source files. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
automatically when a source file changes.

@item libcoff.h
@cindex @file{libcoff.h}
Lives in the source directory. Created from @file{libcoff-in.h} and
@file{coffcode.h}. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
automatically when a source file changes.

@item targmatch.h
@cindex @file{targmatch.h}
Lives in the object directory. Created at make time from
@file{config.bfd}. This file is used to map configuration triplets into
BFD target vector variable names at run time.
@end table

@node BFD multiple compilations
@section Files compiled multiple times in BFD
Several files in BFD are compiled multiple times. By this I mean that
there are header files which contain function definitions. These header
files are included by other files, and thus the functions are compiled
once per file which includes them.

Preprocessor macros are used to control the compilation, so that each
time the files are compiled the resulting functions are slightly
different. Naturally, if they weren't different, there would be no
reason to compile them multiple times.

This is not a particularly good programming technique, and future BFD
work should avoid it.

@itemize @bullet
@item
Since this technique is rarely used, even experienced C programmers find
it confusing.

@item
It is difficult to debug programs which use BFD, since there is no way
to describe which version of a particular function you are looking at.

@item
Programs which use BFD wind up incorporating two or more slightly
different versions of the same function, which wastes space in the
executable.

@item
This technique is never required nor is it especially efficient. It is
always possible to use statically initialized structures holding
function pointers and magic constants instead.
@end itemize

The following is a list of the files which are compiled multiple times.

@table @file
@item aout-target.h
@cindex @file{aout-target.h}
Describes a few functions and the target vector for a.out targets. This
is used by individual a.out targets with different definitions of
@samp{N_TXTADDR} and similar a.out macros.

@item aoutf1.h
@cindex @file{aoutf1.h}
Implements standard SunOS a.out files.
In principle it supports 64 bit
a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
since all known a.out targets are 32 bits, this code may or may not
work. This file is only included by a few other files, and it is
difficult to justify its existence.

@item aoutx.h
@cindex @file{aoutx.h}
Implements basic a.out support routines. This file can be compiled for
either 32 or 64 bit support. Since all known a.out targets are 32 bits,
the 64 bit support may or may not work. I believe the original
intention was that this file would only be included by @samp{aout32.c}
and @samp{aout64.c}, and that other a.out targets would simply refer to
the functions it defined. Unfortunately, some other a.out targets
started including it directly, leading to a somewhat confused state of
affairs.

@item coffcode.h
@cindex @file{coffcode.h}
Implements basic COFF support routines. This file is included by every
COFF target. It implements code which handles COFF magic numbers as
well as various hook functions called by the generic COFF functions in
@file{coffgen.c}. This file is controlled by a number of different
macros, and more are added regularly.

@item coffswap.h
@cindex @file{coffswap.h}
Implements COFF swapping routines. This file is included by
@file{coffcode.h}, and thus by every COFF target. It implements the
routines which swap COFF structures between internal and external
format. The main control for this file is the external structure
definitions in the files in the @file{include/coff} directory. A COFF
target file will include one of those files before including
@file{coffcode.h} and thus @file{coffswap.h}. There are a few other
macros which affect @file{coffswap.h} as well, mostly describing whether
certain fields are present in the external structures.

@item ecoffswap.h
@cindex @file{ecoffswap.h}
Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
for ECOFF. It is included by the ECOFF target files (of which there are
only two). The control is the preprocessor macro @samp{ECOFF_32} or
@samp{ECOFF_64}.

@item elfcode.h
@cindex @file{elfcode.h}
Implements ELF functions that use external structure definitions. This
file is included by two other files: @file{elf32.c} and @file{elf64.c}.
It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
@samp{32} or @samp{64} before including it. The @samp{NAME} macro is
used internally to give the functions different names for the two target
sizes.

@item elfcore.h
@cindex @file{elfcore.h}
Like @file{elfcode.h}, but for functions that are specific to ELF core
files. This is included only by @file{elfcode.h}.

@item elfxx-target.h
@cindex @file{elfxx-target.h}
This file is the source for the generated files @file{elf32-target.h}
and @file{elf64-target.h}, one of which is included by every ELF target.
It defines the ELF target vector.

@item netbsd.h
@cindex @file{netbsd.h}
Used by all NetBSD a.out targets. Several other files include it.

@item peicode.h
@cindex @file{peicode.h}
Provides swapping routines and other hooks for PE targets.
@file{coffcode.h} will include this rather than @file{coffswap.h} for a
PE target. This defines PE specific versions of the COFF swapping
routines, and also defines some macros which control @file{coffcode.h}
itself.
@end table

@node BFD relocation handling
@section BFD relocation handling
@cindex bfd relocation handling
@cindex relocations in bfd

The handling of relocations is one of the more confusing aspects of BFD.
Relocation handling has been implemented in various different ways, all
somewhat incompatible, none perfect.

@menu
* BFD relocation concepts::     BFD relocation concepts
* BFD relocation functions::    BFD relocation functions
* BFD relocation codes::        BFD relocation codes
* BFD relocation future::       BFD relocation future
@end menu

@node BFD relocation concepts
@subsection BFD relocation concepts

A relocation is an action which the linker must take when linking. It
describes a change to the contents of a section. The change is normally
based on the final value of one or more symbols. Relocations are
created by the assembler when it creates an object file.

Most relocations are simple. A typical simple relocation is to set 32
bits at a given offset in a section to the value of a symbol. This type
of relocation would be generated for code like @code{int *p = &i;} where
@samp{p} and @samp{i} are global variables. A relocation for the symbol
@samp{i} would be generated such that the linker would initialize the
area of memory which holds the value of @samp{p} to the value of the
symbol @samp{i}.

Slightly more complex relocations may include an addend, which is a
constant to add to the symbol value before using it. In some cases a
relocation will require adding the symbol value to the existing contents
of the section in the object file. In others the relocation will simply
replace the contents of the section with the symbol value. Some
relocations are PC relative, so that the value to be stored in the
section is the difference between the value of a symbol and the final
address of the section contents.

In general, relocations can be arbitrarily complex.
For example,
relocations used in dynamic linking systems often require the linker to
allocate space in a different section and use the offset within that
section as the value to store.

When doing a relocatable link, the linker may or may not have to do
anything with a relocation, depending upon the definition of the
relocation. Simple relocations generally do not require any special
action.

@node BFD relocation functions
@subsection BFD relocation functions

In BFD, each section has an array of @samp{arelent} structures. Each
structure has a pointer to a symbol, an address within the section, an
addend, and a pointer to a @samp{reloc_howto_struct} structure. The
howto structure has a bunch of fields describing the reloc, including a
type field. The type field is specific to the object file format
backend; none of the generic code in BFD examines it.

Originally, the function @samp{bfd_perform_relocation} was supposed to
handle all relocations. In theory, many relocations would be simple
enough to be described by the fields in the howto structure. For those
that weren't, the howto structure included a @samp{special_function}
field to use as an escape.

While this seems plausible, a look at @samp{bfd_perform_relocation}
shows that it failed. The function has odd special cases. Some of the
fields in the howto structure, such as @samp{pcrel_offset}, were not
adequately documented.

The linker uses @samp{bfd_perform_relocation} to do all relocations when
the input and output file have different formats (e.g., when generating
S-records).
The generic linker code, which is used by all targets which
do not define their own special purpose linker, uses
@samp{bfd_get_relocated_section_contents}, which for most targets turns
into a call to @samp{bfd_generic_get_relocated_section_contents}, which
calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
is still widely used, which makes it difficult to change, since it is
difficult to test all possible cases.

The assembler used @samp{bfd_perform_relocation} for a while. This
turned out to be the wrong thing to do, since
@samp{bfd_perform_relocation} was written to handle relocations on an
existing object file, while the assembler needed to create relocations
in a new object file. The assembler was changed to use the new function
@samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
was created as a copy of @samp{bfd_perform_relocation}.

Unfortunately, the work did not progress any further, so
@samp{bfd_install_relocation} remains a simple copy of
@samp{bfd_perform_relocation}, with all the odd special cases and
confusing code. This again is difficult to change, because again any
change can affect any assembler target, and so is difficult to test.

The new linker, when using the same object file format for all input
files and the output file, does not convert relocations into
@samp{arelent} structures, so it cannot use
@samp{bfd_perform_relocation} at all. Instead, users of the new linker
are expected to write a @samp{relocate_section} function which will
handle relocations in a target specific fashion.

There are two helper functions for target specific relocation:
@samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
These functions use a howto structure, but they @emph{do not} use the
@samp{special_function} field.
Since the functions are normally called
from target specific code, the @samp{special_function} field adds
little; any relocations which require special handling can be handled
without calling those functions.

So, if you want to add a new target, or add a new relocation to an
existing target, you need to do the following:

@itemize @bullet
@item
Make sure you clearly understand what the contents of the section should
look like after assembly, after a relocatable link, and after a final
link. Make sure you clearly understand the operations the linker must
perform during a relocatable link and during a final link.

@item
Write a howto structure for the relocation. The howto structure is
flexible enough to represent any relocation which should be handled by
setting a contiguous bitfield in the destination to the value of a
symbol, possibly with an addend, possibly adding the symbol value to the
value already present in the destination.

@item
Change the assembler to generate your relocation. The assembler will
call @samp{bfd_install_relocation}, so your howto structure has to be
able to handle that. You may need to set the @samp{special_function}
field to handle assembly correctly. Be careful to ensure that any code
you write to handle the assembler will also work correctly when doing a
relocatable link. For example, see @samp{bfd_elf_generic_reloc}.

@item
Test the assembler. Consider the cases of relocation against an
undefined symbol, a common symbol, a symbol defined in the object file
in the same section, and a symbol defined in the object file in a
different section. These cases may not all be applicable for your
reloc.

@item
If your target uses the new linker, which is recommended, add any
required handling to the target specific relocation function.
In simple
cases this will just involve a call to @samp{_bfd_final_link_relocate}
or @samp{_bfd_relocate_contents}, depending upon the definition of the
relocation and whether the link is relocatable or not.

@item
Test the linker. Test the case of a final link. If the relocation can
overflow, use a linker script to force an overflow and make sure the
error is reported correctly. Test a relocatable link, whether the
symbol is defined or undefined in the relocatable output. For both the
final and relocatable link, test the case when the symbol is a common
symbol, when the symbol looked like a common symbol but became a defined
symbol, when the symbol is defined in a different object file, and when
the symbol is defined in the same object file.

@item
In order for linking to another object file format, such as S-records,
to work correctly, @samp{bfd_perform_relocation} has to do the right
thing for the relocation. You may need to set the
@samp{special_function} field to handle this correctly. Test this by
doing a link in which the output object file format is S-records.

@item
Using the linker to generate relocatable output in a different object
file format is impossible in the general case, so you generally don't
have to worry about that. The GNU linker makes sure to stop that from
happening when an input file in a different format has relocations.

Linking input files of different object file formats together is quite
unusual, but if you're really dedicated you may want to consider testing
this case, both when the output object file format is the same as your
format, and when it is different.
@end itemize

@node BFD relocation codes
@subsection BFD relocation codes

BFD has another way of describing relocations besides the howto
structures described above: the enum @samp{bfd_reloc_code_real_type}.
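
As a rough illustration of how a relocation code relates to a howto
structure, the following C sketch maps a small, format independent
enumeration onto the howto table of one imaginary backend. Every name
in it (@samp{demo_reloc_code}, @samp{demo_howto},
@samp{demo_reloc_type_lookup}) is hypothetical and heavily simplified;
it is not the BFD definition of either type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a format independent relocation code.  */
enum demo_reloc_code
{
  DEMO_RELOC_32,
  DEMO_RELOC_32_PCREL,
  DEMO_RELOC_16
};

/* Hypothetical, heavily reduced stand-in for a howto structure.  */
struct demo_howto
{
  unsigned int type;     /* Backend specific relocation number.  */
  unsigned int bitsize;  /* Width of the field to be patched.  */
  int pc_relative;       /* Nonzero for PC relative relocations.  */
  const char *name;
};

/* The howto table of one imaginary backend.  */
static const struct demo_howto demo_howto_table[] =
{
  { 1, 32, 0, "R_DEMO_32" },
  { 2, 32, 1, "R_DEMO_PC32" }
};

#define DEMO_HOWTO_COUNT \
  (sizeof demo_howto_table / sizeof demo_howto_table[0])

/* Which generic code selects which entry of the backend table.  */
static const struct
{
  enum demo_reloc_code code;
  size_t howto_index;
} demo_reloc_map[] =
{
  { DEMO_RELOC_32, 0 },
  { DEMO_RELOC_32_PCREL, 1 }
};

/* Return the howto for CODE, or NULL if this backend has none.  */
const struct demo_howto *
demo_reloc_type_lookup (enum demo_reloc_code code)
{
  size_t i;

  for (i = 0; i < sizeof demo_reloc_map / sizeof demo_reloc_map[0]; i++)
    if (demo_reloc_map[i].code == code)
      {
        assert (demo_reloc_map[i].howto_index < DEMO_HOWTO_COUNT);
        return &demo_howto_table[demo_reloc_map[i].howto_index];
      }
  return NULL;
}
```

In this sketch a caller holds only the generic code and asks the backend
for the matching howto; a @samp{NULL} result stands for a relocation
which the imaginary backend does not provide.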

Every known relocation type can be described as a value in this
enumeration. The enumeration contains many target specific relocations,
but where two or more targets have the same relocation, a single code is
used. For example, the single value @samp{BFD_RELOC_32} is used for all
simple 32 bit relocation types.

The main purpose of this relocation code is to give the assembler some
mechanism to create @samp{arelent} structures. In order for the
assembler to create an @samp{arelent} structure, it has to be able to
obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
which simply calls the target vector entry point
@samp{reloc_type_lookup}, takes a relocation code and returns a howto
structure.

The function @samp{bfd_get_reloc_code_name} returns the name of a
relocation code. This is mainly used in error messages.

Using both howto structures and relocation codes can be somewhat
confusing. There are many processor specific relocation codes.
However, the relocation is only fully defined by the howto structure.
The same relocation code will map to different howto structures in
different object file formats. For example, the addend handling may be
different.

Most of the relocation codes are not really general. The assembler
cannot use them without already understanding what sorts of relocations can
be used for a particular target. It might be possible to replace the
relocation codes with something simpler.

@node BFD relocation future
@subsection BFD relocation future

Clearly the current BFD relocation support is in bad shape. A
wholesale rewrite would be very difficult, because it would require
thorough testing of every BFD target. So some sort of incremental
change is required.

My vague thoughts on this would involve defining a new, clearly defined,
howto structure.
Some mechanism would be used to determine which type
of howto structure was being used by a particular format.

The new howto structure would clearly define the relocation behaviour in
the case of an assembly, a relocatable link, and a final link. At
least one special function would be defined as an escape, and it might
make sense to define more.

One or more generic functions similar to @samp{bfd_perform_relocation}
would be written to handle the new howto structure.

This should make it possible to write a generic version of the relocate
section functions used by the new linker. The target specific code
would provide some mechanism (a function pointer or an initial
conversion) to convert target specific relocations into howto
structures.

Ideally it would be possible to use this generic relocate section
function for the generic linker as well. That is, it would replace the
@samp{bfd_generic_get_relocated_section_contents} function which is
currently normally used.

For the special case of ELF dynamic linking, more consideration needs to
be given to writing ELF specific but ELF target generic code to handle
special relocation types such as GOT and PLT.

@node BFD ELF support
@section BFD ELF support
@cindex elf support in bfd
@cindex bfd elf support

The ELF object file format is defined in two parts: a generic ABI and a
processor specific supplement. The ELF support in BFD is split in a
similar fashion. The processor specific support is largely kept within
a single file. The generic support is provided by several other files.
The processor specific support provides a set of function pointers and
constants used by the generic support.
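
The split described above can be pictured as a structure of constants
and function pointers which the processor specific file fills in and the
generic code consumes. The C sketch below is only an illustration under
that assumption: @samp{demo_elf_backend} and every other name and value
in it are hypothetical, and the data BFD really uses is considerably
richer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one processor, as seen by generic code.  */
struct demo_elf_backend
{
  unsigned int machine_code;    /* Constant: made-up machine number.  */
  unsigned long max_page_size;  /* Constant: made-up page size.  */
  /* Hook: compute the value to store for one relocation.  */
  unsigned long (*relocate_word) (unsigned long symbol_value,
                                  unsigned long addend,
                                  unsigned long place);
};

/* Processor specific part A: plain absolute relocations.  */
static unsigned long
demo_abs_relocate (unsigned long symbol_value, unsigned long addend,
                   unsigned long place)
{
  (void) place;  /* An absolute relocation ignores the patched address.  */
  return symbol_value + addend;
}

/* Processor specific part B: PC relative relocations.  */
static unsigned long
demo_pcrel_relocate (unsigned long symbol_value, unsigned long addend,
                     unsigned long place)
{
  return symbol_value + addend - place;
}

const struct demo_elf_backend demo_cpu_a = { 1, 0x1000, demo_abs_relocate };
const struct demo_elf_backend demo_cpu_b = { 2, 0x10000, demo_pcrel_relocate };

/* Generic part: written once, works for any backend description.  */
unsigned long
demo_generic_relocate (const struct demo_elf_backend *backend,
                       unsigned long symbol_value,
                       unsigned long addend,
                       unsigned long place)
{
  assert (backend != NULL && backend->relocate_word != NULL);
  return backend->relocate_word (symbol_value, addend, place);
}

/* Generic part: offset of VMA within a page of the backend's size.  */
unsigned long
demo_generic_page_offset (const struct demo_elf_backend *backend,
                          unsigned long vma)
{
  assert (backend != NULL && backend->max_page_size != 0);
  return vma % backend->max_page_size;
}
```

The point of the arrangement is that @samp{demo_generic_relocate} never
needs to know which processor it is working for; supporting another
processor in this sketch means supplying one more initialized structure.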

@menu
* BFD ELF sections and segments::       ELF sections and segments
* BFD ELF generic support::             BFD ELF generic support
* BFD ELF processor specific support::  BFD ELF processor specific support
* BFD ELF core files::                  BFD ELF core files
* BFD ELF future::                      BFD ELF future
@end menu

@node BFD ELF sections and segments
@subsection ELF sections and segments

The ELF ABI permits a file to have either sections or segments or both.
Relocatable object files conventionally have only sections.
Executables conventionally have both. Core files conventionally have
only program segments.

ELF sections are similar to sections in other object file formats: they
have a name, a VMA, file contents, flags, and other miscellaneous
information. ELF relocations are stored in sections of a particular
type; BFD automatically converts these sections into internal relocation
information.

ELF program segments are intended for fast interpretation by a system
loader. They have a type, a VMA, an LMA, file contents, and a couple of
other fields. When an ELF executable is run on a Unix system, the
system loader will examine the program segments to decide how to load
it. The loader will ignore the section information. Loadable program
segments (type @samp{PT_LOAD}) are directly loaded into memory. Other
program segments are interpreted by the loader, and generally provide
dynamic linking information.

When an ELF file has both program segments and sections, an ELF program
segment may encompass one or more ELF sections, in the sense that the
portion of the file which corresponds to the program segment may include
the portions of the file corresponding to one or more sections.
When
there is more than one section in a loadable program segment, the
relative positions of the section contents in the file must correspond
to the relative positions they should hold when the program segment is
loaded. This requirement should be obvious if you consider that the
system loader will load an entire program segment at a time.

On a system which supports dynamic paging, such as any native Unix
system, the contents of a loadable program segment must be at the same
offset in the file as in memory, modulo the memory page size used on the
system. This is because the system loader will map the file into memory
starting at the start of a page. The system loader can easily remap
entire pages to the correct load address. However, if the contents of
the file were not correctly aligned within the page, the system loader
would have to shift the contents around within the page, which is too
expensive. For example, if the LMA of a loadable program segment is
@samp{0x40080} and the page size is @samp{0x1000}, then the position of
the segment contents within the file must equal @samp{0x80} modulo
@samp{0x1000}.

BFD has only a single set of sections. It does not provide any generic
way to examine both sections and segments. When BFD is used to open an
object file or executable, the BFD sections will represent ELF sections.
When BFD is used to open a core file, the BFD sections will represent
ELF program segments.

When BFD is used to examine an object file or executable, any program
segments will be read to set the LMA of the sections. This is because
ELF sections only have a VMA, while ELF program segments have both a VMA
and an LMA. Any program segments will be copied by the
@samp{copy_private} entry points. They will be printed by the
@samp{print_private} entry point. Otherwise, the program segments are
ignored.
In particular, programs which use BFD currently have no direct
access to the program segments.

When BFD is used to create an executable, the program segments will be
created automatically based on the section information. This is done in
the function @samp{assign_file_positions_for_segments} in @file{elf.c}.
This function has been tweaked many times, and probably still has
problems that arise in particular cases.

There is a hook which may be used to explicitly define the program
segments when creating an executable: the @samp{bfd_record_phdr}
function in @file{bfd.c}. If this function is called, BFD will not
create program segments itself, but will only create the program
segments specified by the caller. The linker uses this function to
implement the @samp{PHDRS} linker script command.

@node BFD ELF generic support
@subsection BFD ELF generic support

In general, functions which do not read external data from the ELF file
are found in @file{elf.c}. They operate on the internal forms of the
ELF structures, which are defined in @file{include/elf/internal.h}. The
internal structures are defined in terms of @samp{bfd_vma}, and so may
be used for both 32 bit and 64 bit ELF targets.

The file @file{elfcode.h} contains functions which operate on the
external data. @file{elfcode.h} is compiled twice, once via
@file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
@file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
@file{elfcode.h} includes functions to swap the ELF structures in and
out of external form, as well as a few more complex functions.

Linker support is found in @file{elflink.c}. The
linker support is only used if the processor specific file defines
@samp{elf_backend_relocate_section}, which is required to relocate the
section contents.
If that macro is not defined, the generic linker code
is used, and relocations are handled via @samp{bfd_perform_relocation}.

The core file support is in @file{elfcore.h}, which is compiled twice,
for both 32 and 64 bit support. The more interesting cases of core file
support only work on a native system which has the @file{sys/procfs.h}
header file. Without that file, the core file support does little more
than read the ELF program segments as BFD sections.

The BFD internal header file @file{elf-bfd.h} is used for communication
among these files and the processor specific files.

The default entries for the BFD ELF target vector are found mainly in
@file{elf.c}. Some functions are found in @file{elfcode.h}.

The processor specific files may override particular entries in the
target vector, but most do not, with one exception: the
@samp{bfd_reloc_type_lookup} entry point is always processor specific.

@node BFD ELF processor specific support
@subsection BFD ELF processor specific support

By convention, the processor specific support for a particular processor
will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
either 32 or 64, and @var{cpu} is the name of the processor.

@menu
* BFD ELF processor required::  Required processor specific support
* BFD ELF processor linker::    Processor specific linker support
* BFD ELF processor other::     Other processor specific support options
@end menu

@node BFD ELF processor required
@subsubsection Required processor specific support

When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
following:

@itemize @bullet
@item
Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
both, to a unique C name to use for the target vector.
This name should
appear in the list of target vectors in @file{targets.c}, and will also
have to appear in @file{config.bfd} and @file{configure.ac}. Define
@samp{TARGET_BIG_SYM} for a big-endian processor,
@samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
for a bi-endian processor.
@item
Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
both, to a string used as the name of the target vector. This is the
name which a user of the BFD tool would use to specify the object file
format. It would normally appear in a linker emulation parameters
file.
@item
Define @samp{ELF_ARCH} to the BFD architecture (an element of the
@samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
@item
Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
in the @samp{e_machine} field of the ELF header. As of this writing,
these magic numbers are assigned by Caldera; if you want to get a magic
number for a particular processor, try sending a note to
@email{registry@@caldera.com}. In the BFD sources, the magic numbers are
found in @file{include/elf/common.h}; they have names beginning with
@samp{EM_}.
@item
Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
memory. This can normally be found at the start of chapter 5 in the
processor specific supplement. For a processor which will only be used
in an embedded system, or which has no memory management hardware, this
can simply be @samp{1}.
@item
If the format should use @samp{Rel} rather than @samp{Rela} relocations,
define @samp{USE_REL}. This is normally defined in chapter 4 of the
processor specific supplement.

In the absence of a supplement, it's easier to work with @samp{Rela}
relocations.
@samp{Rela} relocations will require more space in object
files (but not in executables, except when using dynamic linking).
However, this is outweighed by the simplicity of addend handling when
using @samp{Rela} relocations. With @samp{Rel} relocations, the addend
must be stored in the section contents, which makes relocatable links
more complex.

For example, consider C code like @code{i = a[1000];} where @samp{a} is
a global array. The instructions which load the value of @samp{a[1000]}
will most likely use a relocation which refers to the symbol
representing @samp{a}, with an addend that gives the offset from the
start of @samp{a} to element @samp{1000}. When using @samp{Rel}
relocations, that addend must be stored in the instructions themselves.
If you are adding support for a RISC chip which uses two or more
instructions to load an address, then the addend may not fit in a single
instruction, and will have to be somehow split among the instructions.
This makes linking awkward, particularly when doing a relocatable link
in which the addend may have to be updated. It can be done (the MIPS
ELF support does it), but it should be avoided when possible.

It is possible, though somewhat awkward, to support both @samp{Rel} and
@samp{Rela} relocations for a single target; @file{elf64-mips.c} does it
by overriding the relocation reading and writing routines.
@item
Define howto structures for all the relocation types.
@item
Define a @samp{bfd_reloc_type_lookup} routine. This must be named
@samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
function or a macro. It must translate a BFD relocation code into a
howto structure. This is normally a table lookup or a simple switch.
@item
If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
Either way, this is a macro defined as the name of a function which
takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
sets the @samp{howto} field of the @samp{arelent} based on the
@samp{Rel} or @samp{Rela} structure. This normally uses
@samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
an index into a table of howto structures.
@end itemize

You must also add the magic number for this processor to the
@samp{prep_headers} function in @file{elf.c}.

You must also create a header file in the @file{include/elf} directory
called @file{@var{cpu}.h}. This file should define any target specific
information which may be needed outside of the BFD code. In particular
it should use the @samp{START_RELOC_NUMBERS}, @samp{RELOC_NUMBER},
@samp{FAKE_RELOC}, @samp{EMPTY_RELOC} and @samp{END_RELOC_NUMBERS}
macros to create a table mapping the number used to identify a
relocation to a name describing that relocation.

Although it is not a BFD component, you will probably also want the
binutils program @samp{readelf} to parse your ELF objects. For this,
you need to add code for @code{EM_@var{cpu}} as appropriate in
@file{binutils/readelf.c}.

@node BFD ELF processor linker
@subsubsection Processor specific linker support

The linker will be much more efficient if you define a relocate section
function. This will permit BFD to use the ELF specific linker support.

If you do not define a relocate section function, BFD must use the
generic linker support, which requires converting all symbols and
relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
this case, relocations will be handled by calling
@samp{bfd_perform_relocation}, which will use the howto structures you
have defined. @xref{BFD relocation handling}.
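
The howto structures on which this path relies are normally located by
indexing a table with the ELF relocation type, as described for
@samp{elf_info_to_howto} in @ref{BFD ELF processor required}. The
following self-contained sketch shows the shape of such a lookup. The
@samp{sketch_} types and the @samp{R_FOO_} relocation names are
simplified stand-ins invented for this example; they are not the real
BFD declarations.

@example
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins invented for this sketch; these are not the
   real BFD declarations.  */
struct sketch_howto @{ unsigned int type; const char *name; @};
struct sketch_arelent @{ const struct sketch_howto *howto; @};
struct sketch_rela
@{
  uint32_t r_offset;
  uint32_t r_info;
  int32_t r_addend;
@};

/* 32 bit ELF keeps the relocation type in the low 8 bits of r_info.  */
#define SKETCH_ELF32_R_TYPE(info) ((info) & 0xff)

/* Hypothetical relocation types, indexed by relocation number.  */
static const struct sketch_howto sketch_howto_table[] =
@{
  @{ 0, "R_FOO_NONE" @},
  @{ 1, "R_FOO_32" @},
  @{ 2, "R_FOO_PCREL32" @},
@};

/* Set the howto field from the relocation type.  Return 1 on success,
   0 if the type is outside the table.  */
static int
sketch_info_to_howto (struct sketch_arelent *rel,
                      const struct sketch_rela *dst)
@{
  unsigned int type = SKETCH_ELF32_R_TYPE (dst->r_info);
  size_t count = sizeof sketch_howto_table / sizeof sketch_howto_table[0];

  if (type >= count)
    return 0;
  rel->howto = &sketch_howto_table[type];
  return 1;
@}
@end example

A real backend would index a table of @samp{reloc_howto_type} entries
in much the same way, using @samp{ELF@var{nn}_R_TYPE} to extract the
relocation type.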

In order to support linking into a different object file format, such as
S-records, @samp{bfd_perform_relocation} must work correctly with your
howto structures, so you can't skip that step. However, if you define
the relocate section function, then in the normal case of linking into
an ELF file the linker will not need to convert symbols and relocations,
and will be much more efficient.

To use a relocate section function, define the macro
@samp{elf_backend_relocate_section} as the name of a function which will
take the contents of a section, as well as relocation, symbol, and other
information, and modify the section contents according to the relocation
information. In simple cases, this is little more than a loop over the
relocations which computes the value of each relocation and calls
@samp{_bfd_final_link_relocate}. The function must check for a
relocatable link, and in that case normally needs to do nothing other
than adjust the addend for relocations against a section symbol.

The complex cases generally have to do with dynamic linker support. GOT
and PLT relocations must be handled specially, and the linker normally
arranges to set up the GOT and PLT sections while handling relocations.
When generating a shared library, random relocations must normally be
copied into the shared library, or converted to RELATIVE relocations
when possible.

@node BFD ELF processor other
@subsubsection Other processor specific support options

There are many other macros which may be defined in
@file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
@file{elfxx-target.h}.

Macros may be used to override some of the generic ELF target vector
functions.

There are several processor specific hook functions which may be
defined as macros.
These functions are found as function pointers in the
@samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
general, a hook function is set by defining a macro
@samp{elf_backend_@var{name}}.

There are a few processor specific constants which may also be defined.
These are again found in the @samp{elf_backend_data} structure.

I will not define the various functions and constants here; see the
comments in @file{elf-bfd.h}.

Normally any odd characteristic of a particular ELF processor is handled
via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
section number found in MIPS ELF is handled via the hooks
@samp{section_from_bfd_section}, @samp{symbol_processing},
@samp{add_symbol_hook}, and @samp{output_symbol_hook}.

Dynamic linking support, which involves processor specific relocations
requiring special handling, is also implemented via hook functions.

@node BFD ELF core files
@subsection BFD ELF core files
@cindex elf core files

On native ELF Unix systems, core files are generated without any
sections. Instead, they only have program segments.

When BFD is used to read an ELF core file, the BFD sections will
actually represent program segments. Since ELF program segments do not
have names, BFD will invent names like @samp{segment@var{n}} where
@var{n} is a number.

A single ELF program segment may include both an initialized part and an
uninitialized part. The size of the initialized part is given by the
@samp{p_filesz} field. The total size of the segment is given by the
@samp{p_memsz} field. If @samp{p_memsz} is larger than @samp{p_filesz},
then the extra space is uninitialized, or, more precisely, initialized
to zero.

BFD will represent such a program segment as two different sections.
The first, named @samp{segment@var{n}a}, will represent the initialized
part of the program segment. The second, named @samp{segment@var{n}b},
will represent the uninitialized part.

ELF core files store special information such as register values in
program segments with the type @samp{PT_NOTE}. BFD will attempt to
interpret the information in these segments, and will create additional
sections holding the information. Some of this interpretation requires
information found in the host header file @file{sys/procfs.h}, and so
will only work when BFD is built on a native system.

BFD does not currently provide any way to create an ELF core file. In
general, BFD does not provide a way to create core files. The way to
implement this would be to write @samp{bfd_set_format} and
@samp{bfd_write_contents} routines for the @samp{bfd_core} type; see
@ref{BFD target vector format}.

@node BFD ELF future
@subsection BFD ELF future

The current dynamic linking support has too much code duplication.
While each processor has particular differences, much of the dynamic
linking support is quite similar for each processor. The GOT and PLT
are handled in fairly similar ways, the details of -Bsymbolic linking
are generally similar, etc. This code should be reworked to use more
generic functions, eliminating the duplication.

Similarly, the relocation handling has too much duplication. Many of
the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
quite similar. The relocate section functions are also often quite
similar, both in the standard linker handling and the dynamic linker
handling. Many of the COFF processor specific backends share a single
relocate section function (@samp{_bfd_coff_generic_relocate_section}),
and it should be possible to do something like this for the ELF targets
as well.

The appearance of the processor specific magic number in
@samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
possible to add support for a new processor without changing the generic
support.

The processor function hooks and constants are ad hoc and need better
documentation.

@node BFD glossary
@section BFD glossary
@cindex glossary for bfd
@cindex bfd glossary

This is a short glossary of some BFD terms.

@table @asis
@item a.out
The a.out object file format. The original Unix object file format.
Still used on SunOS, though not Solaris. Supports only three sections.

@item archive
A collection of object files produced and manipulated by the @samp{ar}
program.

@item backend
The implementation within BFD of a particular object file format. The
set of functions which appear in a particular target vector.

@item BFD
The BFD library itself. Also, each object file, archive, or executable
opened by the BFD library has the type @samp{bfd *}, and is sometimes
referred to as a bfd.

@item COFF
The Common Object File Format. Used on Unix SVR3. Used by some
embedded targets, although ELF is normally better.

@item DLL
A shared library on Windows.

@item dynamic linker
When a program linked against a shared library is run, the dynamic
linker will locate the appropriate shared library and arrange to somehow
include it in the running image.

@item dynamic object
Another name for an ELF shared library.

@item ECOFF
The Extended Common Object File Format. Used on Alpha Digital Unix
(formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.

@item ELF
The Executable and Linking Format. The object file format used on most
modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
used on many embedded systems.

@item executable
A program, with instructions and symbols, and perhaps dynamic linking
information. Normally produced by a linker.

@item LMA
Load Memory Address. This is the address at which a section will be
loaded. Compare with VMA, below.

@item object file
A binary file including machine instructions, symbols, and relocation
information. Normally produced by an assembler.

@item object file format
The format of an object file. Typically object files and executables
for a particular system are in the same format, although executables
will not contain any relocation information.

@item PE
The Portable Executable format. This is the object file format used for
Windows (specifically, Win32) object files. It is based closely on
COFF, but has a few significant differences.

@item PEI
The Portable Executable Image format. This is the object file format
used for Windows (specifically, Win32) executables. It is very similar
to PE, but includes some additional header information.

@item relocations
Information used by the linker to adjust section contents. Also called
relocs.

@item section
Object files and executables are composed of sections. Sections have
optional data and optional relocation information.

@item shared library
A library of functions which may be used by many executables without
actually being linked into each executable. There are several different
implementations of shared libraries, each having slightly different
features.

@item symbol
Each object file and executable may have a list of symbols, often
referred to as the symbol table. A symbol is basically a name and an
address.
There may also be some additional information like the type of
symbol, although the type of a symbol is normally something simple like
function or object, and should not be confused with the more complex C
notion of type. Typically every global function and variable in a C
program will have an associated symbol.

@item target vector
A set of functions which implement support for a particular object file
format. The @samp{bfd_target} structure.

@item Win32
The current Windows API, implemented by Windows 95 and later and Windows
NT 3.51 and later, but not by Windows 3.1.

@item XCOFF
The eXtended Common Object File Format. Used on AIX. A variant of
COFF, with a completely different symbol table implementation.

@item VMA
Virtual Memory Address. This is the address a section will have when
an executable is run. Compare with LMA, above.
@end table

@node Index
@unnumberedsec Index
@printindex cp

@contents
@bye