.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This man page is derived from documentation contributed to Berkeley by
.\" Donn Seeley at UUNET Technologies, Inc.
.\"
.\" %sccs.include.redist.roff%
.\"
.\"	@(#)a.out.5	8.2 (Berkeley) 06/01/94
.\"
.Dd
.Dt A.OUT 5
.Os
.Sh NAME
.Nm a.out
.Nd format of executable binary files
.Sh SYNOPSIS
.Fd #include <a.out.h>
.Sh DESCRIPTION
The include file
.Aq Pa a.out.h
declares three structures and several macros.
The structures describe the format of
executable machine code files
.Pq Sq binaries
on the system.
.Pp
A binary file consists of up to 7 sections.
In order, these sections are:
.Bl -tag -width "text relocations"
.It exec header
Contains parameters used by the kernel
to load a binary file into memory and execute it,
and by the link editor
.Xr ld 1
to combine a binary file with other binary files.
This section is the only mandatory one.
.It text segment
Contains machine code and related data
that are loaded into memory when a program executes.
May be loaded read-only.
.It data segment
Contains initialized data; always loaded into writable memory.
.It text relocations
Contains records used by the link editor
to update pointers in the text segment when combining binary files.
.It data relocations
Like the text relocation section, but for data segment pointers.
.It symbol table
Contains records used by the link editor
to cross reference the addresses of named variables and functions
.Pq Sq symbols
between binary files.
.It string table
Contains the character strings corresponding to the symbol names.
.El
.Pp
Every binary file begins with an
.Fa exec
structure:
.Bd -literal -offset indent
struct exec {
	unsigned short	a_mid;
	unsigned short	a_magic;
	unsigned long	a_text;
	unsigned long	a_data;
	unsigned long	a_bss;
	unsigned long	a_syms;
	unsigned long	a_entry;
	unsigned long	a_trsize;
	unsigned long	a_drsize;
};
.Ed
.Pp
The fields have the following functions:
.Bl -tag -width a_trsize
.It Fa a_mid
Contains a bit pattern that
identifies binaries that were built for
certain sub-classes of an architecture
.Pq Sq machine IDs
or variants of the operating system on a given architecture.
The kernel may not support all machine IDs
on a given architecture.
The
.Fa a_mid
field is not present on some architectures;
in this case, the
.Fa a_magic
field has type
.Em unsigned long .
.It Fa a_magic
Contains a bit pattern
.Pq Sq magic number
that uniquely identifies binary files
and distinguishes different loading conventions.
The field must contain one of the following values:
.Bl -tag -width ZMAGIC
.ne 1i
.It Dv OMAGIC
The text and data segments immediately follow the header
and are contiguous.
The kernel loads both text and data segments into writable memory.
.It Dv NMAGIC
As with
.Dv OMAGIC ,
text and data segments immediately follow the header and are contiguous.
However, the kernel loads the text into read-only memory
and loads the data into writable memory at the next
page boundary after the text.
.It Dv ZMAGIC
The kernel loads individual pages on demand from the binary.
The header, text segment and data segment are all
padded by the link editor to a multiple of the page size.
Pages that the kernel loads from the text segment are read-only,
while pages from the data segment are writable.
.El
.It Fa a_text
Contains the size of the text segment in bytes.
.It Fa a_data
Contains the size of the data segment in bytes.
.It Fa a_bss
Contains the number of bytes in the
.Sq bss segment
and is used by the kernel to set the initial break
.Pq Xr brk 2
after the data segment.
The kernel loads the program so that this amount of writable memory
appears to follow the data segment and initially reads as zeroes.
.It Fa a_syms
Contains the size in bytes of the symbol table section.
.It Fa a_entry
Contains the address in memory of the entry point
of the program after the kernel has loaded it;
the kernel starts the execution of the program
from the machine instruction at this address.
.It Fa a_trsize
Contains the size in bytes of the text relocation table.
.It Fa a_drsize
Contains the size in bytes of the data relocation table.
.El
.Pp
The
.Pa a.out.h
include file defines several macros which use an
.Fa exec
structure to test consistency or to locate section offsets in the binary file.
.Bl -tag -width N_BADMAG(exec)
.It Fn N_BADMAG exec
Nonzero if the
.Fa a_magic
field does not contain a recognized value.
.It Fn N_TXTOFF exec
The byte offset in the binary file of the beginning of the text segment.
.It Fn N_SYMOFF exec
The byte offset of the beginning of the symbol table.
.It Fn N_STROFF exec
The byte offset of the beginning of the string table.
.El
.Pp
Relocation records have a standard format which
is described by the
.Fa relocation_info
structure:
.Bd -literal -offset indent
struct relocation_info {
	int		r_address;
	unsigned int	r_symbolnum : 24,
			r_pcrel : 1,
			r_length : 2,
			r_extern : 1,
			: 4;
};
.Ed
.Pp
The
.Fa relocation_info
fields are used as follows:
.Bl -tag -width r_symbolnum
.It Fa r_address
Contains the byte offset of a pointer that needs to be link-edited.
Text relocation offsets are reckoned from the start of the text segment,
and data relocation offsets from the start of the data segment.
The link editor adds the value that is already stored at this offset
to the new value that it computes using this relocation record.
.ne 1i
.It Fa r_symbolnum
Contains the ordinal number of a symbol structure
in the symbol table (it is
.Em not
a byte offset).
After the link editor resolves the absolute address for this symbol,
it adds that address to the pointer that is undergoing relocation.
(If the
.Fa r_extern
bit is clear, the situation is different; see below.)
.It Fa r_pcrel
If this is set,
the link editor assumes that it is updating a pointer
that is part of a machine code instruction using pc-relative addressing.
The address of the relocated pointer is implicitly added
to its value when the running program uses it.
.It Fa r_length
Contains the log base 2 of the length of the pointer in bytes;
0 for 1-byte displacements, 1 for 2-byte displacements,
2 for 4-byte displacements.
.It Fa r_extern
Set if this relocation requires an external reference;
the link editor must use a symbol address to update the pointer.
When the
.Fa r_extern
bit is clear, the relocation is
.Sq local ;
the link editor updates the pointer to reflect
changes in the load addresses of the various segments,
rather than changes in the value of a symbol.
In this case, the content of the
.Fa r_symbolnum
field is an
.Fa n_type
value (see below);
this type field tells the link editor
what segment the relocated pointer points into.
.El
.Pp
Symbols map names to addresses (or more generally, strings to values).
Since the link editor adjusts addresses,
a symbol's name must be used to stand for its address
until an absolute value has been assigned.
Symbols consist of a fixed-length record in the symbol table
and a variable-length name in the string table.
The symbol table is an array of
.Fa nlist
structures:
.Bd -literal -offset indent
struct nlist {
	union {
		char	*n_name;
		long	n_strx;
	} n_un;
	unsigned char	n_type;
	char		n_other;
	short		n_desc;
	unsigned long	n_value;
};
.Ed
.Pp
The fields are used as follows:
.Bl -tag -width n_un.n_strx
.It Fa n_un.n_strx
Contains a byte offset into the string table
for the name of this symbol.
When a program accesses a symbol table with the
.Xr nlist 3
function,
this field is replaced with the
.Fa n_un.n_name
field, which is a pointer to the string in memory.
.It Fa n_type
Used by the link editor to determine
how to update the symbol's value.
The
.Fa n_type
field is broken down into three sub-fields using bitmasks.
The link editor treats symbols with the
.Dv N_EXT
type bit set as
.Sq external
symbols and permits references to them from other binary files.
The
.Dv N_TYPE
mask selects bits of interest to the link editor:
.Bl -tag -width N_TEXT
.It Dv N_UNDF
An undefined symbol.
The link editor must locate an external symbol with the same name
in another binary file to determine the absolute value of this symbol.
As a special case, if the
.Fa n_value
field is nonzero and no binary file in the link-edit defines this symbol,
the link editor will resolve this symbol to an address
in the bss segment,
reserving a number of bytes equal to
.Fa n_value .
If this symbol is undefined in more than one binary file
and the binary files do not agree on the size,
the link editor chooses the greatest size found across all binaries.
.It Dv N_ABS
An absolute symbol.
The link editor does not update an absolute symbol.
.It Dv N_TEXT
A text symbol.
This symbol's value is a text address and
the link editor will update it when it merges binary files.
.It Dv N_DATA
A data symbol; similar to
.Dv N_TEXT
but for data addresses.
The values for text and data symbols are not file offsets but
addresses; to recover the file offsets, it is necessary
to identify the loaded address of the beginning of the corresponding
section and subtract it, then add the offset of the section.
.It Dv N_BSS
A bss symbol; like text or data symbols but
has no corresponding offset in the binary file.
.It Dv N_FN
A filename symbol.
The link editor inserts this symbol before
the other symbols from a binary file when
merging binary files.
The name of the symbol is the filename given to the link editor,
and its value is the first text address from that binary file.
Filename symbols are not needed for link-editing or loading,
but are useful for debuggers.
.El
.Pp
The
.Dv N_STAB
mask selects bits of interest to symbolic debuggers
such as
.Xr gdb 1 ;
the values are described in
.Xr stab 5 .
.It Fa n_other
This field is currently unused.
.It Fa n_desc
Reserved for use by debuggers; passed untouched by the link editor.
Different debuggers use this field for different purposes.
.It Fa n_value
Contains the value of the symbol.
For text, data and bss symbols, this is an address;
for other symbols (such as debugger symbols),
the value may be arbitrary.
.El
.Pp
The string table consists of an
.Em unsigned long
length followed by null-terminated symbol strings.
The length represents the size of the entire table in bytes,
so its minimum value (or the offset of the first string)
is always 4 on 32-bit machines.
.Sh SEE ALSO
.Xr ld 1 ,
.Xr execve 2 ,
.Xr nlist 3 ,
.Xr core 5 ,
.Xr dbx 5 ,
.Xr stab 5
.Sh HISTORY
The
.Pa a.out.h
include file appeared in
.At v7 .
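.Sh EXAMPLES
The following is a minimal sketch, not a reference implementation,
of how the section order and the size fields in the
.Fa exec
header determine where each section starts in the file,
and of how a symbol's
.Fa n_un.n_strx
value selects a name in the string table.
Every name beginning with
.Li ex_
is hypothetical and is not part of
.Pa a.out.h .
The sketch assumes a 32-bit header layout that includes
.Fa a_mid ,
and assumes that the sections are packed without padding with the text
directly after the header, as described for
.Dv OMAGIC
and
.Dv NMAGIC .
The real
.Fn N_TXTOFF ,
.Fn N_SYMOFF
and
.Fn N_STROFF
macros may compute different values, in particular for
.Dv ZMAGIC
binaries, and should be preferred where they are available.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width mirror of struct exec (32-bit layout assumed). */
struct ex_exec {
	uint16_t a_mid;
	uint16_t a_magic;
	uint32_t a_text;
	uint32_t a_data;
	uint32_t a_bss;
	uint32_t a_syms;
	uint32_t a_entry;
	uint32_t a_trsize;
	uint32_t a_drsize;
};

/*
 * Assumption: sections appear in the documented order with no padding,
 * so each offset is the previous offset plus the previous section size.
 */
uint32_t ex_txtoff(const struct ex_exec *e)
{
	(void)e;	/* the text starts right after the header */
	return (uint32_t)sizeof(struct ex_exec);
}

uint32_t ex_datoff(const struct ex_exec *e)
{
	return ex_txtoff(e) + e->a_text;
}

uint32_t ex_treloff(const struct ex_exec *e)
{
	return ex_datoff(e) + e->a_data;
}

uint32_t ex_dreloff(const struct ex_exec *e)
{
	return ex_treloff(e) + e->a_trsize;
}

uint32_t ex_symoff(const struct ex_exec *e)
{
	return ex_dreloff(e) + e->a_drsize;
}

uint32_t ex_stroff(const struct ex_exec *e)
{
	return ex_symoff(e) + e->a_syms;
}

/*
 * Return the name stored at byte offset strx of a string table that is
 * len bytes long (length word included), or NULL if the offset falls
 * inside the 4-byte length word, past the end, or on an unterminated
 * string.
 */
const char *ex_symname(const unsigned char *strtab, uint32_t len,
    uint32_t strx)
{
	if (strx < 4 || strx >= len)
		return NULL;
	if (memchr(strtab + strx, 0, len - strx) == NULL)
		return NULL;
	return (const char *)(strtab + strx);
}
```
.Ed
.Pp
Under these assumptions the header occupies 32 bytes, so a binary with
100 bytes of text has its data segment at file offset 132.
Note that the bss size does not contribute to any file offset.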
.Sh BUGS
Since not all of the supported architectures use the
.Fa a_mid
field,
it can be difficult to determine what
architecture a binary will execute on
without examining its actual machine code.
Even with a machine identifier,
the byte order of the
.Fa exec
header is machine-dependent.
.Pp
Nobody seems to agree on what
.Em bss
stands for.
.Pp
New binary file formats may be supported in the future,
and they probably will not be compatible at any level
with this ancient format.
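.Pp
The byte-order problem noted above can sometimes be narrowed down by
testing the magic number in both byte orders.
The following is a rough sketch under stated assumptions, not a
definitive method.
It assumes that
.Fa a_magic
has been read as a 16-bit quantity and that the magic numbers have their
traditional octal values of 0407, 0410 and 0413;
these are redefined locally with an
.Li EX_
prefix so that the fragment is self-contained, and the system's own
.Pa a.out.h
remains the authority.
All
.Li ex_
and
.Li EX_
names are hypothetical.
.Bd -literal -offset indent
```c
#include <stdint.h>

/* Assumed traditional magic numbers; check a.out.h on the target system. */
#define EX_OMAGIC 0407
#define EX_NMAGIC 0410
#define EX_ZMAGIC 0413

/* Nonzero if m is one of the three assumed magic numbers. */
int ex_known_magic(uint16_t m)
{
	return m == EX_OMAGIC || m == EX_NMAGIC || m == EX_ZMAGIC;
}

/* Exchange the two bytes of a 16-bit value. */
uint16_t ex_swap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

/*
 * Classify a raw 16-bit magic field as read on the examining host:
 * 0 if it is recognized as is, 1 if it is recognized only after a
 * byte swap, and -1 if it is not recognized in either order.
 */
int ex_magic_order(uint16_t raw)
{
	if (ex_known_magic(raw))
		return 0;
	if (ex_known_magic(ex_swap16(raw)))
		return 1;
	return -1;
}
```
.Ed
.Pp
A result of 1 suggests only that the header was written in the opposite
byte order from the examining host; it does not identify the architecture.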