.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This man page is derived from documentation contributed to Berkeley by
.\" Donn Seeley at UUNET Technologies, Inc.
.\"
.\" %sccs.include.redist.roff%
.\"
.\" @(#)a.out.5 8.1 (Berkeley) 06/05/93
.\"
.Dd
.Dt A.OUT 5
.Os
.Sh NAME
.Nm a.out
.Nd format of executable binary files
.Sh SYNOPSIS
.Fd #include <a.out.h>
.Sh DESCRIPTION
The include file
.Aq Pa a.out.h
declares three structures and several macros.
The structures describe the format of
executable machine code files
.Pq Sq binaries
on the system.
.Pp
A binary file consists of up to 7 sections.
In order, these sections are:
.Bl -tag -width "text relocations"
.It exec header
Contains parameters used by the kernel
to load a binary file into memory and execute it,
and by the link editor
.Xr ld 1
to combine a binary file with other binary files.
This section is the only mandatory one.
.It text segment
Contains machine code and related data
that are loaded into memory when a program executes.
May be loaded read-only.
.It data segment
Contains initialized data; always loaded into writable memory.
.It text relocations
Contains records used by the link editor
to update pointers in the text segment when combining binary files.
.It data relocations
Like the text relocation section, but for data segment pointers.
.It symbol table
Contains records used by the link editor
to cross reference the addresses of named variables and functions
.Pq Sq symbols
between binary files.
.It string table
Contains the character strings corresponding to the symbol names.
.El
.Pp
Every binary file begins with an
.Fa exec
structure:
.Bd -literal -offset indent
struct exec {
	unsigned short	a_mid;
	unsigned short	a_magic;
	unsigned long	a_text;
	unsigned long	a_data;
	unsigned long	a_bss;
	unsigned long	a_syms;
	unsigned long	a_entry;
	unsigned long	a_trsize;
	unsigned long	a_drsize;
};
.Ed
.Pp
The fields have the following functions:
.Bl -tag -width a_trsize
.It Fa a_mid
Contains a bit pattern that
identifies binaries that were built for
certain sub-classes of an architecture
.Pq Sq machine IDs
or variants of the operating system on a given architecture.
The kernel may not support all machine IDs
on a given architecture.
The
.Fa a_mid
field is not present on some architectures;
in this case, the
.Fa a_magic
field has type
.Em unsigned long .
.It Fa a_magic
Contains a bit pattern
.Pq Sq magic number
that uniquely identifies binary files
and distinguishes different loading conventions.
The field must contain one of the following values:
.Bl -tag -width ZMAGIC
.It Dv OMAGIC
The text and data segments immediately follow the header
and are contiguous.
The kernel loads both text and data segments into writable memory.
.It Dv NMAGIC
As with
.Dv OMAGIC ,
text and data segments immediately follow the header and are contiguous.
However, the kernel loads the text into read-only memory
and loads the data into writable memory at the next
page boundary after the text.
.It Dv ZMAGIC
The kernel loads individual pages on demand from the binary.
The header, text segment and data segment are all
padded by the link editor to a multiple of the page size.
Pages that the kernel loads from the text segment are read-only,
while pages from the data segment are writable.
.El
.It Fa a_text
Contains the size of the text segment in bytes.
.It Fa a_data
Contains the size of the data segment in bytes.
.It Fa a_bss
Contains the number of bytes in the
.Sq bss segment
and is used by the kernel to set the initial break
.Pq Xr brk 2
after the data segment.
The kernel loads the program so that this amount of writable memory
appears to follow the data segment and initially reads as zeroes.
.It Fa a_syms
Contains the size in bytes of the symbol table section.
.It Fa a_entry
Contains the address in memory of the entry point
of the program after the kernel has loaded it;
the kernel starts the execution of the program
from the machine instruction at this address.
.It Fa a_trsize
Contains the size in bytes of the text relocation table.
.It Fa a_drsize
Contains the size in bytes of the data relocation table.
.El
.Pp
The
.Pa a.out.h
include file defines several macros which use an
.Fa exec
structure to test consistency or to locate section offsets in the binary file.
.Bl -tag -width N_BADMAG(exec)
.It Fn N_BADMAG exec
Nonzero if the
.Fa a_magic
field does not contain a recognized value.
.It Fn N_TXTOFF exec
The byte offset in the binary file of the beginning of the text segment.
.It Fn N_SYMOFF exec
The byte offset of the beginning of the symbol table.
.It Fn N_STROFF exec
The byte offset of the beginning of the string table.
.El
.Pp
Relocation records have a standard format which
is described by the
.Fa relocation_info
structure:
.Bd -literal -offset indent
struct relocation_info {
	int		r_address;
	unsigned int	r_symbolnum : 24,
			r_pcrel : 1,
			r_length : 2,
			r_extern : 1,
			: 4;
};
.Ed
.Pp
The
.Fa relocation_info
fields are used as follows:
.Bl -tag -width r_symbolnum
.It Fa r_address
Contains the byte offset of a pointer that needs to be link-edited.
Text relocation offsets are reckoned from the start of the text segment,
and data relocation offsets from the start of the data segment.
The link editor adds the value that is already stored at this offset
into the new value that it computes using this relocation record.
.It Fa r_symbolnum
Contains the ordinal number of a symbol structure
in the symbol table (it is
.Em not
a byte offset).
After the link editor resolves the absolute address for this symbol,
it adds that address to the pointer that is undergoing relocation.
(If the
.Fa r_extern
bit is clear, the situation is different; see below.)
.It Fa r_pcrel
If this is set,
the link editor assumes that it is updating a pointer
that is part of a machine code instruction using pc-relative addressing.
The address of the relocated pointer is implicitly added
to its value when the running program uses it.
.It Fa r_length
Contains the log base 2 of the length of the pointer in bytes;
0 for 1-byte displacements, 1 for 2-byte displacements,
2 for 4-byte displacements.
.It Fa r_extern
Set if this relocation requires an external reference;
the link editor must use a symbol address to update the pointer.
When the
.Fa r_extern
bit is clear, the relocation is
.Sq local ;
the link editor updates the pointer to reflect
changes in the load addresses of the various segments,
rather than changes in the value of a symbol.
In this case, the content of the
.Fa r_symbolnum
field is an
.Fa n_type
value (see below);
this type field tells the link editor
what segment the relocated pointer points into.
.El
.Pp
Symbols map names to addresses (or more generally, strings to values).
Since the link editor adjusts addresses,
a symbol's name must be used to stand for its address
until an absolute value has been assigned.
Symbols consist of a fixed-length record in the symbol table
and a variable-length name in the string table.
The symbol table is an array of
.Fa nlist
structures:
.Bd -literal -offset indent
struct nlist {
	union {
		char	*n_name;
		long	n_strx;
	} n_un;
	unsigned char	n_type;
	char		n_other;
	short		n_desc;
	unsigned long	n_value;
};
.Ed
.Pp
The fields are used as follows:
.Bl -tag -width n_un.n_strx
.It Fa n_un.n_strx
Contains a byte offset into the string table
for the name of this symbol.
When a program accesses a symbol table with the
.Xr nlist 3
function,
this field is replaced with the
.Fa n_un.n_name
field, which is a pointer to the string in memory.
.It Fa n_type
Used by the link editor to determine
how to update the symbol's value.
The
.Fa n_type
field is broken down into three sub-fields using bitmasks.
The link editor treats symbols with the
.Dv N_EXT
type bit set as
.Sq external
symbols and permits references to them from other binary files.
The
.Dv N_TYPE
mask selects bits of interest to the link editor:
.Bl -tag -width N_TEXT
.It Dv N_UNDF
An undefined symbol.
The link editor must locate an external symbol with the same name
in another binary file to determine the absolute value of this symbol.
As a special case, if the
.Fa n_value
field is nonzero and no binary file in the link-edit defines this symbol,
the link editor will resolve this symbol to an address
in the bss segment,
reserving a number of bytes equal to
.Fa n_value .
If this symbol is undefined in more than one binary file
and the binary files do not agree on the size,
the link editor chooses the greatest size found across all binaries.
.It Dv N_ABS
An absolute symbol.
The link editor does not update an absolute symbol.
.It Dv N_TEXT
A text symbol.
This symbol's value is a text address and
the link editor will update it when it merges binary files.
.It Dv N_DATA
A data symbol; similar to
.Dv N_TEXT
but for data addresses.
The values for text and data symbols are not file offsets but
addresses; to recover the file offsets, it is necessary
to identify the loaded address of the beginning of the corresponding
section and subtract it, then add the offset of the section.
.It Dv N_BSS
A bss symbol; like text or data symbols but
has no corresponding offset in the binary file.
.It Dv N_FN
A filename symbol.
The link editor inserts this symbol before
the other symbols from a binary file when
merging binary files.
The name of the symbol is the filename given to the link editor,
and its value is the first text address from that binary file.
Filename symbols are not needed for link-editing or loading,
but are useful for debuggers.
.El
.Pp
The
.Dv N_STAB
mask selects bits of interest to symbolic debuggers
such as
.Xr gdb 1 ;
the values are described in
.Xr stab 5 .
.It Fa n_other
This field is currently unused.
.It Fa n_desc
Reserved for use by debuggers; passed untouched by the link editor.
Different debuggers use this field for different purposes.
.It Fa n_value
Contains the value of the symbol.
For text, data and bss symbols, this is an address;
for other symbols (such as debugger symbols),
the value may be arbitrary.
.El
.Pp
The string table consists of an
.Em unsigned long
length followed by null-terminated symbol strings.
The length represents the size of the entire table in bytes,
so its minimum value (or the offset of the first string)
is always 4 on 32-bit machines.
.Sh SEE ALSO
.Xr ld 1 ,
.Xr execve 2 ,
.Xr nlist 3 ,
.Xr core 5 ,
.Xr dbx 5 ,
.Xr stab 5
.Sh HISTORY
The
.Pa a.out.h
include file appeared in
.At v7 .
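.Sh EXAMPLES
The offset macros and the string table rule described in
.Sx DESCRIPTION
can be illustrated with a minimal sketch.
It is not taken from
.Pa a.out.h :
the names
.Fa xexec ,
.Fa xlayout ,
.Fn x_badmag ,
.Fn x_layout
and
.Fn x_symname
are hypothetical, the fixed-width types assume the 32-bit layout with an
.Fa a_mid
field, the magic numbers are the traditional octal values,
and the page size is an assumption supplied by the caller.
The offsets follow only from the section order listed earlier
and assume that
.Fa a_text
and
.Fa a_data
already include any padding;
the real
.Fn N_TXTOFF
and related macros may differ between architectures.
.Bd -literal -offset indent
#include <stddef.h>
#include <stdint.h>

/* Traditional magic numbers (octal); check a.out.h on the target. */
#define X_OMAGIC 0407
#define X_NMAGIC 0410
#define X_ZMAGIC 0413

/* Hypothetical fixed-width copy of struct exec (32-bit, with a_mid). */
struct xexec {
	uint16_t a_mid, a_magic;
	uint32_t a_text, a_data, a_bss, a_syms, a_entry, a_trsize, a_drsize;
};

/* File offsets of each section, derived from the section order. */
struct xlayout {
	uint32_t text, data, trel, drel, sym, str;
};

/* Counterpart of N_BADMAG: nonzero for an unrecognized magic number. */
int
x_badmag(const struct xexec *x)
{
	return (x->a_magic != X_OMAGIC && x->a_magic != X_NMAGIC &&
	    x->a_magic != X_ZMAGIC);
}

/* pagesz is machine-dependent and must be supplied by the caller. */
struct xlayout
x_layout(const struct xexec *x, uint32_t pagesz)
{
	struct xlayout l;

	/* ZMAGIC pads the header to a page; otherwise text follows it. */
	l.text = (x->a_magic == X_ZMAGIC) ? pagesz : (uint32_t)sizeof(*x);
	l.data = l.text + x->a_text;
	l.trel = l.data + x->a_data;
	l.drel = l.trel + x->a_trsize;
	l.sym = l.drel + x->a_drsize;
	l.str = l.sym + x->a_syms;
	return (l);
}

/* Name at byte offset strx in a string table of len bytes, or NULL. */
const char *
x_symname(const char *strtab, uint32_t len, uint32_t strx)
{
	/* Offsets 0..3 hold the length word, so names start at offset 4. */
	if (strx < 4 || strx >= len)
		return (NULL);
	return (strtab + strx);
}
.Ed
.Pp
For example, given an
.Dv OMAGIC
header in which
.Fa a_text
is 100 and
.Fa a_data
is 40, this sketch places the text at offset 32 (the size of the
hypothetical header) and the data at offset 132.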
.Sh BUGS
Since not all of the supported architectures use the
.Fa a_mid
field,
it can be difficult to determine what
architecture a binary will execute on
without examining its actual machine code.
Even with a machine identifier,
the byte order of the
.Fa exec
header is machine-dependent.
.Pp
Nobody seems to agree on what
.Em bss
stands for.
.Pp
New binary file formats may be supported in the future,
and they probably will not be compatible at any level
with this ancient format.