.\" Copyright (c) 1991 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" This man page is derived from documentation contributed to Berkeley by
.\" Donn Seeley at UUNET Technologies, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)a.out.5	6.3 (Berkeley) 4/29/91
.\"
.Dd April 29, 1991
.Dt A.OUT 5
.Os
.Sh NAME
.Nm a.out
.Nd format of executable binary files
.Sh SYNOPSIS
.Fd #include <a.out.h>
.Sh DESCRIPTION
The include file
.Aq Pa a.out.h
declares three structures and several macros.
The structures describe the format of
executable machine code files
.Pq Sq binaries
on the system.
.Pp
A binary file consists of up to 7 sections.
In order, these sections are:
.Bl -tag -width "text relocations"
.It exec header
Contains parameters used by the kernel
to load a binary file into memory and execute it,
and by the link editor
.Xr ld 1
to combine a binary file with other binary files.
This section is the only mandatory one.
.It text segment
Contains machine code and related data
that are loaded into memory when a program executes.
May be loaded read-only.
.It data segment
Contains initialized data; always loaded into writable memory.
.It text relocations
Contains records used by the link editor
to update pointers in the text segment when combining binary files.
.It data relocations
Like the text relocation section, but for data segment pointers.
.It symbol table
Contains records used by the link editor
to cross-reference the addresses of named variables and functions
.Pq Sq symbols
between binary files.
.It string table
Contains the character strings corresponding to the symbol names.
.El
.Pp
Every binary file begins with an
.Fa exec
structure:
.Bd -literal -offset indent
struct exec {
	unsigned short	a_mid;
	unsigned short	a_magic;
	unsigned long	a_text;
	unsigned long	a_data;
	unsigned long	a_bss;
	unsigned long	a_syms;
	unsigned long	a_entry;
	unsigned long	a_trsize;
	unsigned long	a_drsize;
};
.Ed
.Pp
The fields have the following functions:
.Bl -tag -width a_trsize
.It Fa a_mid
Contains a bit pattern that
identifies binaries that were built for
certain sub-classes of an architecture
.Pq Sq machine IDs
or variants of the operating system on a given architecture.
The kernel may not support all machine IDs
on a given architecture.
The
.Fa a_mid
field is not present on some architectures;
in this case, the
.Fa a_magic
field has type
.Em unsigned long .
.It Fa a_magic
Contains a bit pattern
.Pq Sq magic number
that uniquely identifies binary files
and distinguishes different loading conventions.
The field must contain one of the following values:
.Bl -tag -width ZMAGIC
.It Dv OMAGIC
The text and data segments immediately follow the header
and are contiguous.
The kernel loads both text and data segments into writable memory.
.It Dv NMAGIC
As with
.Dv OMAGIC ,
text and data segments immediately follow the header and are contiguous.
However, the kernel loads the text into read-only memory
and loads the data into writable memory at the next
page boundary after the text.
.It Dv ZMAGIC
The kernel loads individual pages on demand from the binary.
The header, text segment and data segment are all
padded by the link editor to a multiple of the page size.
Pages that the kernel loads from the text segment are read-only,
while pages from the data segment are writable.
.El
.It Fa a_text
Contains the size of the text segment in bytes.
.It Fa a_data
Contains the size of the data segment in bytes.
.It Fa a_bss
Contains the number of bytes in the
.Sq bss segment
and is used by the kernel to set the initial break
.Pq Xr brk 2
after the data segment.
The kernel loads the program so that this amount of writable memory
appears to follow the data segment and initially reads as zeroes.
.It Fa a_syms
Contains the size in bytes of the symbol table section.
.It Fa a_entry
Contains the address in memory of the entry point
of the program after the kernel has loaded it;
the kernel starts the execution of the program
from the machine instruction at this address.
.It Fa a_trsize
Contains the size in bytes of the text relocation table.
.It Fa a_drsize
Contains the size in bytes of the data relocation table.
.El
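.Pp
As an illustration only, the following C sketch decodes a header image
and applies the kind of magic-number check described above.
It assumes a 32-bit little-endian target whose header is laid out
exactly like the structure shown on this page.
The names
.Fa exec32 ,
.Fn exec32_decode
and
.Fn exec32_badmag
are hypothetical, and the octal magic values are the conventional ones,
not values stated on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of struct exec (32-bit target). */
struct exec32 {
	uint16_t a_mid;
	uint16_t a_magic;
	uint32_t a_text;
	uint32_t a_data;
	uint32_t a_bss;
	uint32_t a_syms;
	uint32_t a_entry;
	uint32_t a_trsize;
	uint32_t a_drsize;
};

/* Conventional magic values; assumed, not taken from this page. */
enum { X_OMAGIC = 0407, X_NMAGIC = 0410, X_ZMAGIC = 0413 };

/* Read little-endian integers one byte at a time. */
static uint16_t rd16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a header image; returns 0, or -1 if the buffer is too short. */
int exec32_decode(const unsigned char *buf, size_t len, struct exec32 *out)
{
	assert(buf != NULL && out != NULL);
	if (len < 32)
		return -1;
	out->a_mid = rd16(buf);
	out->a_magic = rd16(buf + 2);
	out->a_text = rd32(buf + 4);
	out->a_data = rd32(buf + 8);
	out->a_bss = rd32(buf + 12);
	out->a_syms = rd32(buf + 16);
	out->a_entry = rd32(buf + 20);
	out->a_trsize = rd32(buf + 24);
	out->a_drsize = rd32(buf + 28);
	return 0;
}

/* Nonzero if a_magic is not one of the three recognized values. */
int exec32_badmag(const struct exec32 *e)
{
	return e->a_magic != X_OMAGIC && e->a_magic != X_NMAGIC &&
	    e->a_magic != X_ZMAGIC;
}
```

Decoding byte by byte avoids depending on the byte order of the host,
which matters because the byte order of the header is machine-dependent.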
.Pp
The
.Pa a.out.h
include file defines several macros which use an
.Fa exec
structure to test consistency or to locate section offsets in the binary file.
.Bl -tag -width N_BADMAG(exec)
.It Fn N_BADMAG exec
Nonzero if the
.Fa a_magic
field does not contain a recognized value.
.It Fn N_TXTOFF exec
The byte offset in the binary file of the beginning of the text segment.
.It Fn N_SYMOFF exec
The byte offset of the beginning of the symbol table.
.It Fn N_STROFF exec
The byte offset of the beginning of the string table.
.El
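.Pp
Because the sections appear in a fixed order,
the offsets these macros return can be derived by summing section sizes.
The sketch below shows that arithmetic under stated assumptions:
the header occupies one full page for
.Dv ZMAGIC
binaries and exactly its own size otherwise,
and the page size is supplied by the caller.
The
.Fn x_txtoff
family of functions is hypothetical and is not the definition used by
.Pa a.out.h .

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of struct exec (32-bit target). */
struct exec32 {
	uint16_t a_mid;
	uint16_t a_magic;
	uint32_t a_text;
	uint32_t a_data;
	uint32_t a_bss;
	uint32_t a_syms;
	uint32_t a_entry;
	uint32_t a_trsize;
	uint32_t a_drsize;
};

/* Conventional ZMAGIC value; assumed, not taken from this page. */
enum { X_ZMAGIC = 0413 };

/* Text starts after the header, which ZMAGIC pads to a full page. */
uint32_t x_txtoff(const struct exec32 *e, uint32_t pagesize)
{
	assert(e != NULL);
	return e->a_magic == X_ZMAGIC ? pagesize : (uint32_t)sizeof(*e);
}

/* Each later section begins where the previous one ends. */
uint32_t x_datoff(const struct exec32 *e, uint32_t pagesize)
{
	return x_txtoff(e, pagesize) + e->a_text;
}

uint32_t x_treloff(const struct exec32 *e, uint32_t pagesize)
{
	return x_datoff(e, pagesize) + e->a_data;
}

uint32_t x_dreloff(const struct exec32 *e, uint32_t pagesize)
{
	return x_treloff(e, pagesize) + e->a_trsize;
}

uint32_t x_symoff(const struct exec32 *e, uint32_t pagesize)
{
	return x_dreloff(e, pagesize) + e->a_drsize;
}

uint32_t x_stroff(const struct exec32 *e, uint32_t pagesize)
{
	return x_symoff(e, pagesize) + e->a_syms;
}
```

Real systems may compute the text offset differently,
so the macros from
.Pa a.out.h
should be preferred wherever that header is available.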
.Pp
Relocation records have a standard format which
is described by the
.Fa relocation_info
structure:
.Bd -literal -offset indent
struct relocation_info {
	int		r_address;
	unsigned int	r_symbolnum : 24,
			r_pcrel : 1,
			r_length : 2,
			r_extern : 1,
			: 4;
};
.Ed
.Pp
The
.Fa relocation_info
fields are used as follows:
.Bl -tag -width r_symbolnum
.It Fa r_address
Contains the byte offset of a pointer that needs to be link-edited.
Text relocation offsets are reckoned from the start of the text segment,
and data relocation offsets from the start of the data segment.
The link editor adds the value that is already stored at this offset
into the new value that it computes using this relocation record.
.It Fa r_symbolnum
Contains the ordinal number of a symbol structure
in the symbol table (it is
.Em not
a byte offset).
After the link editor resolves the absolute address for this symbol,
it adds that address to the pointer that is undergoing relocation.
(If the
.Fa r_extern
bit is clear, the situation is different; see below.)
.It Fa r_pcrel
If this is set,
the link editor assumes that it is updating a pointer
that is part of a machine code instruction using pc-relative addressing.
The address of the relocated pointer is implicitly added
to its value when the running program uses it.
.It Fa r_length
Contains the log base 2 of the length of the pointer in bytes;
0 for 1-byte displacements, 1 for 2-byte displacements,
2 for 4-byte displacements.
.It Fa r_extern
Set if this relocation requires an external reference;
the link editor must use a symbol address to update the pointer.
When the
.Fa r_extern
bit is clear, the relocation is
.Sq local ;
the link editor updates the pointer to reflect
changes in the load addresses of the various segments,
rather than changes in the value of a symbol.
In this case, the content of the
.Fa r_symbolnum
field is an
.Fa n_type
value (see below);
this type field tells the link editor
what segment the relocated pointer points into.
.El
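.Pp
The core of applying a relocation,
adding a computed value to whatever is already stored at
.Fa r_address ,
can be sketched as follows.
This is a minimal illustration, not the link editor's actual code:
it assumes little-endian storage,
uses a hypothetical unpacked
.Fa reloc
structure instead of the bit-field layout,
and leaves the choice of addend to the caller
(a symbol address when
.Fa r_extern
is set, a segment load-address change when it is clear).
Any pc-relative adjustment is architecture-dependent and is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unpacked form of one relocation record. */
struct reloc {
	uint32_t r_address;	/* byte offset within the segment */
	uint32_t r_symbolnum;	/* symbol ordinal, or n_type if local */
	unsigned r_pcrel;
	unsigned r_length;	/* log2 of the field width in bytes */
	unsigned r_extern;
};

/*
 * Add addend to the field of (1 << r_length) bytes at r_address.
 * Returns 0 on success, or -1 if the width is unsupported or the
 * field does not fit inside the segment.
 */
int reloc_apply(unsigned char *seg, size_t seglen, const struct reloc *r,
    uint32_t addend)
{
	size_t width, i;
	uint32_t v = 0;

	assert(seg != NULL && r != NULL);
	if (r->r_length > 2)
		return -1;
	width = (size_t)1 << r->r_length;
	if (r->r_address > seglen || seglen - r->r_address < width)
		return -1;

	/* Fetch the value already stored in the segment. */
	for (i = 0; i < width; i++)
		v |= (uint32_t)seg[r->r_address + i] << (8 * i);

	v += addend;

	/* Store the sum back, truncated to the field width. */
	for (i = 0; i < width; i++)
		seg[r->r_address + i] = (unsigned char)(v >> (8 * i));
	return 0;
}
```

Keeping the stored value as part of the sum is what lets a compiler
record an expression such as a symbol plus a constant offset
without a separate addend field in the record.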
.Pp
Symbols map names to addresses (or more generally, strings to values).
Since the link editor adjusts addresses,
a symbol's name must be used to stand for its address
until an absolute value has been assigned.
Symbols consist of a fixed-length record in the symbol table
and a variable-length name in the string table.
The symbol table is an array of
.Fa nlist
structures:
.Bd -literal -offset indent
struct nlist {
	union {
		char	*n_name;
		long	n_strx;
	} n_un;
	unsigned char	n_type;
	char		n_other;
	short		n_desc;
	unsigned long	n_value;
};
.Ed
.Pp
The fields are used as follows:
.Bl -tag -width n_un.n_strx
.It Fa n_un.n_strx
Contains a byte offset into the string table
for the name of this symbol.
When a program accesses a symbol table with the
.Xr nlist 3
function,
this field is replaced with the
.Fa n_un.n_name
field, which is a pointer to the string in memory.
.It Fa n_type
Used by the link editor to determine
how to update the symbol's value.
The
.Fa n_type
field is broken down into three sub-fields using bitmasks.
The link editor treats symbols with the
.Dv N_EXT
type bit set as
.Sq external
symbols and permits references to them from other binary files.
The
.Dv N_TYPE
mask selects bits of interest to the link editor:
.Bl -tag -width N_TEXT
.It Dv N_UNDF
An undefined symbol.
The link editor must locate an external symbol with the same name
in another binary file to determine the absolute value of this symbol.
As a special case, if the
.Fa n_value
field is nonzero and no binary file in the link-edit defines this symbol,
the link editor will resolve this symbol to an address
in the bss segment,
reserving a number of bytes equal to
.Fa n_value .
If this symbol is undefined in more than one binary file
and the binary files do not agree on the size,
the link editor chooses the greatest size found across all binaries.
.It Dv N_ABS
An absolute symbol.
The link editor does not update an absolute symbol.
.It Dv N_TEXT
A text symbol.
This symbol's value is a text address and
the link editor will update it when it merges binary files.
.It Dv N_DATA
A data symbol; similar to
.Dv N_TEXT
but for data addresses.
The values for text and data symbols are not file offsets but
addresses; to recover the file offsets, it is necessary
to identify the loaded address of the beginning of the corresponding
section and subtract it, then add the offset of the section.
.It Dv N_BSS
A bss symbol; like text or data symbols but
has no corresponding offset in the binary file.
.It Dv N_FN
A filename symbol.
The link editor inserts this symbol before
the other symbols from a binary file when
merging binary files.
The name of the symbol is the filename given to the link editor,
and its value is the first text address from that binary file.
Filename symbols are not needed for link-editing or loading,
but are useful for debuggers.
.El
.Pp
The
.Dv N_STAB
mask selects bits of interest to symbolic debuggers
such as
.Xr gdb 1 ;
the values are described in
.Xr stab 5 .
.It Fa n_other
This field is currently unused.
.It Fa n_desc
Reserved for use by debuggers; passed untouched by the link editor.
.Different debuggers use this field for different purposes.
.It Fa n_value
Contains the value of the symbol.
For text, data and bss symbols, this is an address;
for other symbols (such as debugger symbols),
the value may be arbitrary.
.El
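.Pp
The following sketch shows one way to split
.Fa n_type
into its three sub-fields,
to pick the size of a bss-allocated undefined symbol,
and to turn a text or data symbol value into a file offset
using the subtract-then-add rule given above.
The numeric mask and type values are the traditional BSD ones
and are an assumption here, since this page does not list them;
the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Traditional BSD values; assumed, not taken from this page. */
enum {
	X_N_EXT = 0x01, X_N_TYPE = 0x1e, X_N_STAB = 0xe0,
	X_N_UNDF = 0x00, X_N_ABS = 0x02, X_N_TEXT = 0x04,
	X_N_DATA = 0x06, X_N_BSS = 0x08
};

/* External symbols may be referenced from other binary files. */
int sym_is_external(uint8_t n_type)
{
	return (n_type & X_N_EXT) != 0;
}

/* Any N_STAB bit marks a debugger symbol. */
int sym_is_debug(uint8_t n_type)
{
	return (n_type & X_N_STAB) != 0;
}

/* The bits the link editor uses to classify the symbol. */
int sym_class(uint8_t n_type)
{
	return n_type & X_N_TYPE;
}

/* Undefined with a nonzero value: a request for bss space. */
int sym_is_common(uint8_t n_type, uint32_t n_value)
{
	return !sym_is_debug(n_type) && sym_class(n_type) == X_N_UNDF &&
	    n_value != 0;
}

/* When files disagree on the size, the greatest one wins. */
uint32_t common_size(const uint32_t *sizes, size_t n)
{
	uint32_t max = 0;
	size_t i;

	assert(sizes != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (sizes[i] > max)
			max = sizes[i];
	return max;
}

/* Subtract the section's load address, then add its file offset. */
uint32_t sym_file_offset(uint32_t n_value, uint32_t seg_addr,
    uint32_t seg_off)
{
	return n_value - seg_addr + seg_off;
}
```

For a data symbol,
.Fa seg_addr
would be the address at which the data segment is loaded and
.Fa seg_off
the file offset of the data segment; both depend on the magic number.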
.Pp
The string table consists of an
.Em unsigned long
length followed by null-terminated symbol strings.
The length represents the size of the entire table in bytes,
so its minimum value (or the offset of the first string)
is always 4 on 32-bit machines.
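.Pp
A lookup of a symbol name by its
.Fa n_un.n_strx
offset might be sketched as below.
The sketch assumes a 4-byte little-endian length word,
as on the 32-bit machines mentioned above,
and a table that has already been read into memory;
.Fn strtab_name
is a hypothetical helper, not part of
.Pa a.out.h .

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the name stored at byte offset strx of the string table,
 * or NULL if the table or the offset is not valid.
 */
const char *strtab_name(const unsigned char *tab, size_t tablen,
    uint32_t strx)
{
	uint32_t total;

	assert(tab != NULL);
	if (tablen < 4)
		return NULL;

	/* The leading length word counts itself. */
	total = (uint32_t)tab[0] | ((uint32_t)tab[1] << 8) |
	    ((uint32_t)tab[2] << 16) | ((uint32_t)tab[3] << 24);
	if (total < 4 || total > tablen)
		return NULL;

	/* Offsets below 4 point into the length word, not a string. */
	if (strx < 4 || strx >= total)
		return NULL;

	/* Reject names that run off the end without a terminator. */
	if (memchr(tab + strx, 0, total - strx) == NULL)
		return NULL;
	return (const char *)(tab + strx);
}
```

Checking for the terminator before returning keeps a damaged table
from causing a read past the end of the buffer.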
.Sh SEE ALSO
.Xr ld 1 ,
.Xr execve 2 ,
.Xr nlist 3 ,
.Xr core 5 ,
.Xr dbx 5 ,
.Xr stab 5
.Sh HISTORY
The
.Pa a.out.h
include file appeared in
.At v7 .
.Sh BUGS
Since not all of the supported architectures use the
.Fa a_mid
field,
it can be difficult to determine what
architecture a binary will execute on
without examining its actual machine code.
Even with a machine identifier,
the byte order of the
.Fa exec
header is machine-dependent.
.Pp
Nobody seems to agree on what
.Em bss
stands for.
.Pp
New binary file formats may be supported in the future,
and they probably will not be compatible at any level
with this ancient format.