.\" Copyright (c) 1990, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" %sccs.include.redist.man%
.\"
.\"     @(#)crash.8	8.2 (Berkeley) 01/12/94
.\"
.Dd
.Dt CRASH 8 hp300
.Os
.Sh NAME
.Nm crash
.Nd UNIX system failures
.Sh DESCRIPTION
This section explains a bit about system crashes
and (very briefly) how to analyze crash dumps.
.Pp
When the system crashes voluntarily it prints a message of the form
.Bd -ragged -offset indent
panic: why i gave up the ghost
.Ed
.Pp
on the console, takes a dump on a mass storage peripheral,
and then invokes an automatic reboot procedure as
described in
.Xr reboot 8 .
Unless some unexpected inconsistency is encountered in the state
of the file systems due to hardware or software failure, the system
will then resume multi-user operations.
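.Pp
As a rough illustration of the message form shown above, the following
Python sketch pulls the reason text out of saved console lines.
It is not part of the system: the function name panic_reasons and the
sample lines are hypothetical, and the only assumption taken from this
page is that a panic line contains the literal text ``panic: ''.
.Bd -literal -offset indent
```python
def panic_reasons(lines):
    """Return the text following 'panic: ' for each line that contains it."""
    prefix = "panic: "
    reasons = []
    for line in lines:
        start = line.find(prefix)
        if start != -1:
            reasons.append(line[start + len(prefix):].strip())
    return reasons


# Hypothetical console lines, used only to exercise the function.
console = ["panic: init died", "syncing disks"]
print(panic_reasons(console))
```
.Ed
.Pp
Searching for the prefix anywhere in the line, rather than only at the
start, tolerates any leading text a logging facility might add.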
.Pp
The system has a large number of internal consistency checks; if one
of these fails, then it will panic with a very short message indicating
which one failed.
In many instances, this will be the name of the routine which detected
the error, or a two-word description of the inconsistency.
A full understanding of most panic messages requires perusal of the
source code for the system.
.Pp
The most common cause of system failures is hardware failure, which
can manifest itself in different ways.
Here are the messages which
are most likely, with some hints as to causes.
Left unstated in all cases is the possibility that hardware or software
error produced the message in some unexpected way.
.Pp
.Bl -tag -width Ds -compact
.It Sy iinit
This cryptic panic message results from a failure to mount the root filesystem
during the bootstrap process.
Either the root filesystem has been corrupted,
or the system is attempting to use the wrong device as the root filesystem.
Usually, an alternate copy of the system binary or an alternate root
filesystem can be used to bring up the system to investigate.
.Pp
.It Sy "Can't exec /etc/init"
This is not a panic message, as reboots are likely to be futile.
Late in the bootstrap procedure, the system was unable to locate
and execute the initialization process,
.Xr init 8 .
The root filesystem is incorrect or has been corrupted, or the mode
or type of
.Pa /etc/init
forbids execution.
.Pp
.It Sy "IO err in push"
.It Sy "hard IO err in swap"
The system encountered an error trying to write to the paging device
or an error in reading critical information from a disk drive.
The offending disk should be fixed if it is broken or unreliable.
.Pp
.It Sy "realloccg: bad optim"
.It Sy "ialloc: dup alloc"
.It Sy "alloccgblk:cyl groups corrupted"
.It Sy "ialloccg: map corrupted"
.It Sy "free: freeing free block"
.It Sy "free: freeing free frag"
.It Sy "ifree: freeing free inode"
.It Sy "alloccg: map corrupted"
These panic messages are among those that may be produced
when filesystem inconsistencies are detected.
The problem generally results from a failure to repair damaged filesystems
after a crash, hardware failures, or another condition that should not
normally occur.
A filesystem check will normally correct the problem.
.Pp
.It Sy "timeout table overflow"
This really shouldn't be a panic, but until the data structure
involved is made to be extensible, running out of entries causes a crash.
If this happens, make the timeout table bigger.
.Pp
.It Sy "trap type %d, code = %x, v = %x"
An unexpected trap has occurred within the system; the trap types are:
.Bl -column xxxx -offset indent
0	bus error
1	address error
2	illegal instruction
3	divide by zero
.No 4\t Em chk No instruction
.No 5\t Em trapv No instruction
6	privileged instruction
7	trace trap
8	MMU fault
9	simulated software interrupt
10	format error
11	FP coprocessor fault
12	coprocessor fault
13	simulated AST
.El
.Pp
The favorite trap type in system crashes is trap type 8,
indicating a wild reference.
``code'' (hex) is the concatenation of the
MMU
status register
(see <hp300/cpu.h>)
in the high 16 bits and the 68020 special status word
(see the 68020 manual, page 6-17)
in the low 16 bits.
``v'' (hex) is the virtual address which caused the fault.
Additionally, the kernel will dump about a screenful of semi-useful
information.
``pid'' (decimal) is the process id of the process running at the
time of the exception.
Note that if we panic in an interrupt routine,
this process may not be related to the panic.
``ps'' (hex) is the 68020 processor status register ``ps''.
``pc'' (hex) is the value of the program counter saved
on the hardware exception frame.
It may
.Em not
be the PC of the instruction causing the fault.
``sfc'' and ``dfc'' (hex) are the 68020 source/destination function codes.
They should always be one.
``p0'' and ``p1'' are the
VAX-like
region registers.
They are of the form:
.Pp
.Bd -ragged -offset indent
<length> '@' <kernel VA>
.Ed
.Pp
where both are in hex.
Following these values is a dump of the processor registers (hex).
Finally, there is a dump of the stack (user/kernel) at the time of the offense.
.Pp
.It Sy "init died"
The system initialization process has exited.  This is bad news, as no new
users will then be able to log in.  Rebooting is the only fix, so the
system just does it right away.
.Pp
.It Sy "out of mbufs: map full"
The network has exhausted its private page map for network buffers.
This usually indicates that buffers are being lost, and rather than
allow the system to slowly degrade, it reboots immediately.
The map may be made larger if necessary.
.El
.Pp
That completes the list of panic types you are likely to see.
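.Pp
The unexpected-trap message in the list above packs several fields into
one line, so a small decoder can make it easier to read.
The Python sketch below is only an illustration under stated assumptions:
the names decode_trap and parse_region are hypothetical, the sample
values are invented, and the exact spacing the kernel uses when printing
a region register is assumed rather than taken from the source.
The trap names, the format string, and the split of ``code'' into high
and low 16 bits follow the description in this page.
.Bd -literal -offset indent
```python
import re

# Trap type names, copied from the table in this page.
TRAP_TYPES = {
    0: "bus error",
    1: "address error",
    2: "illegal instruction",
    3: "divide by zero",
    4: "chk instruction",
    5: "trapv instruction",
    6: "privileged instruction",
    7: "trace trap",
    8: "MMU fault",
    9: "simulated software interrupt",
    10: "format error",
    11: "FP coprocessor fault",
    12: "coprocessor fault",
    13: "simulated AST",
}

# Mirrors the format string "trap type %d, code = %x, v = %x".
TRAP_LINE = re.compile(
    "trap type ([0-9]+), code = ([0-9a-fA-F]+), v = ([0-9a-fA-F]+)"
)


def decode_trap(line):
    """Decode a trap panic line into a dict, or return None if it does not match."""
    match = TRAP_LINE.search(line)
    if match is None:
        return None
    trap_type = int(match.group(1))
    code = int(match.group(2), 16)
    return {
        "type": trap_type,
        "name": TRAP_TYPES.get(trap_type, "unknown"),
        "mmu_status": (code >> 16) & 0xFFFF,  # high 16 bits of code
        "ssw": code & 0xFFFF,  # low 16 bits: special status word
        "fault_va": int(match.group(3), 16),
    }


def parse_region(text):
    """Parse a region register value of the form LENGTH@KERNEL_VA (both hex)."""
    length, _, kernel_va = text.partition("@")
    return int(length, 16), int(kernel_va, 16)


# Invented sample values, used only to exercise the functions.
info = decode_trap("trap type 8, code = 40145, v = 1c")
print(info["name"], hex(info["mmu_status"]), hex(info["ssw"]), hex(info["fault_va"]))
print(parse_region("20@ff8000"))
```
.Ed
.Pp
Interpreting the individual bits of the MMU status register and the
special status word still requires <hp300/cpu.h> and the 68020 manual;
the sketch only separates the two halves.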
.Pp
When the system crashes it writes (or at least attempts to write)
an image of memory into the back end of the dump device,
usually the same as the primary swap
area.  After the system is rebooted, the program
.Xr savecore 8
runs and preserves a copy of this core image and the current
system in a specified directory for later perusal.  See
.Xr savecore 8
for details.
.Pp
To analyze a dump you should begin by running
.Xr adb 1
with the
.Fl k
flag on the system load image and core dump.
If the core image is the result of a panic,
the panic message is printed.
Normally the command
``$c''
will provide a stack trace from the point of
the crash and this will provide a clue as to
what went wrong.
For more details consult
.%T "Using ADB to Debug the UNIX Kernel" .
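.Pp
To make the first step concrete, the Python sketch below builds the
argument list for such an adb invocation from the names found in the
saved-dump directory.
It is a sketch under assumptions, not a description of any system tool:
the function name adb_command is hypothetical, and the file names
vmunix.N and vmcore.N are assumed to be what
.Xr savecore 8
leaves behind; check that manual page for the names actually used.
.Bd -literal -offset indent
```python
def adb_command(names):
    """Build an adb -k argument list for the newest kernel/core pair.

    names is a list of file names from the saved-dump directory.
    Returns None when no matching pair is present.
    """
    kernels = {}
    cores = {}
    for name in names:
        stem, dot, suffix = name.rpartition(".")
        if not dot or not suffix.isdecimal():
            continue
        # Assumed naming: vmunix.N is the load image, vmcore.N the dump.
        if stem == "vmunix":
            kernels[int(suffix)] = name
        elif stem == "vmcore":
            cores[int(suffix)] = name
    paired = set(kernels) & set(cores)
    if not paired:
        return None
    newest = max(paired)
    # Load image first, then the core dump, as described above.
    return ["adb", "-k", kernels[newest], cores[newest]]


# Hypothetical directory listing, used only to exercise the function.
listing = ["bounds", "vmunix.0", "vmcore.0", "vmunix.1", "vmcore.1"]
print(adb_command(listing))
```
.Ed
.Pp
Only numbers that have both a load image and a core dump are considered,
since adb needs the two files from the same crash.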
184.Sh SEE ALSO
185.Xr adb 1 ,
186.Xr reboot 8
187.Rs
188.%T "MC68020 32-bit Microprocessor User's Manual"
189.Re
190.Rs
191.%T "Using ADB to Debug the UNIX Kernel
192.Re
193.Rs
194.%T "4.3BSD for the HP300"
195.Re
196.Sh HISTORY
197A
198.Nm
199man page appeared in Version 6 AT&T UNIX.
200