$NetBSD: Debug.tips,v 1.3 2001/07/22 13:34:04 wiz Exp $

NOTE: this description applies to the hp300 system with the old BSD
virtual memory system. It has not been updated to reflect the new,
Mach-derived VM system, but should still be useful.
The new system has no fixed-address "u.", but has a fixed mapping
for the kernel stack at 0xfff00000.

--------------------------------------------------------------------------

Some quick notes on the HPBSD VM layout and kernel debugging.

Physical memory:

Physical memory always ends at the top of the 32 bit address space; i.e. the
last addressable byte is at 0xFFFFFFFF. Hence, the start of physical memory
varies depending on how much memory is installed. The kernel variable "lowram"
contains the starting location of memory as provided by the ROM.

The low 128k (I think) of the physical address space is occupied by the ROM.
This is accessible via /dev/mem *only* if the kernel is compiled with DEBUG.
[ Maybe it should always be accessible? ]

Virtual address spaces:

The hardware page size is 4096 bytes. The hardware uses a two-level lookup.
At the highest level is a one page segment table which maps a page table which
maps the address space. Each 4 byte segment table entry (described in
hp300/pte.h) contains the page number of a single page of 4 byte page table
entries. Each PTE maps a single page of address space. Hence, each STE maps
4Mb of address space and one page containing 1024 STEs is adequate to map the
entire 4Gb address space.

Both page and segment table entries look similar. Both have the page frame
in the upper part and control bits in the lower. This is the opposite of
the VAX. It is easy to convert the page frame number in an STE/PTE to a
physical address: simply mask out the low 12 bits. For example,
if a PTE contains 0xFF880019, the physical memory location mapped starts at
0xFF880000.
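
The index and frame arithmetic described above can be sketched in a few
lines of Python. This is an illustration only, not kernel code: the names
split_va and frame_address are made up for this note, and the bit positions
are derived from the 4096 byte page size and 1024 entries per table page
stated above.

```python
PAGE_SIZE = 4096                             # hardware page size in bytes
ENTRY_SIZE = 4                               # one STE or PTE, in bytes
ENTRIES_PER_PAGE = PAGE_SIZE // ENTRY_SIZE   # 1024 entries per table page


def split_va(va):
    """Split a 32 bit VA into (STE index, PTE index, byte offset).

    Hypothetical helper: 10 bits select the STE, 10 bits select the PTE
    within the page table page, and 12 bits are the offset in the page.
    """
    ste_index = (va >> 22) & 0x3FF
    pte_index = (va >> 12) & 0x3FF
    offset = va & 0xFFF
    return ste_index, pte_index, offset


def frame_address(entry):
    """Physical address of the page named by an STE or PTE.

    Masks out the low 12 control bits, as described in the text.
    """
    return entry & 0xFFFFF000


if __name__ == "__main__":
    print(hex(frame_address(0xFF880019)))   # 0xff880000
    print(split_va(0x00401234))             # (1, 1, 564)
```

Multiplying an STE or PTE index by 4 gives the byte offset of that entry
within its table page, which is the form the offsets take in the rest of
these notes.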

Kernel address space:

The kernel resides in its own virtual address space independent of all user
processes. When the processor is in supervisor mode (i.e. interrupt or
exception handling) it uses the kernel virtual mapping. The kernel segment
table is called Sysseg and is allocated statically in hp300/locore.s. The
kernel page table is called Systab, is also allocated statically in
hp300/locore.s, and consists of the usual assortment of SYSMAPs.
The size of Systab (Syssize) depends on the configured size of the various
maps but as currently configured is 9216 PTEs. Both segment and page tables
are initialized at bootup in hp300/locore.s. The segment table never changes
(except for bits maintained by the hardware). Portions of the page table
change as needed. The kernel is mapped into the address space starting at 0.

Theoretically, any address in the range 0 to Syssize * 4096 (0x2400000 as
currently configured) is valid. However, certain addresses are more common
in dumps than others. Those are (for the current configuration):

	0 - 0x800000		kernel text and permanent data structures
	0x917000 - 0x91a000	u-area; 1st page is user struct, last k-stack
	0x1b1b000 - 0x2400000	user page tables, also kmem_alloc()ed data

User address space:

The user text and data are loaded starting at VA 0. The user's stack starts
at 0xFFF00000 and grows toward lower addresses. The pages above the user
stack are used by the kernel. From 0xFFF00000 to 0xFFF03000 is the u-area.
The 3 PTEs for this range map (read-only) the same memory as does 0x917000
to 0x91a000 in the kernel address space. This address range is never used
by the kernel, but exists for utilities that assume that the u-area sits
above the user stack. The pages from 0xFFF03000 up are not used. They
exist so that the user stack is in the same location as in HPUX.

The user segment table is allocated along with the page tables from Usrptmap.
They are contiguous in kernel VA space with the page tables coming before
the segment table. Hence, a process has p_szpt+1 pages allocated starting
at kernel VA p_p0br.

The user segment table is typically very sparse since each entry maps 4Mb.
There are usually only two valid STEs, one at the start mapping the text/data
portion of the page table, and one at the end mapping the stack/u-area. For
example, if the segment table was at 0xFFFFA000 there would be valid entries
at 0xFFFFA000 and 0xFFFFAFFC.

Random notes:

An important thing to note is that there are no hardware length registers
on the HP. This implies that we cannot "pack" data and stack PTEs into the
same page table page. Hence, every user page table has at least 2 pages
(3 if you count the segment table).

The HP maintains the p0br/p0lr and p1br/p1lr PCB fields the same as the
VAX even though they have no meaning to the hardware. This also keeps many
utilities happy.

There is no separate interrupt stack (right now) on the HPs. Interrupt
processing is handled on the kernel stack of the "current" process.

Following is a list of things you might want to be able to do with a kernel
core dump. One thing you should always have is a ps listing from the core
file. Just do:

	ps klaw vmunix.? vmcore.?

Exception related panics (i.e. those detected in hp300/trap.c) will dump
out various useful information before panicking. If available, you should
get this out of the /usr/adm/messages file. Finally, you should be in adb:

	adb -k vmunix.? vmcore.?

Adb -k will allow you to examine the kernel address space more easily.
It automatically maps kernel VAs in the range 0 to 0x2400000 to physical
addresses. Since the kernel and user address spaces overlap (i.e. both
start at 0), adb can't let you examine the address space of the "current"
process as it does on the VAX.
--------

1. Find out what the current process was at the time of the crash:

If you have the dump info from /usr/adm/messages, it should contain the
PID of the active process. If you don't have this info you can just look
at location "Umap". This is the PTE for the first page of the u-area; i.e.
the user structure. Forget about the last 3 hex digits and compare the top
5 to the ADDR column in the ps listing.

2. Locating a process' user structure:

Get the ADDR field of the desired process from the ps listing. This is the
page frame number of the process' user structure. Tack 3 zeros on to the
end to get the physical address. Note that this doesn't give you the kernel
stack since it is in a different page than the user-structure and pages of
the u-area are not physically contiguous.

3. Locating a process' proc structure:

First find the process' user structure as described above. Find the u_procp
field at offset 0x200 from the beginning. This gives you the kernel VA of
the proc structure.

4. Locating a process' page table:

First find the process' user structure as described above. The first part
of the user structure is the PCB. The second longword (third field) of the
PCB is pcb_ustp, a pointer to the user segment table. This pointer is
actually the page frame number. Again adding 3 zeros yields the physical
address. You can now use the values in the segment table to locate the
page tables. For example, to locate the first page of the text/data part
of the page table, use the first STE (longword) in the segment table.

5. Locating a process' kernel stack:

First find the process' page table as described above. The kernel stack
is near the end of the user address space. So, locate the last entry in the
user segment table (base+0xFFC) and use that entry to find the last page of
the user page table.
Look at the last 256 entries of this page (pagebase+0xC00). The first is
the PTE for the user-structure. The second was intended to be a read-only
page to protect the user structure from the kernel stack. Currently it is
read/write and actually allocated. Hence it can wind up being a second page
for the kernel stack. The third is the kernel stack. The last 253 should
be zero. Hence, indirecting through the third of these last 256 PTEs will
give you the kernel stack page.

An alternate way to do this is to use the p_addr field of the proc structure
which is found as described above. The p_addr field is at offset 0x10 in the
proc structure and points to the first of the PTEs mentioned above (i.e. the
user structure PTE).

6. Interpreting the info in a "trap type N..." panic:

As mentioned, when the kernel crashes out of hp300/trap.c it will dump some
useful information. This dates back to the days when I was debugging the
exception handling code and had no kernel adb or even kernel crash dump code.
"trap type" (decimal) is as defined in hp300/trap.h; it doesn't really
correlate with anything useful. "code" (hex) is only useful for MMU
(trap type 8) errors. It is the concatenation of the MMU status register
(see hp300/cpu.h) in the high 16 bits and the 68020 special status word
(see the 020 manual page 6-17) in the low 16. "v" (hex) is the virtual
address which caused the fault. "pid" (decimal) is the ID of the process
running at the time of the exception. Note that if we panic in an interrupt
routine, this process may not be related to the panic. "ps" (hex) is the
value of the 68020 status register (see page 1-4 of 020 manual) at the time
of the crash. If the 0x2000 bit is on, we were in supervisor (kernel) mode
at the time, otherwise we were in user mode. "pc" (hex) is the value of the
PC saved on the hardware exception frame.
It may *not* be the PC of the instruction causing the fault (see the 020
manual for details). The 0x2000 bit of "ps" dictates whether this is a
kernel or user VA. "sfc" and "dfc" are the 68020 source/destination
function codes. They should always be one.
"p0" and "p1" are the VAX-like region registers. They are of the form:

	<length> '@' <kernel VA>

where both are in hex. Following these values is a dump of the processor
registers (hex). Check the address registers for values close to "v", the
fault address. Most faults are caused by dereferences of bogus pointers.
Most such dereferences are the result of 020 instructions using the:

	<address-register> '@' '(' offset ')'

addressing mode. This can help you track down the faulting instruction (since
the PC may not point to it). Note that the value of a7 (the stack pointer) is
ALWAYS the user SP. This is brain-dead I know. Finally, there is a dump of
the stack (user/kernel) at the time of the offense. Before kernel crash dumps,
this was very useful.

7. Converting a kernel virtual address to a physical address:

Adb -k already does this for you, but sometimes you want to know what the
resulting physical address is rather than what is there. Doing this is
simply a matter of indexing into the kernel page table. In theory we would
first have to do a lookup in the kernel segment table, but we know that the
kernel page table is physically contiguous so this isn't necessary. The
base of the system page table is "Sysmap", so to convert an address V just
divide the address by 4096 to get the page number, multiply that by 4 (the
size of a PTE in bytes) to get a byte offset, and add that to "Sysmap".
This gives you the address of the PTE mapping V. You can then get the
physical address by masking out the low 12 bits of the contents of that PTE.
To wit:

	*(Sysmap+(VA%1000*4))&fffff000

where VA is the virtual address in question. (In adb, numbers are hex by
default and "%" is integer division.)

This technique should also work for user virtual addresses if you replace
"Sysmap" with the value of the appropriate process' P0BR. This works
because a user's page table is *virtually* contiguous in the kernel
starting at P0BR, and adb will handle translating the kernel virtual addresses
for you.
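
The arithmetic behind the adb expression above can be checked by hand with
a short Python sketch. It is only an illustration under stated assumptions:
the function names are made up for this note, the table base 0x200000 is an
invented value standing in for whatever "Sysmap" (or a process' P0BR) is in
your dump, and the PTE contents 0xFF880019 are borrowed from the example
earlier in these notes. Adding the page offset back in is a convenience
that goes one step beyond the adb expression, which yields the page base.

```python
PAGE_SIZE = 0x1000     # 4096 byte pages
PTE_SIZE = 4           # bytes per page table entry


def pte_address(table_base, va):
    """Address of the PTE that maps va.

    table_base is the address of the first PTE in a contiguous page
    table, e.g. the value of Sysmap (assumed, read it from the dump).
    """
    return table_base + (va // PAGE_SIZE) * PTE_SIZE


def page_base(pte):
    """Physical page base: the PTE contents with the low 12 bits masked."""
    return pte & 0xFFFFF000


def physical_address(pte, va):
    """Page base plus the offset of va within its page (an extension)."""
    return page_base(pte) | (va & (PAGE_SIZE - 1))


if __name__ == "__main__":
    sysmap = 0x200000                           # made-up base for illustration
    print(hex(pte_address(sysmap, 0x917000)))   # 0x20245c
    print(hex(page_base(0xFF880019)))           # 0xff880000
    print(hex(physical_address(0xFF880019, 0x917004)))   # 0xff880004
```

In practice you would compute the PTE address this way, read the longword
at that address from the dump with adb, and then mask it as shown.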