1. Create and use an interrupt stack.
   Well actually, use the master SP for kernel stacks instead of
   the interrupt SP.  Right now we use the interrupt stack for
   everything.  A separate interrupt stack allows for more accurate
   accounting of systime.  In theory, it could also allow for smaller
   kernel stacks but we only use one page anyway.

2. Copy/clear primitives could be tuned.
   What is best is highly CPU and cache dependent.  One thing to look
   at is the copyin/copyout primitives.  Rather than looping using
   MOVS instructions, you could map an entire page at a time and use
   bcopy, MOVE16, or whatever.  This would lose big on the VAC models
   however.

3. Sendsig/sigreturn are pretty bogus.
   Currently we can call a signal handler even if an exception
   occurs in the middle of an instruction.  This causes the handler
   to return right back to the middle of the offending instruction
   which will most likely lead to another exception/signal.
   Technically, I feel this is the correct behavior but it requires
   saving a lot of state on the user's stack, state that we don't
   really want the user messing with.  Other 68k implementations
   (e.g. Sun) will delay signals or abort execution of the current
   instruction to reduce saved state.  Even if we stick with the
   current philosophy, the code could be cleaned up.

4. Ditto for AST and software interrupt emulation.
   Both are possibly over-elaborate and inefficiently implemented.
   We could possibly handle them by using an appropriately planted
   PS trace bit.
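
   A rough sketch of the page-at-a-time idea from item 2, written as
   plain user-level C rather than kernel code.  The names
   copy_one_chunk() and copy_by_page() are hypothetical, the 4096-byte
   page size is an assumption, and memcpy stands in for "map the page,
   then bcopy/MOVE16 it".  It only shows how a transfer splits at
   source page boundaries so that each chunk could be served by one
   temporary mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u		/* assumption: 4KB pages */

/* Number of chunks copied so far; only here to make the split visible. */
static unsigned chunk_calls;

/*
 * Hypothetical stand-in for "map one source page and copy from it".
 * In a kernel this would install a temporary mapping first; here it
 * is just memcpy.
 */
static void
copy_one_chunk(void *dst, const void *src, size_t len)
{
	assert(len > 0 && len <= PAGE_SIZE);
	memcpy(dst, src, len);
	chunk_calls++;
}

/*
 * Copy len bytes, splitting at source page boundaries so that every
 * chunk lies entirely within one source page.
 */
static void
copy_by_page(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len > 0) {
		size_t off = (size_t)((uintptr_t)s & (PAGE_SIZE - 1));
		size_t n = PAGE_SIZE - off;	/* bytes left in this page */

		if (n > len)
			n = len;
		copy_one_chunk(d, s, n);
		d += n;
		s += n;
		len -= n;
	}
}
```

   Copying 5000 bytes from a source that starts 4000 bytes into a page
   takes three chunks (96, 4096 and 808 bytes).  Whether this beats a
   MOVS loop is, as noted in item 2, CPU and cache dependent.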

5. Make use of transparent translation registers on 030/040 MMU.
   With a little rearranging of the KVA space we could use one to
   map the entire external IO space [ 600000 - 20000000 ).  Since
   the translation must be 1-1, this would limit the kernel to 6MB
   (some would say that is hardly a limit) or divide it into two
   pieces.  Another promising use would be to map physical memory
   within the kernel.  This allows a much simpler and more efficient
   implementation of /dev/mem, pmap_zero_page, pmap_copy_page and
   possibly even kernel-user cross address space copies.  However,
   it does eat up a significant piece of kernel address space.

6. Create a 32-bit timer.
   Timers 2 and 3 on the MC6840 clock chip can be concatenated together to
   get a 32-bit countdown timer.  There are at least three uses for this:
   1. Monitoring the interval timer ("clock") to detect lost "ticks".
      (Idea from Scott Marovich)
   2. Implement the DELAY macro properly instead of approximating with
      the current "while (--count);" loop.  Because of caches, the current
      method is potentially way off.
   3. Export as a user-mappable timer for high-precision (4us) timing.
   Note that by doing this we can no longer use timer 3 as a separate
   statistics/profiling timer.  We should be able to select between the
   two at compile time (or runtime?).

7. Conditional MMU code should be restructured.
   Right now it reflects the evolutionary path of the code: 320/350 MMU
   was supported and PMMU support was glued on.
   The latter can be ifdef'ed out when not needed, but not all of the
   former (e.g. ``mmutype'' tests).  Also, PMMU is made to look like the
   HP MMU, somewhat hamstringing it.  Since HP MMU models are dead, the
   excess baggage should be on the HP MMU side (though it could be argued
   that they benefit more from the minor performance impact).  MMU code
   should probably not be ifdef'ed on model type, but rather on more
   relevant tags (e.g. MMU_HP, MMU_MOTO).

8. Redo cache handling.
   There are way too many routines which are specific to particular
   cache types.  We should be able to come up with a more coherent
   scheme (though HP 68k boxes have just about every caching scheme
   imaginable: internal/external, physical/virtual, writeback/writethrough).
   See, for example, Wheeler and Bershad in ASPLOS 92.  For more efficient
   handling of physical caches see also Kessler and Hill in Nov. 92 TOCS.

9. Sort the free page list.
   The DMA hardware on the 300 cannot do scatter/gather IO.  For example,
   if an 8k system buffer consists of two non-contiguous physical pages
   it will require two DMA transfers (and hence two interrupts) to do the
   operation.  It would take only one transfer if they were physically
   contiguous.  By keeping the free list ordered we could potentially
   allocate contiguous pages and reduce the number of interrupts.  We can
   consider doing this since pages in the free list are not reclaimed and
   thus we don't have to worry about distorting any LRU behavior.
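
   A minimal sketch of the ordered free list from item 9, as plain C
   rather than actual VM code.  Everything here is an assumption made
   for illustration: the array representation, the names page_free()
   and page_alloc_contig(), and the MAXFREE and NOPAGE constants.  It
   keeps free page frame numbers sorted so that a run of physically
   consecutive frames can be found with one linear scan.

```c
#include <assert.h>
#include <stddef.h>

#define MAXFREE	64			/* arbitrary capacity for the sketch */
#define NOPAGE	((unsigned long)-1)	/* "no suitable run found" */

/* Free page frame numbers, kept in ascending order. */
static unsigned long freelist[MAXFREE];
static size_t nfree;

/* Return a frame to the list, keeping the list sorted. */
static void
page_free(unsigned long pfn)
{
	size_t i = nfree;

	assert(nfree < MAXFREE);
	while (i > 0 && freelist[i - 1] > pfn) {
		freelist[i] = freelist[i - 1];
		i--;
	}
	freelist[i] = pfn;
	nfree++;
}

/*
 * Remove `want' physically consecutive frames from the list and
 * return the first one, or NOPAGE if no such run exists.
 */
static unsigned long
page_alloc_contig(size_t want)
{
	size_t run = 0, i, j, first;
	unsigned long pfn;

	if (want == 0 || want > nfree)
		return NOPAGE;
	for (i = 0; i < nfree; i++) {
		/* Extend the current run, or start a new one at a gap. */
		if (i > 0 && freelist[i] == freelist[i - 1] + 1)
			run++;
		else
			run = 1;
		if (run >= want) {
			first = i + 1 - want;
			pfn = freelist[first];
			/* Close the hole left by the removed frames. */
			for (j = first; j + want < nfree; j++)
				freelist[j] = freelist[j + want];
			nfree -= want;
			return pfn;
		}
	}
	return NOPAGE;
}
```

   With frames 7, 3, 4 and 9 freed in that order, a request for two
   contiguous frames returns frame 3 (covering 3 and 4), and a second
   such request fails because 7 and 9 are not adjacent.  A caller would
   then fall back to separate pages and accept two DMA transfers.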
----
Mike Hibler
University of Utah CSS group
mike@cs.utah.edu