Following are some observations about the BSD hp300 pmap module that
may prove useful for other pmap modules:

1. pmap_remove should be efficient with large, sparsely populated ranges.

   Profiling of exec/exit intensive workloads showed that much time was
   being spent in pmap_remove.  This was primarily due to calls from exec
   when deallocating the stack segment.  Since the current implementation
   of the stack is to "lazy allocate" the maximum possible stack size
   (typically 16-32MB) when the process is created, pmap_remove will be
   called with a large chunk of largely empty address space.  It is
   important that this routine be able to quickly skip over large chunks
   of allocated but unpopulated VA space.  The hp300 pmap module did check
   for unpopulated "segments" (which map 4MB chunks) and skipped them fairly
   efficiently, but once it found a valid segment descriptor (STE), it rather
   clumsily moved forward over the PTEs mapping that segment.  Particularly
   bad was that for every PTE it would recheck that the STE was valid even
   though we should already know that.

   pmap_protect can benefit from similar optimizations though it is
   (currently) not called with large regions.

   Another solution would be to change the way stack allocation is done
   (i.e. don't preallocate the entire address range) but I think it is
   important to be able to efficiently support such large, sparse ranges
   that might show up in other applications (e.g. a randomly accessed
   large mapped file).

2. Bit operations (i.e. ~,&,|) are more efficient than bitfields.

   This is a 68k/gcc issue, but if you are trying to squeeze out maximum
   performance...

3. Don't flush TLB/caches for inactive mappings.

   On the hp300 the TLBs are either designed so that, or used in such a
   way that, they are flushed on every context switch (i.e. there are no
   "process tags").  Hence, doing TLB flushes on mappings that aren't
   associated with either the kernel or the currently running process is
   a waste.  Seems pretty obvious but I missed it for many years.  An
   analogous argument applies to flushing untagged virtually addressed
   caches (a la the 320/350).
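
As an illustration of point 1, the following is a minimal sketch of the
skipping strategy, not the hp300 code itself.  Everything in it is a
hypothetical toy: the names (toy_pmap, toy_pmap_remove, toy_demo), the
flat array of page-table pointers standing in for the segment table, and
the 4KB page size are assumptions made for the example; only the 4MB
segment size comes from the notes above.  An empty segment is skipped in
one step, and a valid STE is checked once per segment rather than once
per PTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed toy layout: 4KB pages, 4MB segments, 32-bit VA space. */
#define PG_SHIFT      12
#define SEG_SHIFT     22
#define PG_SIZE       (1u << PG_SHIFT)
#define SEG_SIZE      (1u << SEG_SHIFT)
#define PTES_PER_SEG  (1u << (SEG_SHIFT - PG_SHIFT))
#define NSEG          (1u << (32 - SEG_SHIFT))
#define PG_V          0x1u              /* hypothetical "valid" bit */

typedef uint32_t pte_t;

/* Stand-in for a segment table: a NULL entry plays the invalid STE. */
struct toy_pmap {
    pte_t *seg[NSEG];
};

/*
 * Invalidate all valid PTEs in [sva, eva).  Returns the number of
 * mappings removed; *visited counts the PTEs actually examined.
 */
size_t
toy_pmap_remove(struct toy_pmap *pm, uint32_t sva, uint32_t eva,
    size_t *visited)
{
    size_t removed = 0;
    uint64_t va = sva, end = eva;   /* 64-bit so va cannot wrap */

    assert(sva <= eva);
    assert((sva & (PG_SIZE - 1)) == 0);

    while (va < end) {
        /* End of the current segment, clipped to the range. */
        uint64_t seg_end = (va | (SEG_SIZE - 1)) + 1;
        pte_t *pt, *pte;

        if (seg_end > end)
            seg_end = end;

        pt = pm->seg[va >> SEG_SHIFT];
        if (pt == NULL) {
            /* Unpopulated segment: skip all of it at once. */
            va = seg_end;
            continue;
        }

        /* STE known valid from here on; do not recheck per PTE. */
        pte = &pt[(va >> PG_SHIFT) & (PTES_PER_SEG - 1)];
        for (; va < seg_end; va += PG_SIZE, pte++) {
            (*visited)++;
            if (*pte & PG_V) {
                *pte = 0;
                removed++;
            }
        }
    }
    return removed;
}

/*
 * Mimic the stack case: a 32MB range (8 segments) in which only the
 * last segment has a page table, holding two valid pages.
 */
size_t
toy_demo(size_t *visited)
{
    static struct toy_pmap pm;
    static pte_t pt[PTES_PER_SEG];
    const uint32_t base = 0x40000000u;  /* segment aligned */

    memset(&pm, 0, sizeof pm);
    memset(pt, 0, sizeof pt);
    pm.seg[(base >> SEG_SHIFT) + 7] = pt;
    pt[PTES_PER_SEG - 2] = PG_V;
    pt[PTES_PER_SEG - 1] = PG_V;

    *visited = 0;
    return toy_pmap_remove(&pm, base, base + 8 * SEG_SIZE, visited);
}
```

In toy_demo the seven empty segments each cost one table lookup, so
only the PTEs of the single populated segment are examined.  A real
pmap module would also have to handle wired mappings, PV lists and
TLB/cache flushing, none of which is modelled here.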