Following are some observations about the the BSD hp300 pmap module that may prove useful for other pmap modules: 1. pmap_remove should be efficient with large, sparsely populated ranges. Profiling of exec/exit intensive work loads showed that much time was being spent in pmap_remove. This was primarily due to calls from exec when deallocating the stack segment. Since the current implementation of the stack is to "lazy allocate" the maximum possible stack size (typically 16-32mb) when the process is created, pmap_remove will be called with a large chunk of largely empty address space. It is important that this routine be able to quickly skip over large chunks of allocated but unpopulated VA space. The hp300 pmap module did check for unpopulated "segments" (which map 4mb chunks) and skipped them fairly efficiently but once it found a valid segment descriptor (STE), it rather clumsily moved forward over the PTEs mapping that segment. Particularly bad was that for every PTE it would recheck that the STE was valid even though we should already know that. pmap_protect can benefit from similar optimizations though it is (currently) not called with large regions. Another solution would be to change the way stack allocation is done (i.e. don't preallocate the entire address range) but I think it is important to be able to efficiently support such large, spare ranges that might show up in other applications (e.g. a randomly accessed large mapped file). 2. Bit operations (i.e. ~,&,|) are more efficient than bitfields. This is a 68k/gcc issue, but if you are trying to squeeze out maximum performance... 3. Don't flush TLB/caches for inactive mappings. On the hp300 the TLBs are either designed as, or used in such a way that, they are flushed on every context switch (i.e. there are no "process tags") Hence, doing TLB flushes on mappings that aren't associated with either the kernel or the currently running process are a waste. Seems pretty obvious but I missed it for many years. An analogous argument applies to flushing untagged virtually addressed caches (ala the 320/350).