Following are some observations about the BSD hp300 pmap module that
may prove useful for other pmap modules:

1. pmap_remove should be efficient with large, sparsely populated ranges.

   Profiling of exec/exit intensive workloads showed that much time was
   being spent in pmap_remove.  This was primarily due to calls from exec
   when deallocating the stack segment.  Since the current implementation
   of the stack is to "lazy allocate" the maximum possible stack size
   (typically 16-32MB) when the process is created, pmap_remove will be
   called with a large chunk of largely empty address space.  It is
   important that this routine be able to quickly skip over large chunks
   of allocated but unpopulated VA space.  The hp300 pmap module did check
   for unpopulated "segments" (which map 4MB chunks) and skipped them fairly
   efficiently, but once it found a valid segment descriptor (STE), it rather
   clumsily moved forward over the PTEs mapping that segment.  Particularly
   bad was that for every PTE it would recheck that the STE was valid, even
   though that was already known.

   pmap_protect can benefit from similar optimizations, though it is
   (currently) not called with large regions.

   Another solution would be to change the way stack allocation is done
   (i.e. don't preallocate the entire address range), but I think it is
   important to be able to efficiently support such large, sparse ranges
   that might show up in other applications (e.g. a randomly accessed
   large mapped file).
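
   The skipping strategy described above can be sketched as follows.  This
   is a minimal, self-contained toy model and not the hp300 code: the names
   segtab, toy_remove, and ste_checks, the PG_V bit value, the 4KB page
   size, and the 16-segment address space are all assumptions made for
   illustration.  The point is that the segment table is consulted once per
   4MB segment, and an unpopulated segment is stepped over in a single
   iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12                       /* 4KB pages (assumed) */
#define PAGE_SIZE    (1u << PAGE_SHIFT)
#define SEG_SHIFT    22                       /* one segment maps 4MB */
#define SEG_SIZE     (1u << SEG_SHIFT)
#define PTES_PER_SEG (SEG_SIZE / PAGE_SIZE)   /* 1024 PTEs per segment */
#define NSEGS        16                       /* toy 64MB address space */
#define PG_V         0x1u                     /* hypothetical "valid" bit */

/* Toy segment table: a NULL entry stands for an invalid STE. */
static uint32_t *segtab[NSEGS];

/* Counts segment-table lookups so the cost can be observed. */
static unsigned ste_checks;

/* Invalidate all valid PTEs in [sva, eva); returns how many were removed. */
static unsigned toy_remove(uint32_t sva, uint32_t eva)
{
    unsigned removed = 0;
    uint32_t va = sva;

    assert(sva <= eva && eva <= NSEGS * SEG_SIZE);
    assert((sva & (PAGE_SIZE - 1)) == 0);

    while (va < eva) {
        /* End of the segment containing va, clipped to the range. */
        uint32_t next = (va & ~(SEG_SIZE - 1)) + SEG_SIZE;
        if (next > eva)
            next = eva;

        /* Check the STE once per segment, not once per PTE. */
        uint32_t *ptes = segtab[va >> SEG_SHIFT];
        ste_checks++;

        if (ptes != NULL) {
            for (; va < next; va += PAGE_SIZE) {
                uint32_t *pte = &ptes[(va >> PAGE_SHIFT) & (PTES_PER_SEG - 1)];
                if (*pte & PG_V) {
                    *pte = 0;
                    removed++;
                }
            }
        }
        va = next;   /* an unpopulated segment is skipped in one step */
    }
    return removed;
}
```

   With one populated segment out of 16, removing the whole 64MB range
   costs 16 segment-table lookups and one 1024-entry PTE scan, rather than
   a validity check for each of the 16384 pages.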

2. Bit operations (i.e. ~,&,|) are more efficient than bitfields.

   This is a 68k/gcc issue, but if you are trying to squeeze out maximum
   performance...
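
   As an illustration of the two styles, here is a small sketch.  The PTE
   layout, the struct pte_bits type, and the helper names are hypothetical
   and chosen only for this example.  With bitfields, a compiler may emit
   a separate read-modify-write for each field touched; with masks, several
   bits can be tested or cleared in one operation on the whole word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PTE bit layout, for illustration only. */
#define PG_V     0x00000001u   /* valid */
#define PG_RO    0x00000004u   /* read-only */
#define PG_U     0x00000008u   /* used (referenced) */
#define PG_M     0x00000010u   /* modified */
#define PG_FRAME 0xfffff000u   /* page frame address */

/* Bitfield style: each field is read and written individually. */
struct pte_bits {
    unsigned int pg_v  : 1;
    unsigned int pg_ro : 1;
    unsigned int pg_u  : 1;
    unsigned int pg_m  : 1;
};

static void clear_um_bitfield(struct pte_bits *p)
{
    p->pg_u = 0;   /* one field update ... */
    p->pg_m = 0;   /* ... and then another */
}

/* Mask style: build a PTE from a page-aligned address and flags. */
static uint32_t pte_make(uint32_t pa, uint32_t prot)
{
    assert((pa & ~PG_FRAME) == 0);
    return (pa & PG_FRAME) | prot | PG_V;
}

/* Mask style: clear both status bits with a single AND. */
static uint32_t pte_clear_um(uint32_t pte)
{
    return pte & ~(PG_U | PG_M);
}

/* Mask style: test two bits with a single AND and compare. */
static int pte_is_valid_dirty(uint32_t pte)
{
    return (pte & (PG_V | PG_M)) == (PG_V | PG_M);
}
```

   Bitfield layout is also implementation-defined in C, so the mask form
   has the side benefit of matching a hardware format explicitly.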

3. Don't flush TLB/caches for inactive mappings.

   On the hp300 the TLBs are either designed as, or used in such a way that,
   they are flushed on every context switch (i.e. there are no "process
   tags").  Hence, doing TLB flushes on mappings that aren't associated with
   either the kernel or the currently running process is a waste.  Seems
   pretty obvious but I missed it for many years.  An analogous argument
   applies to flushing untagged virtually addressed caches (a la the 320/350).
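
   The check this implies can be sketched as below.  This is an
   illustrative model rather than the hp300 source: struct pmap here is a
   stand-in, and curpmap, pmap_is_active, pmap_changed_mapping, and the
   counting tlb_flush_page stub are hypothetical names.  It assumes an
   untagged TLB that is flushed on every context switch, as described
   above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pmap { int id; };   /* stand-in for the real pmap structure */

static struct pmap kernel_pmap_store;
static struct pmap *const kernel_pmap = &kernel_pmap_store;
static struct pmap *curpmap;   /* pmap of the running process (assumed) */

/* Stub that counts flushes instead of touching hardware. */
static unsigned tlb_flushes;

static void tlb_flush_page(unsigned long va)
{
    (void)va;
    tlb_flushes++;
}

/* Only the kernel pmap and the current process pmap can have live
 * entries in a TLB that is flushed on every context switch. */
static bool pmap_is_active(const struct pmap *pm)
{
    return pm == kernel_pmap || pm == curpmap;
}

/* Called after a mapping at va in pm has been changed or removed. */
static void pmap_changed_mapping(struct pmap *pm, unsigned long va)
{
    assert(pm != NULL);
    if (pmap_is_active(pm))
        tlb_flush_page(va);
    /* Otherwise nothing to do: an inactive pmap has no TLB entries. */
}
```

   The same test could guard flushes of an untagged virtually addressed
   cache.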