1\" @(#)postp.me 1.14 03/15/82 2.EQ 3delim ## 4gsize 11 5.EN 6.sh 1 "Post Processing" 7.pp 8Having gathered the arcs of the call graph and timing information 9for an execution of the program, 10we are interested in attributing the time for each routine to the 11routines that call it. 12We build a dynamic call graph with arcs from caller to callee, 13and propagate time from descendants to ancestors 14by topologically sorting the call graph. 15Time propagation is performed from the leaves of the 16call graph toward the roots, according to the order 17assigned by a topological numbering algorithm. 18The topological numbering ensures that 19all edges in the graph go from higher numbered nodes to lower 20numbered nodes. 21An example is given in Figure 1. 22If we propagate time from nodes in the 23order assigned by the algorithm, 24execution time can be propagated from descendants to ancestors 25after a single traversal of each arc in the call graph. 26Each parent receives some fraction of a child's time. 27Thus time is charged to the 28caller in addition to being charged to the callee. 29.(z C 30.TS 31center; 32c c c c c. 338 9 34 35 36 3 7 37 38 392 5 6 40 41 42 1 4 43 44 45.TE 46.ce 2 47Topological ordering 48Figure 1. 49.ce 0 50.)z 51.pp 52Let #C sub e# be the number of calls to some routine, 53#e#, and #C sub e sup r# be the number of 54calls from a caller #r# to a callee #e#. 55Since we are assuming each call to a routine takes the 56average amount of time for all calls to that routine, 57the caller is accountable for 58#C sub e sup r / C sub e# 59of the time spent by the callee. 60Let the #S sub e# be the #selftime# of a routine, #e#. 61The selftime of a routine can be determined from the 62timing information gathered during profiled program execution. 63The total time, #T sub r#, we wish to account to a routine 64#r#, is then given by the recurrence equation: 65.EQ 66T sub r ~ = ~ {S sub r} ~ + ~ 67 sum from {r ~ roman CALLS ~ e} 68 {T sub e times {{C sub e sup r} over {C sub e}}} 69.EN 70where #r ~ roman CALLS ~ e# is a relation showing all routines 71#e# called by a routine #r#. 72This relation is easily available from the call graph. 73.pp 74However, if the execution contains recursive calls, 75the call graph has cycles that 76cannot be topologically sorted. 77In these cases, we discover strongly-connected 78components in the call graph, 79treat each such component as a single node, 80and then sort the resulting graph. 81We use a variation of Tarjan's strongly-connected 82components algorithm 83that discovers strongly-connected components as it is assigning 84topological order numbers [Tarjan72]. 85.pp 86Time propagation within strongly connected 87components is a problem. 88For example, a self-recursive routine 89(a trivial cycle in the call graph) 90is accountable for all the time it 91uses in all its recursive instantiations. 92In our scheme, this time should be 93shared among its call graph parents. 94The arcs from a routine to itself are of interest, 95but do not participate in time propagation. 96Thus the simple equation for time propagation 97does not work within strongly connected components. 98Time is not propagated from one member of a cycle to another, 99since, by definition, this involves propagating time from a routine 100to itself. 101In addition, children of one member of a cycle 102must be considered children of all members of the cycle. 103Similarly, parents of one member of the cycle must inherit 104all members of the cycle as descendants. 105It is for these reasons that we collapse connected components. 106Our solution collects all members of a cycle together, 107summing the time and call counts for all members. 108All calls into the cycle are made to share the total 109time of the cycle, and all descendants of the cycle 110propagate time into the cycle as a whole. 111Calls among the members of the cycle 112do not propagate any time, 113though they are listed in the call graph profile. 114.pp 115Figure 2 shows a modified version of the call graph of Figure 1, 116in which the nodes labelled 3 and 7 in Figure 1 are mutually 117recursive. 118The topologically sorted graph after the cycle is collapsed is 119given in Figure 3. 120.(z C 121.TS 122center; 123c c c c c. 124o o 125 126 127 \(bu \(bu 128 129 130o o o 131 132 133 o o 134 135 136.TE 137.ce 2 138Cycle to be collapsed. 139Figure 2. 140.ce 0 141.)z 142.(z C 143.TS 144center; 145c c c c c. 1467 8 147 148 149 \fI6\fP \fI6\fP 150 151 1522 4 5 153 154 155 1 3 156 157 158.TE 159.ce 2 160Topological numbering after cycle collapsing. 161Figure 3. 162.ce 0 163.)z 164.pp 165Since the technique described above only collects the 166dynamic call graph, 167and the program typically does not call every routine 168on each execution, 169different executions can introduce different cycles in the 170dynamic call graph. 171Since cycles often have a significant effect on time propagation, 172it is desirable to incorporate the static call graph so that cycles 173will have the same members regardless of how the program runs. 174.pp 175The static call graph can be constructed from the source text 176of the program. 177However, discovering the static call graph from the source text 178would require two moderately difficult steps: 179finding the source text for the program 180(which may not be available), 181and scanning and parsing that text, 182which may be in any one of several languages. 183.pp 184In our programming system, 185the static calling information is also contained in the 186executable version of the program, 187which we already have available, 188and which is in language-independent form. 189One can examine the instructions 190in the object program, 191looking for calls to routines, and note which 192routines can be called. 193This technique allows us to add arcs to those already in the 194dynamic call graph. 195If a statically discovered arc already exists in the dynamic call 196graph, no action is required. 197Statically discovered arcs that do not exist in the dynamic call 198graph are added to the graph with a traversal count of zero. 199Thus they are never responsible for any time propagation. 200However, they may affect the structure of the graph. 201Since they may complete strongly connected components, 202the static call graph construction is 203done before topological ordering. 204