1\" @(#)postp.me 1.7 03/12/82 2.EQ 3delim ## 4gsize 12 5.EN 6.sh 1 "Post Processing" 7.pp 8Having gathered the arcs of the call graph and timing information 9for an execution of the program, 10we are interested in attributing the time for each routine to the 11routines that call it. 12We construct a dynamic call graph with arcs from caller to callee, 13and propagate time from descendants to ancestors 14by topologically sorting the call graph. 15Time propagation is performed from the leaves of the 16call graph toward the roots, according to the order 17assigned by a topological numbering algorithm. 18The topological numbering ensures that 19all edges in the graph go from higher numbered nodes to lower 20numbered nodes. 21If we propagate time from nodes in the 22reverse order assigned by the algorithm, 23execution time can be propagated from descendants to ancestors 24after a single traversal of each arc in the call graph. 25Each parent receives some fraction of a child's time. 26Thus time is charged to the 27caller in addition to being charged to the callee. 28.(b 29.TS 30center; 31c c c c c. 328 9 33 34 35 3 7 36 37 382 5 6 39 40 41 1 4 42 43 44.TE 45.ce 1 46Topological ordering 47.ce 0 48.)b 49.pp 50Let #C sub e# be the number of calls to some routine, 51#e#, and #C sub e sup r# be the number of 52calls from a caller #r# to a callee #e#. 53Since we are assuming each call to a routine takes the 54average amount of time for all calls to that routine, 55the caller is accountable for 56#C sub e sup r# \(sl #C sub e# 57of the time spent by the callee. 58Let the #S sub e# be the #selftime# of a routine, #e#. 59The selftime of a routine can be determined from the 60timing information gathered during profiled program execution. 61The total time, #T sub r#, we wish to account to a routine 62#r#, is then given by the recurence equation: 63.EQ 64T sub r ~ = ~ {S sub r} ~ + ~ 65 sum from {r ~ roman CALLS ~ e} 66 {T sub e \(mu {{C sub e sup r} over {C sub e}}} 67.EN 68where #r ~ roman CALLS ~ e# is a relation indicating all routines 69#e# called by a routine #r#. 70This relation is easily available from the call graph. 71.pp 72However, if the execution contains recursive calls, 73the call graph has cycles that 74cannot be topologically sorted. 75In these cases, we discover strongly-connected 76components in the call graph, 77treat each such component as a single node, 78and then sort the resulting graph. 79We use a variation of Tarjan's strongly-connected 80components algorithm 81that discovers strongly-connected components as it is assigning 82topological order numbers [Tarjan72]. 83.pp 84Time propagation within strongly connected 85components is a problem. 86For example, a self-recursive routine 87(a trivial cycle in the call graph) 88is accountable for all the time it 89uses in all its recursive instantiations. 90In our scheme, this time should be 91shared among its call graph parents. 92The arcs from a routine to itself are of interest, 93but do not participate in time propagation. 94Thus the simple equation for time propagation 95does not work within strongly connected components. 96Time is not propagated from one member of a cycle to another, 97since, by definition, this involves propagating time from a routine 98to itself. 99In addition, children of one member of a cycle 100must be considered children of all members of the cycle. 101Similarly, parents of one member of the cycle must inherit 102all members of the cycle as descendants. 103It is for these reasons that we collapse connected components. 104Our solution collects all members of a cycle together, 105summing the time and call counts for all members. 106All calls into the cycle are made to share the total 107time of the cycle, and all descendants of the cycle 108propagate time into the cycle as a whole. 109Calls among the members of the cycle 110do not propagate any time, 111though they are listed in the call graph profile. 112.(b 113.TS 114center; 115c c c c c. 116o o 117 118 119 o o 120 121 122o o o 123 124 125 o o 126 127 128.TE 129.ce 1 130Cycle to be collapsed. 131.ce 0 132.)b 133.(b 134.TS 135center; 136c c c c c. 1377 8 138 139 140 6 6 141 142 1432 4 5 144 145 146 1 3 147 148 149.TE 150.ce 1 151Topological numbering after cycle collapsing. 152.ce 0 153.)b 154.pp 155Since the technique described above only collects the 156dynamic call graph, 157and the program typically does not call every routine 158on each execution, 159different executions can introduce different cycles in the 160dynamic call graph. 161Since cycles often have a significant effect on time propagation, 162it is desirable to incorporate the static call graph so that cycles 163will have the same members regardless of how the program runs. 164.pp 165The static call graph can be constructed from the source text 166of the program. 167However, discovering the static call graph from the source text 168would require two moderately difficult steps: 169finding the source text for the program 170(which may not be available), 171and scanning and parsing that text, 172which may be in any one of a number of languages. 173In our programming system, 174the static calling information is also contained in the 175executable version of the program, 176which we already have available, 177and which is in language-independent form. 178One can examine the instructions 179in the object program, 180looking for calls to routines, and note which 181routines can be called. 182This technique allows us to add arcs to those already in the 183dynamic call graph. 184If a statically discovered arc already exists in the dynamic call 185graph, no action is required. 186Statically discovered arcs that do not exist in the dynamic call 187graph are added to the graph with a traversal count of zero. 188Thus they are never responsible for any time propagation. 189However, they may affect the structure of the graph. 190Since they may complete strongly connected components, 191the static call graph construction is 192done before topological ordering. 193