xref: /original-bsd/usr.bin/gprof/PSD.doc/postp.me (revision 8208c1e2)
1\"	@(#)postp.me	1.14 03/15/82
2.EQ
3delim ##
4gsize 11
5.EN
6.sh 1 "Post Processing"
7.pp
8Having gathered the arcs of the call graph and timing information
9for an execution of the program,
10we are interested in attributing the time for each routine to the
11routines that call it.
12We build a dynamic call graph with arcs from caller to callee,
13and propagate time from descendants to ancestors
14by topologically sorting the call graph.
15Time propagation is performed from the leaves of the
16call graph toward the roots, according to the order
17assigned by a topological numbering algorithm.
18The topological numbering ensures that
19all edges in the graph go from higher numbered nodes to lower
20numbered nodes.
21An example is given in Figure 1.
22If we propagate time from nodes in the
23order assigned by the algorithm,
24execution time can be propagated from descendants to ancestors
25after a single traversal of each arc in the call graph.
26Each parent receives some fraction of a child's time.
27Thus time is charged to the
28caller in addition to being charged to the callee.
29.(z C
30.TS
31center;
32c c c c c.
338				9
34
35
36	3		7
37
38
392		5		6
40
41
42	1		4
43
44
45.TE
46.ce 2
47Topological ordering
48Figure 1.
49.ce 0
50.)z
51.pp
52Let #C sub e# be the number of calls to some routine,
53#e#, and #C sub e sup r# be the number of
54calls from a caller #r# to a callee #e#.
55Since we are assuming each call to a routine takes the
56average amount of time for all calls to that routine,
57the caller is accountable for
58#C sub e sup r / C sub e#
59of the time spent by the callee.
60Let the #S sub e# be the #selftime# of a routine, #e#.
61The selftime of a routine can be determined from the
62timing information gathered during profiled program execution.
63The total time, #T sub r#, we wish to account to a routine
64#r#, is then given by the recurrence equation:
65.EQ
66T sub r ~ = ~ {S sub r} ~ + ~
67                   sum from {r ~ roman CALLS ~ e}
68                   {T sub e times {{C sub e sup r} over {C sub e}}}
69.EN
70where #r ~ roman CALLS ~ e# is a relation showing all routines
71#e# called by a routine #r#.
72This relation is easily available from the call graph.
73.pp
74However, if the execution contains recursive calls,
75the call graph has cycles that
76cannot be topologically sorted.
77In these cases, we discover strongly-connected
78components in the call graph,
79treat each such component as a single node,
80and then sort the resulting graph.
81We use a variation of Tarjan's strongly-connected
82components algorithm
83that discovers strongly-connected components as it is assigning
84topological order numbers [Tarjan72].
85.pp
86Time propagation within strongly connected
87components is a problem.
88For example, a self-recursive routine
89(a trivial cycle in the call graph)
90is accountable for all the time it
91uses in all its recursive instantiations.
92In our scheme, this time should be
93shared among its call graph parents.
94The arcs from a routine to itself are of interest,
95but do not participate in time propagation.
96Thus the simple equation for time propagation
97does not work within strongly connected components.
98Time is not propagated from one member of a cycle to another,
99since, by definition, this involves propagating time from a routine
100to itself.
101In addition, children of one member of a cycle
102must be considered children of all members of the cycle.
103Similarly, parents of one member of the cycle must inherit
104all members of the cycle as descendants.
105It is for these reasons that we collapse connected components.
106Our solution collects all members of a cycle together,
107summing the time and call counts for all members.
108All calls into the cycle are made to share the total
109time of the cycle, and all descendants of the cycle
110propagate time into the cycle as a whole.
111Calls among the members of the cycle
112do not propagate any time,
113though they are listed in the call graph profile.
114.pp
115Figure 2 shows a modified version of the call graph of Figure 1,
116in which the nodes labelled 3 and 7 in Figure 1 are mutually
117recursive.
118The topologically sorted graph after the cycle is collapsed is
119given in Figure 3.
120.(z C
121.TS
122center;
123c c c c c.
124o				o
125
126
127	\(bu		\(bu
128
129
130o		o		o
131
132
133	o		o
134
135
136.TE
137.ce 2
138Cycle to be collapsed.
139Figure 2.
140.ce 0
141.)z
142.(z C
143.TS
144center;
145c c c c c.
1467				8
147
148
149	\fI6\fP		\fI6\fP
150
151
1522		4		5
153
154
155	1		3
156
157
158.TE
159.ce 2
160Topological numbering after cycle collapsing.
161Figure 3.
162.ce 0
163.)z
164.pp
165Since the technique described above only collects the
166dynamic call graph,
167and the program typically does not call every routine
168on each execution,
169different executions can introduce different cycles in the
170dynamic call graph.
171Since cycles often have a significant effect on time propagation,
172it is desirable to incorporate the static call graph so that cycles
173will have the same members regardless of how the program runs.
174.pp
175The static call graph can be constructed from the source text
176of the program.
177However, discovering the static call graph from the source text
178would require two moderately difficult steps:
179finding the source text for the program
180(which may not be available),
181and scanning and parsing that text,
182which may be in any one of several languages.
183.pp
184In our programming system,
185the static calling information is also contained in the
186executable version of the program,
187which we already have available,
188and which is in language-independent form.
189One can examine the instructions
190in the object program,
191looking for calls to routines, and note which
192routines can be called.
193This technique allows us to add arcs to those already in the
194dynamic call graph.
195If a statically discovered arc already exists in the dynamic call
196graph, no action is required.
197Statically discovered arcs that do not exist in the dynamic call
198graph are added to the graph with a traversal count of zero.
199Thus they are never responsible for any time propagation.
200However, they may affect the structure of the graph.
201Since they may complete strongly connected components,
202the static call graph construction is
203done before topological ordering.
204