1*09467b48SpatrickDate: Fri, 1 Jun 2001 17:08:44 -0500 (CDT)
2*09467b48SpatrickFrom: Chris Lattner <sabre@nondot.org>
3*09467b48SpatrickTo: Vikram S. Adve <vadve@cs.uiuc.edu>
4*09467b48SpatrickSubject: RE: Interesting: GCC passes
5*09467b48Spatrick
6*09467b48Spatrick> That is very interesting.  I agree that some of these could be done on LLVM
7*09467b48Spatrick> at link-time, but it is the extra time required that concerns me.  Link-time
8*09467b48Spatrick> optimization is severely time-constrained.
9*09467b48Spatrick
10*09467b48SpatrickIf we were to reimplement any of these optimizations, I assume that we
11*09467b48Spatrickcould do them a translation unit at a time, just as GCC does now.  This
12*09467b48Spatrickwould lead to a pipeline like this:
13*09467b48Spatrick
14*09467b48SpatrickStatic optimizations, xlation unit at a time:
15*09467b48Spatrick.c --GCC--> .llvm --llvmopt--> .llvm
16*09467b48Spatrick
17*09467b48SpatrickLink time optimizations:
18*09467b48Spatrick.llvm --llvm-ld--> .llvm --llvm-link-opt--> .llvm
19*09467b48Spatrick
20*09467b48SpatrickOf course, many optimizations could be shared between llvmopt and
21*09467b48Spatrickllvm-link-opt, but the wouldn't need to be shared...  Thus compile time
22*09467b48Spatrickcould be faster, because we are using a "smarter" IR (SSA based).
23*09467b48Spatrick
24*09467b48Spatrick> BTW, about SGI, "borrowing" SSA-based optimizations from one compiler and
25*09467b48Spatrick> putting it into another is not necessarily easier than re-doing it.
26*09467b48Spatrick> Optimization code is usually heavily tied in to the specific IR they use.
27*09467b48Spatrick
28*09467b48SpatrickUnderstood.  The only reason that I brought this up is because SGI's IR is
29*09467b48Spatrickmore similar to LLVM than it is different in many respects (SSA based,
30*09467b48Spatrickrelatively low level, etc), and could be easily adapted.  Also their
31*09467b48Spatrickoptimizations are written in C++ and are actually somewhat
32*09467b48Spatrickstructured... of course it would be no walk in the park, but it would be
33*09467b48Spatrickmuch less time consuming to adapt, say, SSA-PRE than to rewrite it.
34*09467b48Spatrick
35*09467b48Spatrick> But your larger point is valid that adding SSA based optimizations is
36*09467b48Spatrick> feasible and should be fun.  (Again, link time cost is the issue.)
37*09467b48Spatrick
38*09467b48SpatrickAssuming linktime cost wasn't an issue, the question is:
39*09467b48SpatrickDoes using GCC's backend buy us anything?
40*09467b48Spatrick
41*09467b48Spatrick> It also occurs to me that GCC is probably doing quite a bit of back-end
42*09467b48Spatrick> optimization (step 16 in your list).  Do you have a breakdown of that?
43*09467b48Spatrick
44*09467b48SpatrickNot really.  The irritating part of GCC is that it mixes it all up and
45*09467b48Spatrickdoesn't have a clean separation of concerns.  A lot of the "back end
46*09467b48Spatrickoptimization" happens right along with other data optimizations (ie, CSE
47*09467b48Spatrickof machine specific things).
48*09467b48Spatrick
49*09467b48SpatrickAs far as REAL back end optimizations go, it looks something like this:
50*09467b48Spatrick
51*09467b48Spatrick1. Instruction combination: try to make CISCy instructions, if available
52*09467b48Spatrick2. Register movement: try to get registers in the right places for the
53*09467b48Spatrickarchitecture to avoid register to register moves.  For example, try to get
54*09467b48Spatrickthe first argument of a function to naturally land in %o0 for sparc.
55*09467b48Spatrick3. Instruction scheduling: 'nuff said :)
56*09467b48Spatrick4. Register class preferencing: ??
57*09467b48Spatrick5. Local register allocation
58*09467b48Spatrick6. global register allocation
59*09467b48Spatrick7. Spilling
60*09467b48Spatrick8. Local regalloc
61*09467b48Spatrick9. Jump optimization
62*09467b48Spatrick10. Delay slot scheduling
63*09467b48Spatrick11. Branch shorting for CISC machines
64*09467b48Spatrick12. Instruction selection & peephole optimization
65*09467b48Spatrick13. Debug info output
66*09467b48Spatrick
67*09467b48SpatrickBut none of this would be usable for LLVM anyways, unless we were using
68*09467b48SpatrickGCC as a static compiler.
69*09467b48Spatrick
70*09467b48Spatrick-Chris
71*09467b48Spatrick
72