1*09467b48SpatrickDate: Fri, 1 Jun 2001 17:08:44 -0500 (CDT) 2*09467b48SpatrickFrom: Chris Lattner <sabre@nondot.org> 3*09467b48SpatrickTo: Vikram S. Adve <vadve@cs.uiuc.edu> 4*09467b48SpatrickSubject: RE: Interesting: GCC passes 5*09467b48Spatrick 6*09467b48Spatrick> That is very interesting. I agree that some of these could be done on LLVM 7*09467b48Spatrick> at link-time, but it is the extra time required that concerns me. Link-time 8*09467b48Spatrick> optimization is severely time-constrained. 9*09467b48Spatrick 10*09467b48SpatrickIf we were to reimplement any of these optimizations, I assume that we 11*09467b48Spatrickcould do them a translation unit at a time, just as GCC does now. This 12*09467b48Spatrickwould lead to a pipeline like this: 13*09467b48Spatrick 14*09467b48SpatrickStatic optimizations, xlation unit at a time: 15*09467b48Spatrick.c --GCC--> .llvm --llvmopt--> .llvm 16*09467b48Spatrick 17*09467b48SpatrickLink time optimizations: 18*09467b48Spatrick.llvm --llvm-ld--> .llvm --llvm-link-opt--> .llvm 19*09467b48Spatrick 20*09467b48SpatrickOf course, many optimizations could be shared between llvmopt and 21*09467b48Spatrickllvm-link-opt, but the wouldn't need to be shared... Thus compile time 22*09467b48Spatrickcould be faster, because we are using a "smarter" IR (SSA based). 23*09467b48Spatrick 24*09467b48Spatrick> BTW, about SGI, "borrowing" SSA-based optimizations from one compiler and 25*09467b48Spatrick> putting it into another is not necessarily easier than re-doing it. 26*09467b48Spatrick> Optimization code is usually heavily tied in to the specific IR they use. 27*09467b48Spatrick 28*09467b48SpatrickUnderstood. The only reason that I brought this up is because SGI's IR is 29*09467b48Spatrickmore similar to LLVM than it is different in many respects (SSA based, 30*09467b48Spatrickrelatively low level, etc), and could be easily adapted. Also their 31*09467b48Spatrickoptimizations are written in C++ and are actually somewhat 32*09467b48Spatrickstructured... of course it would be no walk in the park, but it would be 33*09467b48Spatrickmuch less time consuming to adapt, say, SSA-PRE than to rewrite it. 34*09467b48Spatrick 35*09467b48Spatrick> But your larger point is valid that adding SSA based optimizations is 36*09467b48Spatrick> feasible and should be fun. (Again, link time cost is the issue.) 37*09467b48Spatrick 38*09467b48SpatrickAssuming linktime cost wasn't an issue, the question is: 39*09467b48SpatrickDoes using GCC's backend buy us anything? 40*09467b48Spatrick 41*09467b48Spatrick> It also occurs to me that GCC is probably doing quite a bit of back-end 42*09467b48Spatrick> optimization (step 16 in your list). Do you have a breakdown of that? 43*09467b48Spatrick 44*09467b48SpatrickNot really. The irritating part of GCC is that it mixes it all up and 45*09467b48Spatrickdoesn't have a clean separation of concerns. A lot of the "back end 46*09467b48Spatrickoptimization" happens right along with other data optimizations (ie, CSE 47*09467b48Spatrickof machine specific things). 48*09467b48Spatrick 49*09467b48SpatrickAs far as REAL back end optimizations go, it looks something like this: 50*09467b48Spatrick 51*09467b48Spatrick1. Instruction combination: try to make CISCy instructions, if available 52*09467b48Spatrick2. Register movement: try to get registers in the right places for the 53*09467b48Spatrickarchitecture to avoid register to register moves. For example, try to get 54*09467b48Spatrickthe first argument of a function to naturally land in %o0 for sparc. 55*09467b48Spatrick3. Instruction scheduling: 'nuff said :) 56*09467b48Spatrick4. Register class preferencing: ?? 57*09467b48Spatrick5. Local register allocation 58*09467b48Spatrick6. global register allocation 59*09467b48Spatrick7. Spilling 60*09467b48Spatrick8. Local regalloc 61*09467b48Spatrick9. Jump optimization 62*09467b48Spatrick10. Delay slot scheduling 63*09467b48Spatrick11. Branch shorting for CISC machines 64*09467b48Spatrick12. Instruction selection & peephole optimization 65*09467b48Spatrick13. Debug info output 66*09467b48Spatrick 67*09467b48SpatrickBut none of this would be usable for LLVM anyways, unless we were using 68*09467b48SpatrickGCC as a static compiler. 69*09467b48Spatrick 70*09467b48Spatrick-Chris 71*09467b48Spatrick 72