• Home
  • History
  • Annotate
Name Date Size #Lines LOC

..26-Jul-2019-

READMEH A D26-Jul-20199.3 KiB212156

absparc.modH A D26-Jul-201920.5 KiB538466

absparc.sigH A D26-Jul-20191.7 KiB5739

bobabsyn.modH A D26-Jul-20193.7 KiB9785

bobabsyn.sigH A D26-Jul-20193.1 KiB9573

bobcatgram.modH A D26-Jul-201910.7 KiB279242

bobcatgram.sigH A D26-Jul-20192 KiB6047

bobparser.modH A D26-Jul-201972.8 KiB598594

bobparser.sigH A D26-Jul-201950 42

comp.tH A D26-Jul-20192.2 KiB10369

kitty.modH A D26-Jul-20199.3 KiB283220

kitty.sigH A D26-Jul-20191.6 KiB5238

lambdayacc.modH A D26-Jul-201924.3 KiB640527

lambdayacc.sigH A D26-Jul-20194.4 KiB128106

main.omH A D26-Jul-201916 11

realsparc.modH A D26-Jul-20194.5 KiB144128

realsparc.sigH A D26-Jul-2019174 136

test2.bobH A D26-Jul-2019420 2015

test3.bobH A D26-Jul-2019999 5238

test3.s-expH A D26-Jul-20193 KiB260255

test4.bobH A D26-Jul-2019256 2014

test4.s-expH A D26-Jul-2019837 8378

test5.bobH A D26-Jul-2019432 2724

test5.s-expH A D26-Jul-20191.9 KiB152147

test6.bobH A D26-Jul-2019234 1712

test6.s-expH A D26-Jul-2019548 4742

test7.bobH A D26-Jul-2019693 4636

test7.s-expH A D26-Jul-20192 KiB169164

test8.bobH A D26-Jul-2019165 128

typecat.modH A D26-Jul-20193.7 KiB10688

typecat.sigH A D26-Jul-2019171 106

README

1Informal Notes
2August 8th, 2001
3Chuck Liang
4
5
6This directory contains all the files for an experimental compiler
7written in Teyjus Lambda Prolog (please use most recent version -
8b31).  The source language being compiled is called "bobcat", because
9it syntactically resembles the "tiger" language described in Andrew
10Appel's compiler text.  Bobcat however, has a rudimentary form of C++
11style object orientation.  The target language is Sparc assembly code.
12Executables are created using cc (or gcc) in its role as an assembler.
13
14  The purpose of this project is to demonstrate and study the use of
15a higher order logic programming language in compiler construction.
16Optimizing compilers take years to develop, and it's not my purpose to
17perfect a product.  I want to develop a set of techniques, that
18combined with previous work by various researchers in higher order
19specification, can be used in realistic compilation.  For this purpose
20I've chosen to compile a very typical, imperative style language.
21I've also decided to forgo simple abstract machines in favor of a real
22RISC architecture, namely Sparc.
23
24The get a sense of the bobcat language being compiled, look at
25"test3.bob" and "test7.bob".  The generated assembly codes are in
26"test3.s" and "test7.s" respectively.  Type "gcc test3.s" or "gcc
27test7.s" on a Sparc machine to generate a normal executable a.out,
28then type "./a.out" to run the program.  (Use "cc" if gcc is not
29available).  Using the C linker has the beneficial effect of allowing
30C functions (printf) to be called from bobcat programs. :)
31
32
33**
34Some of the code presented here slightly varies from those presented
35in the paper "Compiler Construction in Higher Order Logic Programming."
36The compiler is meant to be experimental and will undergo refinements
37and even major overhauls over time.
38**
39
40
41The structure of the modules are as follows.  "bobcatgram" accumulates
42"bobabsyn" and "lambdayacc".  The other modules all accumulate the one
43listed before it:
44
45
46module "bobabsyn":
47  Defines the higher order abstract syntax for the bobcat language.  The
48  .mod file contains copy clauses for substitution.  The file also has
49  first-order alternatives (unused) for certain constructs.
50
51  It may seem strange that this module have abs constructors of type
52  (string -> texp) -> texp.  This was a somewhat arbitrary decision.
53  I could have used (texp -> texp) -> texp, but the string form
54  was just more convenient at the time, and it stuck.  Teyjus allows
55  the introduction of eigenvariables of type string without any problems,
56  and I don't think it makes that much of a difference.  After all, the
57  NAMES are what we want to abstract over.
58
59
60module "lambdayacc"
61  A deterministic, bottom-up parser generator.  Grammar symbols have type
62  "gs".  This module also contains a semi-customizable lexical scanner.
63
64
65module "bobcatgram"
66  The specification of the bobcat grammar and semantic actions of the parser.
67  The sig file contains the declarations of the grammar symbols used.
68
69
70module "bobparser"
71  The parser that's generated automatically from bobcatgram by lambdayacc.
72  bobparser produces a lambda syntax tree (type texp).  To get a good idea
73  of what kind of structure is produced, parse one of the simpler bobcat
74  programs with:
75
76     [bobparser] ?- parsefile "test4.bob" A.
77
78  There is no need to regenerate the parser unless changes are made to the
79  "cfg" clause in "bobcatgram".  Changes to semantic actions in "bobcatgram",
80  as long as they retain the same types, will not affect the parser.
81
82
83module "kitty"
84  Contains the definition of a continuation passing style intermediate
85  representation (type kexp) and the transformation from the lambda syntax
86  tree to CPS form.   This module is probably the most interesting component
87  of the present project.
88
89  To see a sample cps representation of a program, do
90
91  [kitty] ?- parsefile "test4.bob" (program A), formcps kret A B.
92
93  Variable B will be instantiated with the cps form of the program.
94
95  Be aware that the construct "kfix" doesn't exactly define functions,
96  but is used to abstract over names for local functions as well as variable
97  and class definitions.  "kfunc" is the construct that defines functions.
98  "kfix" should probably be called "klet".
99
100  Although if I were to do it over I would have cleaned up much of the
101  syntax, essentially this is correct, and relatively quite elegant
102  (especially when compared to the next module).  The style of CPS I use
103  is not traditional in the sense that every function is given an extra
104  argument, the continuation function.  Instead, I represent every function
105  call f(x) with (basically) a pair (f(x), K), where K is the continuation
106  function that the return value of f will be passed to.  In other words,
107  (let v = f(x) in (K v)).  There is literature that argues that this way
108  of doing things is better.
109
110  To be more precise, a function call f(a+b) is represented as
111  (opexp "+" a b (kabs u\ (karg u 0 \ (kcall f CN 1 K))))
112   where K is the overall continuation and CN is the class that f belongs.
113   to. CN="." indicates a global function.  (in this way type information is
114   carried in my CPS representation; it's necessitated by object orientation).
115
116
117module "absparc"
118  Contains another, very low level intermediate language "Abstract Sparc,"
119  which is machine dependent, but represents Sparc machine instructions in
120  abstract syntax.  For example, moving the contents of register "o1" to
121  register "l2" is represented by
122
123  movop (reg "o" 1) (reg "l" 2)
124
125  This gives me another level of flexibility to manipulate
126  the final code output.  I perform last-minute optimizations on this
127  representation, such as elementating obviously redundant operations like
128  in [movop A B, movop B A,...].  The mod file contains the code generation
129  (gencode) clauses that transform the cps language to the Abstract
130  Sparc language.  These clauses were by far the trickest to write,
131  largely due to the specific characteristics of the Sparc
132  architecture.  In some ways CPS can be thought of as a fancy form of
133  reverse-Polish notation, and naturally implemented on a simple
134  stack-based abstract machine.  But I decided to skip that and generate
135  machine-dependent code directly.
136
137  This module makes heavy use of cut (!) to model state.
138  This module can be cleaned up surely.
139
140  The most important type of this module is "store".  A store can be
141  a register, a memory location ("indirect"), a constant, or a label.
142  The most important predicates are gencode and genloadop.  The constructor
143  "assoces" associates expressions with stores containing them.  genloadop
144  (generate-load-operand) "loads" an expression into a register,
145  emitting the necessary instructions to do so at the same time.
146
147
148module "realsparc"
149  Contains the straight-forward translation from the abstract Sparc language
150  to real Sparc assembly language instructions as strings, which are
151  then written to a file.  This module also contains the toplevel predicate
152  "tigcompile", which is used as in:
153
154  [realsparc] ?- tigcompile "test3.bob" "test3.s".
155
156
157
158module "typecat"  (added 8/4/2001)
159  Type checking module for the lambda-syntax tree.  This module is currently
160  NOT hooked into the rest of the compiler (doesn't deal with oop yet).
161  In accumulates the bobparser module, and can be used independently as in
162
163  [typecat] ?- parsefile "test3.bob" (program A), typeexp A B.
164
165
166---
167
168List of sample bobcat program sucessfully compiled:
169
170test3.bob: contains demonstration of basic imperative programming techniques,
171  including recursive functions and loops.
172test4.bob: demonstrates mutually recursive functions
173test5.bob: demonstrates arrays (static arrays).  Arrays need to be redone.
174test6.bob: demonstrates static scoping with nested lets
175test7.bob: demonstrates object orientation with two classes.
176
177
178Known limitation of bobcat:
179
180Being experimental and under development, there are many restrictions in
181what can and can't be compiled:
182
183Type checking is not implemented, though some type information is used
184during compilation.  There is thus little error reportage.
185
186Functions can have only a limited number of local variables, since stack
187frames of fixed size (200 bytes) are created.  This can be solved of course,
188by counting the number of local variables of a function.  I regarged this
189as mundane and thus didn't implement it (at least not yet).
190
191Be aware: I have not tested functions that take more than 5 parameters: I'm
192not sure if they'll work.  The Sparc architecture has only a limited number
193of registers for parameter passing; more would require using the stack.
194
195A let statement can't just define a function without defining any variables
196first (stupid I know. Sorry).
197
198A function's parameters can't be changed.  This can be called a "feature"
199since it also appears in other languages (Ada, Eiffel).  Functions can be
200compiled more efficiently with this restriction.  Be aware however, that the
201restriction is not enforced.  no error is reported: it'll just crash!
202
203A class can only contain simple variables (no arrays, other objects).
204This restrictions was actually due to not enough type information used
205during compilation.  An array of objects also can not be created yet.
206
207All variables in a class need to be defined inside the "private" section,
208and all functions need to be in the "public" section.  The only way to
209access an object's internal state is through the functions.
210This restriction at least, is not necessarily a bad one.
211
212