Completing the Code Generator

Due Friday 5/19 (this is it folks!)

I've already talked about most of these points in class, but here are some notes on how to complete the code generator, assuming that you can already compile a main-only program.

-----------------------
Point 1: Compiling a function.

All functions (methods) belong to classes. Therefore, the label of a function should indicate the class it belongs to, for example:

    bankaccount_withdraw:

This will distinguish it from functions of the same name in other classes. Remember to observe the function calling/return protocol:

    classname_functionname:
      push %ebp
      mov %esp,%ebp
      sub $..., %esp     # size depends on cmethod.locals.size()
      ...
      /* return: */
      mov return value, %eax
      mov %ebp,%esp
      pop %ebp
      ret

------------------------
Point 2: Compiling a function call

Function calls are of the form

    object.f(arg1,arg2,...);

where arg1,arg2,... is an explist. You need to write a loop to "accept" the args so that they will be in the right order on the stack. Please be careful: the explist structure is already reversed, but the accept method of explist does an inverse loop. You need a forward loop instead, because the list is already reversed.

Since we are not going to implement dynamic dispatch, we only need to determine which class a function belongs to. However, if you did not implement the type checker, this will be difficult when "object" is a compound expression. If you have the type checker, your type checking visitor should store the type of each expression inside the expression node in the syntax tree (see me if you're not sure what this means). If you have not implemented the type checker, then the only function calls you can compile are the kind where "object" is a simple variable (varexp). In that case, you would look up the variable in the symbol table:

    visit(callexp x)
    { ...
      if (!(x.e1 instanceof varexp))
        System.out.println("I didn't implement the type checker, sniff, sob");
      ...
      varentry vt = (varentry)cmethod.lookup( ((varexp)x.e1).name );
      ...

Then look at the "type" field of vt for the name of the class:

      String classname = vt.type;

Finally, you need to pass the address of the object (determined from vt.position) either as the implicit first argument or in a reserved register such as EDI. Using a reserved register, however, also means that you'll have to save the register on the stack before changing its value. If you use EDI as "this", your code will have the general structure:

    push %edi
    mov address-of-object, %edi
    push argn
    ...
    push arg2
    push arg1
    call classname_functionname
    add $..., %esp
    pop %edi

---------------------------
Point 3: Compiling a class

Normally, compiling a class just involves compiling each method of the class. However, if you have code that initializes instance variables as they're declared, as in

    class A
    {
      int x = 1;
      int y = ...;
    }

then you need to make sure that, when an instance of the class is created, the initial values are calculated and stored in the correct memory addresses. One way to do this is to generate an implicit constructor function. If you look at the sample generated code from the bankaccount example, you'll see the function bankaccount_bankaccount. This is the implicit constructor that is called whenever a "new bankaccount()" is invoked.

What should the constructor do? First, you need to call malloc to allocate memory for the instance variables. Assuming that cclass points to the current symbol table entry for the class being compiled, the size of the fields hash table, cclass.fields, is how many variables you need to allocate memory for. That is ...

    visit(classdec x)
    { ...
      cmethod = null;  // currently not in any method
      cclass = (classentry)top.lookup(x.classname);
      int objectsize = cclass.fields.size() * 4;
      Code.add(new operation("push", new immediate(objectsize)));
      Code.add(new operation("call malloc"));  // return value will be in %eax
      ...

Then you need to compile each vardecstat inside the classdec, and make sure that the initial values are stored in the right locations in the object. Here, your code should be consistent with how you choose to represent the "this" pointer: either in EDI or as the first parameter, 8(%ebp). That is, if you load that location with the return value of malloc (which malloc leaves in %eax), then compiling the vardecstats should involve no work beyond calling accept on them.

There is another complication. Typically, we look up a variable in the symbol table through the cmethod pointer. However, now it is possible for there to be code that's not inside any method. Thus, when looking up a variable entry, you should change your code to do:

    varentry vt;
    if (cmethod != null) vt = (varentry)cmethod.lookup(...);
    else vt = (varentry)cclass.lookup(...);

**********
If you get really confused by this part, just punt and assume that instance variables cannot be initialized with values. However, you'll still have to call malloc to allocate memory for the object at some point.
**********

----------------------------
Point 4: Creating a new object

If you compiled a class as described above, then creating an object should involve no more than calling the constructor of the class being instantiated. However, if you chose not to compile instance-variable initializations, you'll have to allocate memory for the object yourself, which means you'll need to know how many variables the object has:

    classentry ct = (classentry)top.lookup(x.classname);
    int objectsize = ct.fields.size() * 4;

-----------------------------
Point 5: Handling Arrays.

Arrays are special objects in Java and are allocated on the heap. Furthermore, the length of the array is stored as part of the array. When creating a new array (visit(newarrayexp x)), you need to call malloc to allocate memory for the array based on the size of the array, plus an additional 4 bytes to store the length.
Then you need to store the length inside the first 4 bytes of the array. You'll then also need to offset all array indices by 1, to skip the first 4 bytes, whenever the array is accessed.

So far, we have only used memory operands of the form offset(register). But the x86 "CISC" instruction set contains another form, which was designed exactly with array access in mind:

    offset(baseregister, indexregister, scale)

The address is computed as baseregister+(indexregister*scale)+offset. For example, if the address of an array object A is currently in EBX, then to access A[6] and put it in EAX, you can use

    mov $6, %ecx
    mov 4(%ebx,%ecx,4), %eax

Note that the offset of 4 is to skip the first 4 bytes, which contain the length. To create such a mem operand for the abstract assembly language, you can either define another constructor, or set the Index and Scale variables manually:

    mem m = new mem(EBX,4);  m.Index = ECX;  m.Scale = 4;

----------------

Please also see the miscellaneous notes on the web page for additional hints.
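
If the array layout is still fuzzy, here is a rough sketch in Java that mimics it. It is not part of the compiler. It only simulates a malloc'ed block with a ByteBuffer to show the arithmetic: the allocation size is the element bytes plus 4, the length lives in the first 4 bytes, and element i sits at base + i*4 + 4, which is the address that the operand 4(base,index,4) denotes. The class name ArrayLayoutSketch and all of its method names are made up for this illustration, and 4-byte elements are assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ArrayLayoutSketch {
    static final int WORD = 4; // assumed size of an element and of the length header

    // Number of bytes to request from malloc for an array of n elements.
    static int allocationSize(int n) {
        return n * WORD + WORD;
    }

    // Byte offset of element i from the start of the array object.
    // Same as offset + index*scale with offset = 4 and scale = 4.
    static int elementOffset(int i) {
        return WORD + i * WORD;
    }

    // Simulates "new int[n]": allocate the block, store the length up front.
    static ByteBuffer newArray(int n) {
        ByteBuffer block = ByteBuffer.allocate(allocationSize(n))
                                     .order(ByteOrder.LITTLE_ENDIAN);
        block.putInt(0, n); // length goes in the first 4 bytes
        return block;
    }

    static int length(ByteBuffer a)               { return a.getInt(0); }
    static int load(ByteBuffer a, int i)          { return a.getInt(elementOffset(i)); }
    static void store(ByteBuffer a, int i, int v) { a.putInt(elementOffset(i), v); }

    public static void main(String[] args) {
        ByteBuffer a = newArray(10);
        store(a, 6, 42);
        System.out.println(allocationSize(10));           // 44
        System.out.println(elementOffset(6));             // 28
        System.out.println(length(a) + " " + load(a, 6)); // 10 42
    }
}
```

In the generated assembly, allocationSize corresponds to the value you push before calling malloc, and elementOffset corresponds to what the indexed memory operand computes for you.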