Truth Lab Stage 2, Parser Generator Stage 1 Part 1: Following the example and format of the calculator grammar, write a grammar for your truth table program. # Grammar for online calculator # The syntax nonterminal E Expr means that the semantic attribute associated # with the non-terminal symbol E is the java type Expr. nonterminal E Expr terminal ( ) + * - == typedterminal num Integer topsym E # top level grammar symbol # precedence and associativity declarations. # These rules must appear before the production rules. left * 500 left + 400 left - 400 left == 300 left ) 50 left ( 900 left num 980 # Grammar production rules, order breaks reduce-reduce conflicts # spaces must separate grammar elements, including spaces in { ... }. E --> num:x { return new intexp(x); } E --> E:e1 + E:e2 { return new sumexp(e1,e2); } E --> E:e1 * E:e2 { return new multexp(e1,e2); } E --> E:e1 - E:e2 { return new subexp(e1,e2); } E --> E:e1 == E:e2 { return new eqexp(e1,e2); } E --> - E:e { return new negexp(e); } E --> ( E:e ) { return e; } # ends file, whatever follows EOF is ignored and can be used for more comments. EOF /////////////// Part 2: Write a hand-written "metaparser" in java, FROM SCRATCH, for grammars written in this format. I found that the java Scanner class suffices for this purpose. The scanner usese white spaces as the default delimiter, but this can be changed - consult java API docs on how to use Scanner exactly. In my metaparser, I'm assuming that there are always white spaces between symbols, so that the user writing the grammar must insert white spaces to help me parse, including, for example, "{ return..." instead of "{return". I also do not allow line breaks in a grammar rule. You can make similar assumptions as long as you can parse grammars written in my format. To parse "E:e1", I created another scanner that uses ":" as delimiter. The grammar format makes it easy to hand-code a parser. The first symbol on each line determins how that line is to be parsed. The first symbol can be #, indicating a comment line to be ignored, a declaration "terminal", "nonterminal", "typednonterminal", "topsym", "left", "right", "EOF". Otherwise, the line must begin with a symbol that's been declared as a nonterminal, followed by " --> ". Your metaparser should output some error messages, such as when a terminal is used on the left-hand side of -->. Parse each grammar production rule into an instance of the following class: class Grule // grammar rule such as E -> E:a + E:b { Gsym lhs; // left-hand side non-terminal ArrayList rhs = new ArrayList(); // right-hand side of rule String action = "return null;\n}"; // default action closes delegate String operator = null; // used for operator precedence Ruleaction act; // delegate to be generated differently for each grammar } // "delegate" interface that represents production rule interface Ruleaction { Object compval(); // compute value for nonterminal to be pushed } The metaparser should leave *act* empty for now, as that will be filled-in during the parser code generation stage. The *action* can be just a string representation of the java code that will be executed verbatim. You may also design your own data structure to represent a grammar rule, but unless you can forsee what you need later in your parser generator, I recommend that you use my classes. The class Gsym, for each grammar symbol, is the following, which is consistent (backwards compatible) with the original version I gave you. I made it extend Comparable so I can insert into a sorted ArrayList of Gsyms, which makes it easier to avoid duplicates. public class Gsym implements Comparable { public final String sym; public String javatype = null; public boolean nonterminal = false; public String assoc = null; // label of object associated with symbol. public Object value = null; // assoc and value never used together: assoc in rule, value on stack public Gsym(String x) {sym=x;} public Gsym(String x, Object v) {sym=x; value=v;} public Gsym(String x, String y, boolean t) {sym=x; javatype=y; nonterminal=t;} public Gsym clone() { Gsym g = new Gsym(sym,javatype,nonterminal); g.assoc = assoc; g.value = value; return g; } public boolean equals(Object x) { return (x instanceof Gsym) && (sym.equals(((Gsym)x).sym)); } public int compareTo(Gsym x) { return sym.compareTo(x.sym); } }//Gsym Note the sym, javatype and assoc fields. These fields will be filled in by the metaparser. The javatype, for the calculator program will be either Expr or Integer, or null (for terminals that are not associated with values). The *assoc* field is important: this is the label that is associated with each symbol on the right-hand side (rhs) of a rule. For example, in E --> E:e1 + E:e2, the first Gsym on the right hand side will have assoc set to "e1" (and sym="E", javatype="Expr"). The *value* field of Gsym should NOT be set by the metaparser - that is a runtime value which is filled in as we parse a particular input. You should define data structures that lists all nonterminals and all terminals of the grammar. You should also devise a way to represent the operator precedence and associativity declarations (such as using a hash table). You will need, for example, a method that returns all productions with a certain non-terminal on the left-hand side (all productions for "E", for example). For efficiency, use indices instead of big lists of objects (but keep in mind object vars are just pointers in java). *************************************************************************** IMPORTANT: your metaparser can be more flexible than mine, but it MUST AT LEAST BE ABLE TO PARSE GRAMMARS WRITTEN WITH MY SYNTAX, so I can test it. *************************************************************************** ////////// Part 3: You need to implement procedures that compute the Nullible, First, and Follow sets for a grammar. Page 49 of your text has some pseudo code. However, instead of computing all three relations all at once, I recommend that you write three separate procedures: first compute the Nullible non-terminals for the grammar, then use that to compute the First sets, then use Nullible and First to compute the Follow sets. These can be tricky and I've devised a nasty little grammar to stress test your program. If you follow the algorithm of the pseudo code without taking shortcuts you should be fine. # test grammar for slr generator with difficult FIRST, FOLLOW computation nonterminal S Integer nonterminal T Integer nonterminal U Integer terminal x ( ) topsym S T --> S { return 2; } S --> T { return 3; } T --> ( { return 4; } S --> x { return 1; } T --> U x { return 5; } U --> ) T { return 6; } T --> { return 0; } EOF This grammar should have the following: Nullible Nonterminals: S T FIRST SETS: S : ( ) x T : ( ) x U : ) START : ( ) EOF x FOLLOW SETS: S : EOF x T : EOF x U : x START : Note that non-terminal START and terminal EOF were automatically added by my metaparser. Be careful: when inserting a symbol into an arraylist, make sure it's not a duplicate, otherwise your procedures will not terminate. I recommend using the java.util.TreeSet class of java, which has many of the operations that you'll need, including testing for set-equality. Using it is pretty straightforweard. Here's some code that I've tried. Be sure to consult the documentation on TreeSet and Set, however. TreeSet S = new TreeSet(); S.add(new Gsym("E","Expr",true)); S.add(new Gsym("2")); S.add(new Gsym("0")); S.add(new Gsym("1")); System.out.println(S.size()+","+S.first().sym); for(Gsym g:S) System.out.println(g.sym); // prints in sorted order /////////////////////////////////// NEW: /////////////////////////////////// To help you make progress, I'm pasting here my procedure to calculate the nullible set, which I call Epsilon. NOTE: this code is only intended as guideline. You'll have to change it to fit your program. // compute nullible set Epsilon public static void genepsilon() { Epsilon = new TreeSet(); boolean changed = true; while (changed) { changed = false; for (Grule r : Rules) // { boolean bx = true; // add or not for(Gsym g: r.rhs) { if (!g.nonterminal || !Epsilon.contains(g.sym)) bx=false; } if (bx) changed = Epsilon.add(r.lhs.sym) || changed; }// for each rule }//while }//genepsilon