(* CSC 123/252 Programming Assignment Online Calculator/Evaluator, version 7E5 This program contains the following components: 1. Definition of expression trees (type expr) for a simple calculator program 2. Evaluation of the expression trees into integers. Only some of the possible operations are supported - but the 'eval' function is a mutable lambda that we can modify by reassigning it. 3. A simple lexical analyser that takes a raw input stream and converts it into a list consisting of "tokens" Sym(string) or Val(int). This stage of building an interpreter/compiler is the most boring and does not showcase the specialties of F#. Not much time was spent on it and it has limitations. Some operators are spelled out so we have to write (3 eq 2+1) instead of (3==2+1) because the lexer can't distinguish between = and ==. We also use leq for <= and lt for <. 4. A shift-reduce, "operator precedence" parser that transforms a list of symbols produced by the lexer into an expression tree. 5. A "main" that reads input, calls the lexer, parser and evaluator. For your assignment, you should not touch components 1,3 and 5, but you will make modifications to parts 2 and 4: the parser and the evaluator. You will modify them without touching the base code using the technique demonstrated below for the 'eval' function. You must extend the program so that it can parse and evaluate the following kinds of expressions: a. The % operator for remainder, as in 4*11 % 5 should evaluate to 4. Currently, this operation is recognized by the evaluator (eval Remainder(a,b)) but not by the parser. You must extend the parser to handle this case but changing the mutable parse variable to a different lambda (like for eval). b. "Boolean" expressions from the operators eq, leq and lt, such 2+1 eq 3, 2 leq 1, and 4*5 lt 5*5. These expressions should evaluate to 0 for false and 1 for true (eval must always return an int or raise an exception). The expression tree type 'expr' defines cases for Equals, Lessthan and Lteq, but these are not recognized by either the parser or the eval function. For the parser, the precedence of all operators have already been defined as part of the "precedence" function. c. Ifelse expressions. In the abstract syntax, these appear as Ifelse(a,b,c) where a is a boolean that should eval to 0 or 1 and Ifelase(a,b,c) should eval to whatever b evals to if a evals to 1, or whatever c evals to if a evals to 0. It is every important that Ifelse, booleans and the operators can all be nested: if (3 eq 4) 1 else (if (4 eq 2+2) (8-1) else 3) should eval to 7. This just means that the eval function must be recursive. If-else should also be short circuited: if(0,a,b) should not eval a. The lexer recognizes "if", "else", "then", "begin" and "end" as keywords (it represents them as Sym("if"), Sym("else"), etc...) so you can choose your favorite way of writing if-else. I suggest you use mine, which inserts "else" just for readability (it has no other meaning). But note that every if must be matched with an else case. Write a separate program that starts and ends with: // at the start: open INTERPRETER; ... // at the end: main(); Compile yourprogram with fsharpc calculator.fs yourprogram.fs run: mono yourprogram.exe This assignment can be completed with about 25 lines of code. d. (Optional challenge). Think about how to represent let expressions: (let x=1:let y=2: x+y) should eval to 3. The abstract syntax type 'expr' contains a case for Var(string) and the lexical analyser translates single, lower-case letters such as "x" into Var("x"). You need to change the parser to recognize let expressions, but most importantly, you need to write a completely new eval function that take an additional argument that represents an "environment", that is, a list of bindings of ints to variables. You can use a .Net 'Dictionary' (hashmap) data structure to represent the environment (see sample use of Dictionary in the code below for 'prectable'). Alternatively you can use a list of pairs such as [("x",1); ("y",2)] You should think about how to respect the scope of let: let x=1:x + (let x=2:x+x) + x should eval to 6: the two x's outside the inner let should be bound to 1 while the inner x's should be bound to 2. If you can't write the entire program, just write down your thoughts in comments: the solution will be given away in the next assignment. *) module INTERPRETER open System; open Microsoft.FSharp.Math; open System.Text.RegularExpressions; open System.Collections.Generic; // for Dictionary (hashmap) ///////////////1. Abstract Syntax (expression trees) //// type expr = Val of int | Plus of (expr*expr) | Times of (expr*expr) | Minus of (expr*expr) | Neg of expr | Divide of (expr*expr) | Equals of (expr*expr) | Lessthan of (expr*expr) | Lteq of (expr*expr) | Power of (expr*expr) | Remainder of (expr*expr) |Ifelse of expr*expr*expr | Letexp of string*expr*expr | Var of string | Sym of string | EOF ;; // Sym and EOF are inserted into the abstract syntax for easier integration // with the parsing stage, because the parse stack needs to contain both // expressions and token symbols (non-terminal and terminal grammar symbols). // the function 'proper' below is used to distinguish Sym and EOF from other // expressions. //////////////2. Evaluator (Interpreter of expression trees) //////// // eval is mutable so we can easily extend it later... // mutable funs can't be recursive calls, unless we declare eval first: let mutable eval = fun (exp:expr) -> 0;; // types must match eval <- fun exp -> match exp with | Val(v) -> v | Plus(a,b) -> (eval a) + (eval b) | Times(a,b) -> (eval a) * (eval b) | Minus(a,b) -> (eval a) - (eval b) | Divide(a,b) -> (eval a) / (eval b) | Neg(a) -> -1 * (eval a) | x -> raise (Exception("eval case not supported: "+string(x)));; ///////IMPORTANT: ****** // We can "extend" the eval function modularly by reassigning it to // a new function that calls the original function in the base case. // the following adds new abilities to eval: let mutable base_eval = eval; eval <- fun exp -> // reassigning eval to a "new" lambda term match exp with | Power(a,b) -> int(Math.Pow(float(eval a), float(eval b))) | Remainder(a,b) -> (eval a) % (eval b) | x -> base_eval x;; // calls original function // Note that this works much better than defining a new function like 'eval2', // because other functions that call eval need not change at all. Furthermore, // the recursive calls to eval in base_eval is also changed, so Power, Remainder // expressions can be embedded within other kinds of expressions. This // is a little like using dynamic scoping to inject new behaviors into // functions, except here we're not just doing it locally. //////////////////////////////////////////////////// /////////////////3. LEXICAL ANALYSER (LEXER) // (ignore most here) ///////////// Lexical Symbol Specification // use regular expression to represent possible operator symbols (not all used) let mutable operators = "([()\+\-\*/:;%^,.]|\s|\[|\]|{|}|=)"; let mutable keywords = ["if";"then";"else";"while";"let";"eq";"leq";"lt";"print";"begin";"end";"lambda";"def";"cin";"cout"]; let mutable curops = operators.[0..operators.Length-2]; operators <- curops + "|" + "\"[^\"]*\"" + ")"; // Lexer reads expressions like "7+3*2" and converts them to list of symbols: // [|Val(7.0);Sym("+");Val(3.0);Sym("*");Val(2.0);EOF|]; //// This is the first stage of parsing, called lexical analysis. // now build list of tokens (only lower case chars are recognized as vars) let maketoken x = try Val(int(x)) // exception handling in F# with | excp -> match x with | y when (List.contains y keywords) -> Sym(y) | y when int(y.[0])>96 && int(y.[0])<123 -> Var(y) | y when y.[0]='\"' -> Sym(y.[1..y.Length-2]) | y -> Sym(y);; let tokenize (s2:string[]) = let rec itokenize ax = function // inner tail-recursive function | i when i>=0 -> let t = s2.[i].Trim() if (t<>"") then itokenize (maketoken(s2.[i])::ax) (i-1) else itokenize ax (i-1) | _ -> ax; itokenize [EOF] (s2.Length-1);; let mutable TS =[];; // global list of tokens let mutable TI = 0;; // global index for TS stream;; ///// The following function takes an input string and sets global // variable TS, which is a stream of tokens (see commented example above // for (7+3*2)). It also sets TI, which is a global index into TS. let mutable lexer = fun (inp:string) -> // main lexical analysis function let s2 = Regex.Split(inp,operators) TS <- tokenize s2 TI <- 0 // reset if needed // printfn "token stream: %A" TS;; ///////////////////////////4. ////////////////////////// SHIFT-REDUCE PARSER //////////////////////// // shift-reduce conflicts are broken by comparing operator precedence of // reducible construct with lookahead symbol. Percedences here are not // exact: for more exact precedence orderings, consult online sources like // https://hofstra.blackboard.com/ultra/courses/_57231_1/cl/outline // use hash table (.Net Dictionary) to associate each operator with precedence // Not every symbol here is used for the online calculator example. let prectable = Dictionary();; prectable.["+"] <- 200; prectable.["-"] <- 300; prectable.["*"] <- 400; prectable.["/"] <- 500; prectable.["("] <- 990; prectable.[")"] <- 20; prectable.[":"] <- 42; // trial by error... got to be careful prectable.["="] <- 30; prectable.["."] <- 20; prectable.["_"] <- 600; prectable.["eq"] <- 35; prectable.["leq"] <- 35; prectable.["lt"] <- 35; prectable.["if"] <- 20; prectable.["let"] <- 42; prectable.["while"] <- 40 prectable.["else"] <- 18; //20 prectable.["cin"] <- 100 // same as Val prectable.["cout"] <- 22; prectable.[";"] <- 20; prectable.["begin"] <- 20; prectable.["end"] <- 20; /// Lexical Token stream and abstract syntax combined into expr, the // following will try to distinguish them. // Proper expression check (shallow): separates proper expression from tokens // This is the price to pay for using strings: no compile-time verification let mutable proper = fun f -> match f with | Sym(_) -> false | EOF -> false | _ -> true;; // everything else is considered a proper expression // check if a list of expressions are all proper let rec all_proper n = match n with | [] -> true | (car::cdr) -> proper(car) && all_proper(cdr);; // note : short-circuited boolean makes this tail-recursive. // function defines precedence of symbol, which includes more than just Syms let mutable precedence = fun s -> match s with | Val(_) -> 100 | Var(_) -> 100 | Sym(s) when prectable.ContainsKey(s) -> prectable.[s] | EOF -> 10 | _ -> 11;; // Function defines associativity: true if left associative, false if right... // Not all operators are left-associative: the assignment operator is // right associative: a = b = c; means first assign c to b, then b to a, // as is the F# type operator ->: a->b->c means a->(b->c). let mutable leftassoc = fun e -> match e with | _ -> true; // most operators are left associative. // check for precedence, associativity, and proper expressions to determine // if a reduce rule is applicable. let mutable checkreducible = fun (a,b,elist) -> let (pa,pb) = (precedence(a),precedence(b)) ((a=b && leftassoc(a)) || pa>=pb) && all_proper(elist); // parse takes parse stack and lookahead; default is shift ////////////////// HERE IS THE HEART OF THE SHIFT-REDUCE PARSER //////// /// Must read the stack right to left let mutable parse = fun (x:expr list,expr) -> EOF // dummy for recursion parse <- fun (stack,lookahead) -> match (stack,lookahead) with | ([e],EOF) when proper(e) -> e // base case, returns an expression | (Sym(")")::e1::Sym("(")::t, la) when checkreducible(Sym("("),la,[e1]) -> parse (e1::t,la) // E->(E) | (e2::Sym("+")::e1::cdr,la) when checkreducible(Sym("+"),la,[e1;e2]) -> let e = Plus(e1,e2) in parse(e::cdr,la) // E->E+E | (e2::Sym("*")::e1::cdr,la) when checkreducible(Sym("*"),la,[e1;e2]) -> let e = Times(e1,e2) in parse(e::cdr,la) // E->E*E | (e2::Sym("-")::e1::cdr,la) when checkreducible(Sym("-"),la,[e1;e2]) -> let e = Minus(e1,e2) in parse(e::cdr,la) // E->E-E // this case doesn't distinguish between unary/binary minus | (e2::Sym("/")::e1::cdr,la) when checkreducible(Sym("/"),la,[e1;e2]) -> let e = Divide(e1,e2) in parse(e::cdr,la) // E->E/E | (e2::Sym("^")::e1::cdr,la) when checkreducible(Sym("^"),la,[e1;e2]) -> let e = Power(e1,e2) in parse(e::cdr,la) // E->E^E | (e1::Sym("-")::t, la) when checkreducible(Sym("-"),la,[e1]) -> // "rrc" parse (Neg(e1)::t,la) // E-> -E | (st,la) when (TI < TS.Length-1) -> // SHIFT TI <- TI+1; // increment global var let newla = TS.[TI] parse (la::st,newla) // push lookahead on stack, load next lookahead | (st,la) -> // Error let mutable ms = "parsing error: " for x in (la::st) do ms <- ms+string(x)+"::" raise (Exception(ms));; //"parsing error: "+string(la::st)));; ///////////////////////////////// ////// AOP-style "advice" to trace parse, eval functions //let mutable advice_trace = fun (target:unit->unit) -> // superceded // advice to trace before/after parse, eval // warning: using the advice cancels tail recursion optimization for parse let mutable traceopt = fun (before,after,target:unit->unit) -> let proceed_eval = eval let proceed_parse = parse eval <- fun e -> if before then printfn "evaluating %A" e let v = proceed_eval e if after then printfn " eval returned %d" v v // return parse <- fun(st,la) -> if before then printfn "parsing %A with lookahead %A" st la let e = proceed_parse(st,la) if after then printfn " parse returned expression %A" e e //return target() // execute target operation eval <- proceed_eval parse <- proceed_parse;; ///// ////////////////5. main // my main is mutable! let mutable main = fun () -> Console.Write("Enter Expression: ") let input_string = Console.ReadLine() lexer(input_string) // constructs global token stream TS let expr_tree = parse([],TS.[0]) let v = eval expr_tree let ps=(sprintf "\nValue of %s = %d\n" input_string v) Console.WriteLine(ps); //main(); // call and runs main --- call main after program's been extended //traceopt(false,true,main) // run main under after-trace advice