# Sample Grammar for matching brackets (), [] and {}

# the "absyntype" is the rust type of the value returned by the parse function.
# in this example, the absyntype represents the number of pairs of matching
# brackets of types (), [] and {}
absyntype (u32,u32,u32)
# the absyntype can be any type that implements the Default trait.
# (u32,u32,u32) defaults to (0,0,0)

# This grammar contains 3 non-terminal symbols, although the parser generator
# may create additional symbols.
nonterminals E S WS
# Nonterminals and terminals must be declared.
terminals ( ) [ ] LBRACE RBRACE Whitespace
# topsym is the top or start symbol of the grammar
topsym S
# on error, the parser will skip past the next whitespace and keep parsing
resync Whitespace

# The following are the grammar production rules.
E --> ( S:(a,b,c) ) {(a+1,b,c)}
E --> [ S:(a,b,c) ] {(a,b+1,c)}
E --> LBRACE S:(a,b,c) RBRACE {(a,b,c+1)}
S --> WS { (0,0,0) }
S --> S:(a,b,c) E:(p,q,r) {(a+p,b+q,c+r)}
WS -->
WS --> WS Whitespace

# The following declarations generate a lexical scanner "bracketslexer"
# using the built-in StrTokenizer, which takes strings or text files as input.
# Other tokenizers can also be used by adapting them to the Tokenizer trait.
lexname LBRACE {
lexname RBRACE }
lexvalue Whitespace Whitespace(_) (0,0,0)
lexattribute keep_whitespace = true
# 'lexname' defines the lexical (textual) form of terminal symbols that would
# otherwise clash with reserved symbols used in rustlr grammars.
# 'lexvalue' defines that a "RawToken" of the form Whitespace(_) should be
# translated to the terminal symbol 'Whitespace' with the semantic value
# (0,0,0). 'lexattribute' sets other options of the lexer (see tutorial for
# details).

!//anything on a ! line is injected verbatim into the generated parser
!
!fn main() {
!  let argv:Vec<String> = std::env::args().collect(); // command-line args
!  let mut parser1 = make_parser();
!  let mut lexer1 = bracketslexer::from_str(&argv[1]);
!  let result = parser1.parse(&mut lexer1);
!  if !parser1.error_occurred() {
!    println!("parsed successfully with result {:?}",&result);
!  }
!  else {println!("parsing failed; partial result is {:?}",&result);}
!}//main

EOF

Everything after "EOF" is ignored and can be used for more comments:

Each grammar rule has an associated "semantic action" enclosed in {}'s.
The semantic action is a piece of rust code that must return a value of the
declared absyntype. If no action is specified, one will be created that
returns absyntype::default() - the chosen absyntype must implement the
Default trait.

Although not used in this grammar, multiple productions for the same
left-hand side nonterminal may be specified on one line using the |
character. It is also possible to specify a rule on multiple lines using
==> instead of --> and ending with <==

Note that {, } and | cannot also be used as terminal symbols. The grammar
must specify other names such as LBRACE and RBRACE, and the lexical scanner
must translate "{" and "}" into these tokens.

Rustlr defines a 'Tokenizer' trait and allows the use of any lexical
analyzer (tokenizer) that implements the trait. It can also generate a
usable tokenizer automatically from the grammar and from lexname, lexvalue
and lexattribute declarations.

In this example, we've used the ! lines to inject enough code into the
generated parser so that it is a self-contained program. Normally, however,
only a few ! lines are needed for 'use' type declarations.

Each grammar symbol, terminal as well as nonterminal, always has a value of
the absyntype associated with it. When none is specified,
absyntype::default() is automatically used. We can match the value with a
simple pattern (without whitespace) such as (a,b,c). Most often, however,
only one variable name is used. The semantic action has access to these
values when forming the return value. The semantic action is otherwise
injected verbatim into the generated parser: rustlr is not responsible if
Rust cannot compile your semantic actions.

To generate a parser for this grammar with the rustlr executable
(cargo install rustlr):

> rustlr brackets.grammar lalr

This produces a file bracketsparser.rs. Since this file contains a main,
you can create a new crate with cargo and replace main.rs with the contents
of the parser file. Be sure to insert rustlr = "0.2" under [dependencies] in
the crate's Cargo.toml.
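
To cross-check the tuple that the generated parser prints, the counts
accumulated by the semantic actions above can be reproduced by a small
hand-written function. The following is only a sketch and does not use
rustlr at all: count_pairs is a hypothetical helper name, it uses a plain
stack instead of an LR parser, and it skips whitespace anywhere, which is
assumed to be at least as lenient as the grammar's WS rules.

```rust
/// Hypothetical helper (not part of rustlr or the generated parser).
/// Returns the number of matched (), [] and {} pairs, in that order,
/// mirroring the (u32,u32,u32) absyntype of the grammar.
/// Returns None if the brackets are unbalanced or mismatched.
fn count_pairs(input: &str) -> Option<(u32, u32, u32)> {
    let mut stack: Vec<char> = Vec::new(); // currently open brackets
    let mut counts = (0u32, 0u32, 0u32);
    for ch in input.chars() {
        match ch {
            '(' | '[' | '{' => stack.push(ch),
            ')' => {
                // pop()? returns None early if nothing is open
                if stack.pop()? != '(' { return None; }
                counts.0 += 1;
            }
            ']' => {
                if stack.pop()? != '[' { return None; }
                counts.1 += 1;
            }
            '}' => {
                if stack.pop()? != '{' { return None; }
                counts.2 += 1;
            }
            c if c.is_whitespace() => {} // assumption: ignore all whitespace
            _ => return None,            // any other character is rejected
        }
    }
    // leftover open brackets mean the input was unbalanced
    if stack.is_empty() { Some(counts) } else { None }
}

fn main() {
    // two () pairs, one [] pair, one {} pair
    println!("{:?}", count_pairs("([]{})()")); // prints Some((2, 1, 1))
}
```

For an input that the grammar accepts, the tuple inside Some(..) should
agree with the result reported by the generated parser.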