RUSTLR: Bottom-Up Parser Generator for Rust, Version 0.3.x


The ultimate goal of Rustlr is to round up all the hoofed herbivores into a tool that's both usable and enjoys the expressive power of LR grammars and beyond. It's been decades since Donald Knuth proved that every deterministic context free language has an LR(1) grammar. However, such theoretical properties never settled the dispute between LR-style parer generators and those based on other techniques. One reason is that users of LR parsing have always faced a steep learning curve. How an LR parser works is not as intuitive as, for example, a recursive descent parser. To alleviate this problem Rustlr implements a collection of features including:

Rustlr defines a trait that allows the use of any lexical analyzer as long as it's adopted to the trait. However, rustlr also provides a general purpose, zero-copy lexer that suffices for many examples. A lexer specific to a grammar can be automatically generated from a minimal set of declarations.

With future releases, Rustlr will also be able to generate parsers for languages other than Rust. With version 0.3.7, it can generate a parser for F#, although not all capabilities are yet available. F# is the .Net version of Ocaml and lacks options when it comes to parser generation. Rustlr will mainly target typed, functional languages that support algebraic types and pattern matching. The documentation format on docs.rs is a good technical reference but does not serve as a tutorial.

This tutorial is evolving as Rustlr is being enhanced with new features. The project aims to be backwards compatible as much as possible.


Tutorial by Examples

The tutorial is organized around a set of examples, starting with the simplest, with each example explaining a set of more advanced features. All features of rustlr will eventually be explained as you progress through the examples. It would be helpful if the reader is familiar with some basic bottom-up parsing concepts, such as those covered in typical compiler texts.

The chapters in bold listed below are complete. The others provide additional examples and generally contain enough comments to be readable. The latest and most advanced features of Rustlr are described in Chapter 4 and in the Appendix.

  1. Chapter 1: Classic calculator, unambiguous LALR.
  2. Chapter 2: Enhanced calculator with more advanced features, including interactive training for error reporting.
  3. Chapter 3: Semantic actions returning multiple value types. (older version)
  4. Chapter 4: Automatically Generating the AST
  5. Chapter 5: Using Regex Operators *, + and ? in Grammars
  6. Chapter 6: Generating Bump-allocated ASTs that enable recursive pattern matching
  7. Chapter 7: Error Recovery Options
  8. Advanced Example: Building a Parser for C. (under construction). link to crate
  9. Special Example: Non-context free language, using External State. Link to full crate
  10. Special Chapter: Generating a Parser for F#
  11. Appendix: Experimental Features (Delayed Reductions and the Wildcard Token)
  12. Additional Full Example: Yacc converter. Create with rustlr grammar that builds a parser for converting Yacc grammars to Rustlr format, stripping away all C declarations and actions.
  13. Additional Full Example: Lambdascript. Program implementing and tracing beta-reduction steps for the untyped lambda calculus. This crate was created using rustlr and this grammar.
  14. (Deprecated versions of chapter 1 and chapter 2)

    Additional Grammars (with no semantic actions)

  15. LR(1) but non-LALR(1) grammar
  16. LALR Grammar for full Java version 1.4
  17. ANSI C grammar (adopted from yacc syntax)

References