LR Parsing - Examples

Martin Sulzmann

Highlights

Example 1 - Arithmetic expressions

Let’s build the LR(0) item set for the following grammar.

S -> E
E -> id
E -> (E)
E -> E+E
E -> E*E

We start with

I0 : { S -> .E, E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

Recall that the . indicates how much of a right-hand side we have already recognized. If the . hits a non-terminal, we pull in the respective rules for this non-terminal.

We build item sets by checking which ‘symbols’ can be consumed. A symbol can either be a terminal or a non-terminal (symbol, here you go we overload the term symbol).

Starting from I0, possible symbols to be consumed are: id, (, E. This results into the following three item sets.

consume id
I1: { E -> id. }

consume (
I2: { E -> (.E), E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

consume E
I3: { S -> E., E -> E.+E, E -> E.*E }

Observations:

Let’s continue to build the remaining item sets and transitions. We write StartItem - symbol -> NextItem to indicate the resulting transitions of the FSA obtained by building item sets.

I0 - id -> I1
I0 - ( ->  I2
I0 - E ->  I3


I2 - id -> I1
I2 - ( -> I2
I2 - E -> I4

I3 - + -> I5
I3 - * -> I6

I4 - ) -> I7
I4 - + -> I5
I4 - * -> I6

I5 - id -> I1
I5 - (  -> I2
I5 - E  -> I7

I6 - id -> I1
I6 - (  -> I2
I6 - E  -> I8

I7 - +  -> I5
I7 - *  -> I6

I8 - +  -> I5
I8 - *  -> I6


I0 : { S -> .E, E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

// I0 - id -> I1
I1: { E -> id. }

// I0 - ( -> I2
I2: { E -> (.E), E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

// I0 - E -> I3
I3: { S -> E., E -> E.+E, E -> E.*E }

// I2 - E -> I4
I4 : { E -> (E.), E -> E.+E, E -> E.*E }

// I3 - + -> I5
I5 : { E -> E+.E, E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

// I3 - * -> I6
I6 : { E -> E*.E, E -> .id, E -> .(E), E -> .E+E, E -> .E*E }

// I5 - E -> I7
I7: { E -> E+E., E -> E.+E, E -> E.*E }

// I6 - E -> I8
I8: { E -> E*E., E -> E.+E, E -> E.*E }

Observations.

How to resolve these conflicts?

  1. Rewrite into an unambiguous grammar!
S -> E
E -> T+E
E -> T
T -> F*T
T -> F
F -> id
F -> (E)
  1. Customize LR parser as follows:

This special case can be deal with in yacc by specifying that * has a higher precedence than +.

Example 2 - Dangling else

Recall

Stmt -> if E then Stmt else Stmt
Stmt -> if E then Stmt
Stmt -> other

This grammar is ambiguous. For example, consider

if expr then if expr then other else other

Disambiguation: “Match closest then”.

Stmt -> Match
Stmt -> Unmatch
Stmt -> other

Match -> if E then Match else Match
Match -> other

Unmatch -> if E then Stmt
Unmatch -> if E then Match else Unmatch

This grammar should be accepted by yacc, hence, belongs to the class LALR(1).