Haskell project (“exp”)

Martin Sulzmann

Overview

We consider the language of arithmetic expressions.

We wish to remove as many parantheses as possibles from expressions and assume the following precedence rules.

* binds tigher than +
+ binds tigher than <
< binds tigher than |

The “binds tigher than” relation is transitive. Hence, operation * binds tigher than operation |.

Based on these precedence rules

There are two tasks.

  1. Implement a “nicer” pretty printer

  2. Implement a “better” parser

Implement a “nicer” pretty printer

We repeat the Haskell representation of the language of arithmetic expressions.

data E = N Int | Plus E E | Mult E E
       | B Bool | LessThan E E | Or E E

Here is a simple pretty printer.

instance Show E where
    show (N i) = show i
    show (B True) = "t"
    show (B False) = "f"
    show (Plus e1 e2) = "(" ++ show e1 ++ " + " ++ show e2 ++ ")"
    show (Mult e1 e2) = "(" ++ show e1 ++ " * " ++ show e2 ++ ")"
    show (LessThan e1 e2) = "(" ++ show e1 ++ " < " ++ show e2 ++ ")"
    show (Or e1 e2) = "(" ++ show e1 ++ " | " ++ show e2 ++ ")"

Examples

-- Short-hands
zero = N 0
one = N 1
true = B True
false = B False

-- Examples
ex = Plus (Mult one zero) one
ex1 = Mult zero (Plus zero one)
ex2 = Or (LessThan zero one) false
ex3 = Or (LessThan ex ex1) false

We find the following.

print ex    ==>  ((1 * 0) + 1)

print ex1   ==>  (0 * (0 + 1))

print ex2   ==>  ((0 < 1) | f)

print ex3   ==>  ((((1 * 0) + 1) < (0 * (0 + 1))) | f)

This looks ugly! So many parantheses.

Your task

Write a nicer pretty printer where we remove as many parantheses as possibly by exploiting the above precedence rules.

We wish to obtain the following outputs.

print ex    ==>  1 * 0 + 1

print ex1   ==>  0 * (0 + 1)

print ex2   ==>  0 < 1 | f

print ex3   ==>  1 * 0 + 1 < 0 * (0 + 1) | f

Point to note. Some parantheses are still necessary. See example ex1.

Implement a “better” parser

Simple top-down parser written by hand

data Token = NUM Int
           | BOOL Bool
           | OPEN | CLOSE
           | PLUS | MULT | LESSTHAN | OR deriving Show

type Tokens = [Token]

tokenize :: String -> Maybe [Token]
tokenize [] = Just []
tokenize (x:xs)
 | x == ' '  = tokenize xs
 | isDigit x = let (ds,ys) = collectDigits (x:xs)
               in case tokenize ys of
                    Just ts -> Just (NUM (read ds :: Int) : ts)
                    f       -> f
 | otherwise = case lookup x prims of
                Just t -> case tokenize xs of
                           Just ts -> Just (t:ts)
                           f       -> f
                Nothing -> Nothing


collectDigits xs = (takeWhile isDigit xs, dropWhile isDigit xs)
isDigit x = elem x ['0'..'9']
prims = [('(',OPEN), (')',CLOSE),
         ('+',PLUS), ('*',MULT),
         ('<', LESSTHAN), ('|', OR),
         ('t', BOOL True), ('f', BOOL False)]


parse :: String -> Maybe E
parse xs = do
  ys <- tokenize xs
  (zs,e) <- parseE ys
  case zs of
    [] -> return e

parseE :: Tokens -> Maybe (Tokens, E)
parseE xs = do (ys,e1) <- parseT xs
               parseE2 ys e1

parseT :: Tokens -> Maybe (Tokens, E)
parseT xs = do (ys,e) <- parseF xs
               parseT2 ys e

parseE2 :: Tokens -> E -> Maybe (Tokens, E)
parseE2 [] e1 = return ([],e1)
parseE2 (PLUS:xs) e1 = do (ys,e2) <- parseT xs
                          parseE2 ys (Plus e1 e2)
parseE2 (OR:xs) e1 = do (ys,e2) <- parseT xs
                        parseE2 ys (Or e1 e2)
parseE2 xs e          = return (xs,e)

parseT2 :: Tokens -> E -> Maybe (Tokens, E)
parseT2 [] e1 = return ([],e1)
parseT2 (MULT:xs) e1 = do (ys, e2) <- parseF xs
                          parseT2 ys (Mult e1 e2)
parseT2 (LESSTHAN:xs) e1 = do (ys, e2) <- parseF xs
                              parseT2 ys (LessThan e1 e2)
parseT2 xs e          =  return (xs,e)

parseF :: Tokens -> Maybe (Tokens, E)
parseF [] = fail "Syntax error"
parseF (x:xs) =
  case x of
   BOOL b -> return (xs, B b)
   NUM i -> return (xs, N i)
   OPEN -> do (ys,e) <- parseE xs
              case ys of
                (CLOSE:zs) -> return (zs,e)
                _          -> fail "Syntax error"

Your task

The above parser does not enforce the precdence rules.

parse "1 + 3 < 4"   ==>   Just (1 + (3 < 4))

But + is meant to bind tighter than <.

Either