Martin Sulzmann
We consider the language of arithmetic expressions.
We wish to remove as many parantheses as possibles from expressions and assume the following precedence rules.
* binds tigher than +
+ binds tigher than <
< binds tigher than |
The “binds tigher than” relation is transitive. Hence, operation
*
binds tigher than operation |
.
Based on these precedence rules
There are two tasks.
Implement a “nicer” pretty printer
Implement a “better” parser
We repeat the Haskell representation of the language of arithmetic expressions.
Here is a simple pretty printer.
instance Show E where
show (N i) = show i
show (B True) = "t"
show (B False) = "f"
show (Plus e1 e2) = "(" ++ show e1 ++ " + " ++ show e2 ++ ")"
show (Mult e1 e2) = "(" ++ show e1 ++ " * " ++ show e2 ++ ")"
show (LessThan e1 e2) = "(" ++ show e1 ++ " < " ++ show e2 ++ ")"
show (Or e1 e2) = "(" ++ show e1 ++ " | " ++ show e2 ++ ")"
-- Short-hands
zero = N 0
one = N 1
true = B True
false = B False
-- Examples
ex = Plus (Mult one zero) one
ex1 = Mult zero (Plus zero one)
ex2 = Or (LessThan zero one) false
ex3 = Or (LessThan ex ex1) false
We find the following.
print ex ==> ((1 * 0) + 1)
print ex1 ==> (0 * (0 + 1))
print ex2 ==> ((0 < 1) | f)
print ex3 ==> ((((1 * 0) + 1) < (0 * (0 + 1))) | f)
This looks ugly! So many parantheses.
Write a nicer pretty printer where we remove as many parantheses as possibly by exploiting the above precedence rules.
We wish to obtain the following outputs.
print ex ==> 1 * 0 + 1
print ex1 ==> 0 * (0 + 1)
print ex2 ==> 0 < 1 | f
print ex3 ==> 1 * 0 + 1 < 0 * (0 + 1) | f
Point to note. Some parantheses are still necessary. See example
ex1
.
data Token = NUM Int
| BOOL Bool
| OPEN | CLOSE
| PLUS | MULT | LESSTHAN | OR deriving Show
type Tokens = [Token]
tokenize :: String -> Maybe [Token]
tokenize [] = Just []
tokenize (x:xs)
| x == ' ' = tokenize xs
| isDigit x = let (ds,ys) = collectDigits (x:xs)
in case tokenize ys of
Just ts -> Just (NUM (read ds :: Int) : ts)
f -> f
| otherwise = case lookup x prims of
Just t -> case tokenize xs of
Just ts -> Just (t:ts)
f -> f
Nothing -> Nothing
collectDigits xs = (takeWhile isDigit xs, dropWhile isDigit xs)
isDigit x = elem x ['0'..'9']
prims = [('(',OPEN), (')',CLOSE),
('+',PLUS), ('*',MULT),
('<', LESSTHAN), ('|', OR),
('t', BOOL True), ('f', BOOL False)]
parse :: String -> Maybe E
parse xs = do
ys <- tokenize xs
(zs,e) <- parseE ys
case zs of
[] -> return e
parseE :: Tokens -> Maybe (Tokens, E)
parseE xs = do (ys,e1) <- parseT xs
parseE2 ys e1
parseT :: Tokens -> Maybe (Tokens, E)
parseT xs = do (ys,e) <- parseF xs
parseT2 ys e
parseE2 :: Tokens -> E -> Maybe (Tokens, E)
parseE2 [] e1 = return ([],e1)
parseE2 (PLUS:xs) e1 = do (ys,e2) <- parseT xs
parseE2 ys (Plus e1 e2)
parseE2 (OR:xs) e1 = do (ys,e2) <- parseT xs
parseE2 ys (Or e1 e2)
parseE2 xs e = return (xs,e)
parseT2 :: Tokens -> E -> Maybe (Tokens, E)
parseT2 [] e1 = return ([],e1)
parseT2 (MULT:xs) e1 = do (ys, e2) <- parseF xs
parseT2 ys (Mult e1 e2)
parseT2 (LESSTHAN:xs) e1 = do (ys, e2) <- parseF xs
parseT2 ys (LessThan e1 e2)
parseT2 xs e = return (xs,e)
parseF :: Tokens -> Maybe (Tokens, E)
parseF [] = fail "Syntax error"
parseF (x:xs) =
case x of
BOOL b -> return (xs, B b)
NUM i -> return (xs, N i)
OPEN -> do (ys,e) <- parseE xs
case ys of
(CLOSE:zs) -> return (zs,e)
_ -> fail "Syntax error"
The above parser does not enforce the precdence rules.
parse "1 + 3 < 4" ==> Just (1 + (3 < 4))
But +
is meant to bind tighter than
<
.
Either
adapt the parser (or simply write a completely new parser) such that the precedence rules are enforced, or
clearly specify under which conditions parantheses are necessary.