Haskell: All about functions and lists

Martin Sulzmann

Haskell

Lot’s of online resources Learning Haskell

Great tutorial Learn You a Haskell

Great books

We will use the GHC compiler system. Specifically, we’ll use ghci which is GHC’s interactive environment (in essence an interpreter).

Download GHC (includes ghci) here

Of course, there are some fancy Haskell IDEs out there. Check here

and you can even run Haskell in your browser. Check out Haskell Online

For our purpose, all we need is:

ghci Demo

:l FILENAME
:q
:t EXPRESSION

Further Details on ghci

Further details

Prelude> (\x -> x)

<interactive>:2:1:
    No instance for (Show (t0 -> t0))
      (maybe you haven't applied enough arguments to a function?)
      arising from a use of ‘print’
    In a stmt of an interactive GHCi command: print it
Prelude> map (+1) [1,2,3]
[2,3,4]
Prelude> :t map
map :: (a -> b) -> [a] -> [b]
Prelude> :t 1
1 :: Num a => a
Prelude> 1 + True

<interactive>:6:3:
    No instance for (Num Bool) arising from a use of+
    In the expression: 1 + True
    In an equation for ‘it’: it = 1 + True
Prelude> :t (\(x:xs) -> x ++ xs)

<interactive>:1:18:
    Occurs check: cannot construct the infinite type: a ~ [a]
    Expected type: [a]
      Actual type: [[a]]
    Relevant bindings include
      xs :: [[a]] (bound at <interactive>:1:6)
      x :: [a] (bound at <interactive>:1:4)
    In the second argument of ‘(++)’, namely ‘xs’
    In the expression: x ++ xs

Let’s look at the expression (\(x:xs) -> x ++ xs) in more detail.

Prelude> :t (++)
(++) :: [a] -> [a] -> [a]

Motivation: Functional abstraction

Recall sorting

quicksort gtThan [] = []
quicksort gtThan (p:xs) = (quicksort gtThan lesser) ++ [p] ++ (quicksort gtThan greater)
    where
        ltEqThan = \x -> \y -> not (gtThan x y)
        lesser = filter (gtThan p) xs
        greater = filter (ltEqThan p) xs

Sorting Boolean values (“True > False”)

sortBools = quicksort (\x -> \y -> x && (not y))

In the above, we can see

Content

Quick start

Expression-based and implicit types

Haskell looks different.

example1 x = let y = x
             in x + y

Numbers are overloaded

example2 :: Num a => a -> a
example2 x = let y = x
             in x + y

Numbers and associated operations like +, -, … are overloaded. More on this later

Pure versus impure functions

Pure

example3 :: String -> String
example3 x = let y = x ++ "Hallo"
             in y

Pure = Functions behavior solely defined by its inputs

A pure function will always compute the same result for the same set of inputs (no side effects)

Impure

Impure = not pure some side effects

We may obtain different results/behaviors for the same sets of inputs.

example4 :: String -> IO String
example4 x = do let y = x ++ "Hallo"
                z <- getLine
                print (y++z)
                return y

Function definitions

We consider the case of functions that operate on simple types (Integer and Bool).

isEven :: Integer -> Bool
isEven n
  | n `mod` 2 == 0  = True
  | otherwise       = False
isEven2 :: Integer -> Bool
isEven2 n =
   if n `mod` 2 == 0
   then True
   else False

Function composition

We can compose functions via the familiar syntax know from math. Given functions f and g, we can define f . g as the composition where (f . g) x is equivalent to f (g (x)).

Example.

isOdd :: Integer -> Bool
isOdd = not . isEven

Error messages

*Main> isEven -2

<interactive>:31:8:
    No instance for (Num (Integer -> Bool))
      arising from a use of `-'
    Possible fix:
      add an instance declaration for (Num (Integer -> Bool))
    In the expression: isEven - 2
    In an equation for `it': it = isEven - 2
*Main> isEven (-2)
True

Functions with local definitions (let)

isEven3 :: Integer -> Bool
isEven3 n = let m = n `mod` 2
                p = m == 0
            in if p then True
               else False
isEven3a :: Integer -> Bool
isEven3a n = let m :: Integer
                 m = n `mod` 2
                 p :: Bool
                 p = m == 0
             in if p then True
                else False

Functions with local definitions (where)

isEven4 :: Integer -> Bool
isEven4 n = if p then True
            else False
            where
              m = n `mod` 2
              p = m == 0

Partial function application

Consider

plus x y = x + y

The plus function takes two integer values and adds them up. How to call plus?

To add 1 to 2 we call

plus 1 2

which then yields

3

Function application in Haskell is written by applying the function to its argument separated by white-space.

There are two arguments. Hence, we call plus 1 2.

In languages such as C and Java you are used to provide all arguments when calling a function. For example, plus(1,2). But this does not correspond to the definition of plus here.

Recall

plus x y = x + y

The function plus expects one argument at a time. Hence, the call

plus 1 2

corresponds to

(plus 1) 2

What is the intermediate expression plus 1? This is a partially applied function where the first argument x is bound to 1.

Can we make use of partial functions? Yes, consider the following alternative definition of inc.

plus x y = x + y

inc = plus 1

Pretty cool!

Let’s compare the function calls

plus 1 2

and

(plus 1) 2

Function application is left associative and therefore the call plus 1 2 is actually a short-form for (plus 1) 2.

Lambda functions

Haskell supports anonymous functions also known as “lambda functions”. Lambda functions are functions but they are not associated to a function name.

Here’s an example of a function with a name.

inc x = x + 1

How to define functions without a name (lambda functions)? What makes a function? A function takes some input (function parameter) and yields some output (function body).

In Haskell, the notation for lambda functions (lambdas for short) is as follows.

\x -> x + 1

The above is equivalent to inc.

Lambdas are values and can appear anywhere. The following definition is effectively equivalent to inc.

inc2 = \x -> x + 1

Here’s another example.

f x y = x + y

versus

f2 = \x -> (\y -> x + y)

The body of the outer lambda expression yields another lambda function!

We try to remove parentheses from program to avoid clutter. We adopt the convention that a lambda body extends furthest to the right. Hence,

\x -> \y -> x + y

is a short-form for

\x -> (\y -> x + y)

Higher-order and first class functions

Higher-order function refers to a function where functions may appear as arguments as well as in the result.

First class functions means that functions are treated like values.

Having first class functions usually implies that the language supports higher-order functions. Haskell supports both directly.

The C language supports function pointers. Does this imply that C supports first-class functions? Note quite. Functions can indeed be arguments and return values in C but there are restrictions on how to construct functions. For example, in C all function definitions appear at the same scoping level. Supporting first class functions usually means that you can construct functions locally (like values within a function). This is not possible in C.

Here are some function examples in Haskell.

apply f x = f x

inc3 = apply (\x -> 1 + x)


f x y = let g = \z -> z + x
        in (g 1) + (g y)

Tuples, currying and partial application

What about defining a function like in C where all arguments must be provided in “one go”. This is possible as Haskell supports tuples

Consider

add (x,y) = x + y

Tuples can be used as values but also as patterns in function definitions.

The function pattern add (x, y) = indicates that the argument of add is a tuple (here pair) where x refers to the left component and y refers to the right component.

We can call add as follows.

add (1,3)

Here, (1,3) is a tuple value and the argument on which we call add.

Is there any connection between plus and add? We repeat the definitions of both but use lambda notation.

plus = \x -> \y -> x + y

add = \(x,y) -> x + y

The behavior of both functions is equivalent. They both sum up two values. But the usage is different as observed above.

The interesting thing is that plus can be transformed into add and vice versa. This transformation process is called (un)currying.

curry :: ((a, b) -> c) -> a -> b -> c
curry f = \x -> \y -> f (x,y)
uncurry :: (a -> b -> c) -> (a, b) -> c
uncurry f = \(x,y) -> f x y

We find:

We say that add is the uncurried version of plus and plus is the curried version of add.

From a user perspective, it is generally preferable to define functions in “curried style”. For example, consider

apply = \f -> \x -> f x
plus = \x -> \y -> x + y

inc = apply (plus 1)

We can easily define inc in terms of plus by defining inc = plus. The use of apply is a bit contrived.

On the other hand, the “uncurried style” comes with less overhead (partial function application and closures can be avoided) and is generally more efficient.

Functions versus operators

Consider

plus = \x -> \y -> x + y

plus is a function and + is an operator. The difference is that for function application we use prefix notation whereas for binary operators such as + we use infix notation.

Compare plus 1 2 versus 1 + 2.

In Haskell, an operator like + can be turned into a function by wrapping the operator inside parentheses. Instead of

1 + 2

we can write

(+) 1 2

Similarly, a function can be turned into an operator by quoting the function. Instead of

plus 1 2

we can write

1 `plus` 2

Partial function application and operators

Instead of

(+) 1

we can write the shorter form

(1+)

We can even write

(+2)

which is equivalent to

\x -> x + 2

Further short forms.

(1 `plus`)

is equivalent to

plus 1

and so on.

Lists

Besides tuples, lists are another data structure natively supported by lists.

Lists are homogeneous. Elements must be of the same type

[Int] denotes a list of integer elements

A string is just a list of characters

[[Char]] denotes a list of list of characters (so a list of strings)

The size of a list is only bounded by the available memory.

List construction

Elements are separated by comma

Surrounded by square brackets [...]

[] represents the empty list

Here are some simple examples.

l1 :: [Int]
l1 = [1,2,3,4]

l2 :: [[Char]]
l2 = [['a','b','c'], ['d'], []]

List operators

We can add an element to an existing list via the :, pronounced “cons”, operator.

1 : [2,3,4]

yields

[1,2,3,4]

Every list can be represented by combinations of : and [].

The above list corresponds to

1 : (2 : (3 : ( 4 : [])))

As it is common in Haskell, we wish to remove parentheses and therefore the above can also be written as

1 : 2 : 3 : 4 : []

The ++, pronounced “concat”, operator allows us to append two lists.

*Main> [True] ++ [False, True]
[True,False,True]

Recall that infix operators can be used like functions

*Main> (:) True  [False, True]
[True,False,True]

*Main> (++) [True]  [False, True]
[True,False,True]

Built-in list functions

Via head we can extract the leading element, pronounced the “head”, from a list.

Via tail we drop the head element and extract the remaining list, pronounced the “tail”.

Examples.

Prelude> head [1,2,3]
1
Prelude> tail [1,2,3]
[2,3]

Both functions assume that the input is non-empty. What happens if we apply both functions to an empty list?

Prelude> head []
*** Exception: Prelude.head: empty list
Prelude> tail []
*** Exception: Prelude.tail: empty list

head and tail are examples of partial functions. Partial functions are not defined for all possible (shapes of) inputs. In contrast, a total function yields an output for each input (an exception does not count as output).

To check if a list is empty, Haskell supports the null function.

Prelude> null []
True
Prelude> null [1,2,3]
False

Functions over lists

Based on the built-in functions

null -- test if list is empty
head -- yields first element
tail -- drop first element

we can define a function to sum up the elements in a list

sumNums xs
  | null xs = 0
  | otherwise = head xs + sumNums (tail xs)
*Main> sumNums [1,2,3]
6

Pattern matching over list arguments

In Haskell, we can define functions that observe the “shape” of the input. If the input is the empty list do this, otherwise do that.

To observe the shape of the input, Haskell offers pattern matching. Pattern matching is more powerful than just distinguishing between empty and non-empty lists. We can also use pattern variables that match (parts of) the incoming input.

Here’s recast of the sum function using pattern matching.

sumNumsPat [] = 0
sumNumsPat (x:xs) = x + sumNums xs

For finite lists, we can use the patterns [x], [x1,x2], …

Example.

sumNumsPat [] = 0
sumNumsPat [x] = x
sumNumsPat [x1,x2] = x1 + x2
sumNumsPat (x:xs) = x + sumNums xs

Case expressions

Instead of patterns in function clauses we can use case expressions.

sumNumsPat ys =
  case ys of
    []     -> 0
    (x:xs) -> x + sumNums xs

Evaluation with pattern matching

Function calls are matched against their function equations but starting from top to bottom and left to right (in case there are several arguments).

Here is a same sample evaluation. Suppose we wish to evaluate sumNumsPat [1,2,3]. The individual evaluation steps are:


    sumNumsPat [1,2,3]     -- (1)
=   1 + sumNumsPat [2,3]   -- (2)
=   1 + 2 + 3              -- (3)
=   6

Pattern matching continued

We can also pattern match against Integer and Booleans.

isZero :: Int -> Bool
isZero 0 = True
isZero _ = False

myDiv n 0 = error "division by zero"
myDiv n m = n `div` m

myAnd False _ = False
myAnd True x  = x

_ is the “don’t care” pattern that matches anything.

Patterns are tried in textual order (from top to bottom)

isZero :: Int -> Bool
isZero 0 = True
isZero _ = False


isZero2 :: Int -> Bool
isZero2 _ = False
isZero2 0 = True

isZero2 always yields False!

Exercise: Last element

Write a function which extracts the last element from a list.

Hint: You’ll need some finite list patterns.

Exercise: Last element (2)

Solution:

lastElem [x] = x
lastElem (x:xs) = lastElem xs

We can evaluate by hand our solution via some equational reasoning

lastElem [1,2,3] =
--  clause lastElem (x:xs) applies
-- x bound to 1 and xs bound to [2,3]
lastElem [2,3] =
-- clause lastElem (x:xs) applies again
lastEleme [3]
-- base case lastElem [x] applies
3
lastElem []
*** Exception: Basics.hs:(117,1)-(118,29): Non-exhaustive patterns in function lastElem

Run-time exception there’s no case that deals with the empty list

Lists - summary

Values.

[]                     -- empty list
[1,2,3]                -- list with three elements

1:2:3:[]               -- writing the same list differently
1:[2,3]


: is the 'cons' operator,
takes an element and a list and yields a new list


1:2:3:[] to be interpreted as

1:(2:(3:[]))

Patterns.

A pattern describes a set of values.


[1]
[1,2]
...

Values are patterns.
The set of values described by a (pattern) value is singleton.

More interesting patterns.

[]                     -- empty list
[x,y]                  -- list with two elements
...
(x:xs)                 -- non-empty list with head x and tail xs

Values can be matched against patterns.

Matching the value [1,2,3] against (x:xs)
yields the pattern binding

x is bound to 1
xs is bound to [2,3]

We cannot match [1,2,3] against [x,y]!

Pattern variables must be linear.
That is, for each pattern, variable names in that
pattern are distinct.

Further list operators.


xs ++ ys               -- concatenation
head (x:xs) = x

Example

        head [3,4,5]
     =  head (3:[4,5])

        matches left-hand side of above function equation
        x bound to 3
        xs bound to [4,5]

        replace matching left-hand side by right-hand side

     => 3

What about the following?

    head []

There’s no match. This function call can’t be evaluated. Haskell will issue an error.

This shows that Haskell functions don’t need to be total.

last [x]    = x
last (x:xs) = last xs

Example

        last [3,4,5]
      = last (3:[4,5])
     => last [4,5]
      = last (4:[5])
     => last [5]
      = 5

Above also shows that function equations are selected from top to bottom. Notice that the left-hand sides last [x] and last (x:xs) overlap for the special case of a singleton list.

Recall that [x] can be written as x:[].

“count” example

Example makes use of


-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count :: String -> Int
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip :: (Char -> Bool) -> String -> String
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord :: String -> String
skipWord   = skip (/= ' ')

skipBlanks :: String -> String
skipBlanks = skip (== ' ')


{-
Equivalent to skip.
1. Uses case expressions.
2. If-then-else.
-}

skip2 :: (Char -> Bool) -> String -> String
skip2 p s =
 case s of
   [] -> []
   (c:cs) -> if p c
             then skip2 p cs
             else c:cs

The $ operator

($) f x = f x

Thanks to $ we can omit some parantheses.

For example,

count (skipBlanks cs)

can be (equivalently) written as

count $ skipBlanks cs

Basic list combinators - map, filter

As we have seen, many Haskell functions are recursive and have a parametric type.

Thankfully, we rarely need to write recursive functions ourselves. Rather, we use an existing set of combinators.

A combinator is a primitive function specially designed for a specific task.

Here, we take a look at the combinators map and filter

Map

Exercise: Provide two functions which implement the above tasks

Map (2)

By implementing the above exercise we encounter a certain recursion pattern.

We seek for a map combinator which behaves as follows

map f [x1,...,xn] ==> [f x1, ..., f xn]

The definition in Haskell is as follows:

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs

Map (3)

mapAddOne :: [Int] -> [Int]
mapAddOne = map (+1)

At first sight, this looks pretty dense. Don’t worry, it just takes a bit of time to get used to this style of programming

mapAddOne :: [Int] -> [Int]
mapAddOne xs = map (+1) xs

Map (4)

mapSquare :: [Int] -> [Int]
mapSquare = map square
            where
              square x = x * x

The above should be now less surprising.

The local function square could of course also be defined globally.

In fact, in Haskell it’s not necessary to give the square function a name. Haskell supports the concept of anonymous functions and we could specify the above as follows.

mapSquare :: [Int] -> [Int]
mapSquare = map (\x -> x * x)

\x -> x * x denotes an anonymous functions which takes as an argument x and builds the square of x

This is also called a lambda abstraction.

Map (5)

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
mapSquare :: [Int] -> [Int]
mapSquare = map (\x -> x * x)

Sample evaluations

mapSquare [1,2] => map (\x -> x * x) [1,2]
                => (\x -> x * x) 1 : map (\x -> x * x) [2]
                => 1 * 1 : map (\x -> x * x) [2]
                => 1 : map (\x -> x * x) [2]
                => 1 : ((\x -> x * x) 2) : map (\x -> x * x) []
                => 1 : 4 : map (\x -> x * x) []
                => 1 : 4 : []
                 = [1,4]
mapSquare (mapSquare [1,2]) => ... => [1,16]

Seems tedious. We iterate over the list twice!

map obeys some general laws:

map f (map g) = map (f . g)
mapSquare (mapSquare [1,2]) => map (\x -> x * x) (map (\x -> x * x) [1,2])

Recall, function application is left associative

f e1 e2 = (f e1) e2

Thus

mapSquare (mapSquare [1,2]) => map (\x -> x * x) (map (\x -> x * x) [1,2])
                            = map (\x -> x * x) ((map (\x -> x * x)) [1,2])
                            = (map (((\x -> x * x) . (\x -> x * x))) [1,2]
                            = map (\x -> x * x * x * x) [1,2]

Cool! High-level optimizations thanks to high-level programming.

Filter

Exercise: Provide two functions which implement the above tasks

Filter (2)

Here is the general filter combinator

filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
  | p x       =  x : filter p xs
  | otherwise =  filter p xs

And the implementations of the above two tasks

filterNonNeg :: [Int] -> [Int]
filterNonNeg = filter (>=0)

filterPair :: [(Int,Int)] -> [(Int,Int)]
filterPair = filter (\(x,y) -> x+y == 0)

Point to note: Pattern matching (over pairs) also applies to anonymous functions

Recursion patterns summary

We have seen

There exist many more recursion patterns (e.g. zip, fold, …)

Most often, a Haskell program consists of combinations of carefully chosen combinators.

For map and filter, Haskell provides a more intuitive syntax known as list comprehensions.

List comprehensions

General syntax

[ f x | x <- xs, p x]
mapAddOneIfNonNeg :: [Int] -> [Int]
mapAddOneIfNonNeg xs = [ x+1 | x <- xs, x >= 0]

Lists comprehensions vs map and filter

Each list comprehension can be transformed into map/filter and vice versa.

l1 :: [Int]
l1 = [1,2,3,4]

lc1 = [ (+1) x | x <- l1,  x > 2 ]

lc2 = map (+1) (filter (>2) l1)

List comprehensions further examples

We can combine several generators.

flatten :: [[a]] -> [a]
flatten xss = [ x | xs <- xss, x <- xs ]

The above flattens a list of lists into a list

For example

*Main> flatten [[1], [], [2,3]]
[1,2,3]

Haskell provides a primitive concat which achieves the same

More list combinators - zip, fold

Zipping two lists

Challenge

Define a function which behaves as follows

f [x1, ..., xn] [y1, ..., yn] ==> [f x1 y1, ..., f xn yn]

Zip

Makes a list of tuples, each tuple containing elements of both lists occurring at the same position

zip :: [a] -> [b] -> [(a,b)]
zip _ [] = []
zip [] _ = []
zip (x:xs) (y:ys) = (x,y):zip xs ys

Applied to the above challenge

combMap f xs ys = map (\(f,y) -> f y) (zip (map f xs) ys)

testcombMap = combMap  (\x -> \y -> (x,y)) [1..3] [5..7]

Folding over a list

-- if the list is empty, the result is the initial value z; else
-- apply f to the first element and the result of folding the rest
foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)
-- if the list is empty, the result is the initial value; else
-- we recurse immediately, making the new initial value the result
-- of combining the old initial value with the first element.
foldl f z []     = z
foldl f z (x:xs) = foldl f (f z x) xs

Code/Function reuse

map :: (a -> b) -> [a] -> [b]
or :: [Bool] -> Bool
any :: (a -> Bool) -> [a] -> Bool
any p = or . map p

FP princple = Write functions by composing existing functions

Recall that . stands for function composition

But isn’t the above inefficient? It seems that we iterate twice over the list. For example, consider

any (\x -> x > 0) [1,2,3,4]

Good news. Haskell does only the necessary work. Haskell applies an outermost-leftmost evaluation strategy (call-by name)

Here’s a sample “call-by name” evaluation

any (\x -> x > 0) [1,2,3,4]
 => or . map (\x -> x > 0) [1,2,3,4]
 => or (((\x -> x > 0) 1) : map (\x -> x > 0) [2,3,4])
 => or (True : map (\x -> x > 0) [2,3,4])
 => True

In contrast, “call-by value” processes the entire list

any (\x -> x > 0) [1,2,3,4]
 => or . map (\x -> x > 0) [1,2,3,4]
 => or (map (\x -> x > 0) [1,2,3,4])
 => or ( (1 > 0) : map (\x -> x > 0) [2,3,4])
 => or ( True : map (\x -> x > 0) [2,3,4])
 =>* or [True, True, True, True]
 => True

`=>* denotes a sequence of evaluation steps

Summary

Main concepts

Partial function application:

plus x y = x + y
inc x = plus 1 x

or even shorter

inc = plus 1

It’s possible to provide all function arguments using tuple notation.

plus2 (x,y) = x + y

Function plus is written in curried style where plus2 is the uncurried version. Both styles can be transformed into each other via

uncurry plus ==> plus2

curry plus2 ==> plus

Anonymous (lambda) functions:

inc2 = \x -> x + 1

Higher-order functions:

apply = \f -> \x -> f x
inc = apply (+1)

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
addOneToAll = map (+1)

Haskell in one example:

filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
  | p x       =  x : filter p xs
  | otherwise =  filter p xs

Use case of map and filter:

f xs = map (1+) (filter (>=0) xs)

What?

In more detail.

gt x = x >= 0
inc x = 1 + x

f1 xs = map inc (filter gt xs)

As we might use gt and inc only once, let’s use anonymous functions (lambdas) instead.

f2 xs = map (\x -> 1 + x) (filter (\x -> x >= 0) xs)

In Haskell we can use operators like functions (and vice versa).

f3 xs = map (\x -> (+) 1 x) (filter (\x -> (>=) x 0) xs)

Notice the “partial” function applications which can be expressed via the following syntactic sugar.

f4 xs = map (\x -> (1+) x) (filter (\x -> (>=0) x) xs)

The “lambda” is now redundant!

f5 xs = map (1+) (filter (>=0) xs)

Equality.

isEq x y = x == y

Haskell will infer the following type.

isEq :: Eq t => t -> t -> Bool

This is a “generic” function which will work for values of any type t as long as equality is provided for that type.

Equality is provided for all built-in types.

It’s possible to define equality for user-defined types via type classes.

Int versus Integer.

listOfNumbers = [1,2]

Haskell infers the following type.

listOfNumbers :: [Integer]

Note that Int (fixed precision) is different from Integer.

If we want a list of Ints instead, we must provide an explicit type annotation.

listOfInts :: [Int]
listOfInts = [1,2]

Writing tests

Copy the following code

type Assertion = IO ()
assertEqual :: (Eq a, Show a) => String -> a -> a -> Assertion
assertEqual what expected real =
    if expected == real
    then putStrLn ("[ok] " ++ what)
    else fail
             (what ++ ": assertEqual failed. Expected: " ++ show expected ++
               ", given: " ++ show real)

Define tests like this:

test_valid :: Assertion
test_valid =
    do assertEqual "valid1" True db1
       assertEqual "valid2" False db2
allTests :: Assertion
allTests =
   do test_valid

Then in ghci:

*DB> allTests
[ok] valid1
[ok] valid2

Exercise - Database (functions over lists)

We are given a database of students. Your task is to implement a number of query operations. Code template below (Database.hs). We’ll give a brief description.

A student is represented by her name, student id and a list of the courses the student is taking.

type Student = (String, Int, [Int])
type DB = [Student]

Databases must be consistent. We say the database is consistent if there’re no multiple entries of students, and no multiple entries of courses per students For example, the below databases are inconsistent

incons1 = [("Jack", 111, [141, 252, 141])]
incons2 = [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252])]

Your task is to implement the following function which returns True if the database is valid (consistent), otherwise the function returns False.

valid :: DB -> Bool

Extension: We strengthen the notion of consistency. In addition, we require that there shall by no duplicate student id’s. For example,

incons3 = [("Jane", 111, [141]), ("Jack", 111, [141, 252])]

FROM HERE ON, WE WILL ASSUME THAT ALL DATABASES ARE CONSISTENT !!!

Given a database and a student id, we’re looking for the list of courses of this particular student.

query1 :: DB -> Int -> [Int]

Given a database and a course, find all students taking this course.

query2 :: DB -> Int -> [String]

Given a database, sort the database (in non-decreasing order) according to the name of students.

sortDB :: DB -> DB

Extension1: Provide a function sortDB’ which sorts the database according to the number of courses a student is taking

Extension2: Provide a function sortDB’’ which is parameterized in the actual comparing relation which determines the sorting order For example: Given

cmpName :: Student -> Student -> Ordering
cmpName (n1, _, _) (n2, _, _) =
 if n1 < n2
 then LT
 else if n1 == n2
      then GT
      else EQ

Then you can define

 sortDB = sortDB'' cmpName

Given two databases, merge them to obtain one (consistent) database

merge :: DB -> DB -> DB

Example:

 merge [("Jane", 112, [141, 353])] [("Jane", 112, [141, 252])]
    => [("Jane", 112, [141, 252, 353])]

What if the database is not consistent? Consider

 merge [("Jane", 112, [141, 353])] [("Jane", 113, [141, 252])]
    => ????

Recall: FROM HERE ON, WE WILL ASSUME THAT ALL DATABASES ARE CONSISTENT !!!

You may want to consider a variant of merge that deals with the possibility of an inconsistent database.

In case of inconsistencies, you may

  1. remove duplicates,
  2. issue a run-time error,
  3. use “Maybe”

Database.hs



-------------------------------------------------------
-- Extended Haskell programming exercise
-- Topic: functions over lists
-- Author: Martin Sulzmann
-------------------------------------------------------



-- A student is represented by her name, student id and
-- a list of the courses the student is taking


type Student = (String, Int, [Int])
type DB = [Student]



-- TASK 0
{-
Databases must be consistent.
We say the database is consistent if there're no multiple entries
of students, and no multiple entries of courses per students
 For example, the below databases are *inconsistent*
-}

incons1 = [("Jack", 111, [141, 252, 141])]
incons2 = [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252])]

{-
Your task is to implement the following function
which returns True if the database is valid (consistent),
otherwise the function returns False.
-}
valid :: DB -> Bool
valid = error "Your code"


-- EXTENSION TO TASK 0
{-
Extension: We strengthen the notion of consistency.
In addition, we require that there shall by no duplicate student id's.
For example,
-}

incons3 = [("Jane", 111, [141]), ("Jack", 111, [141, 252])]


-- FROM HERE ON, WE WILL ASSUME THAT ALL DATABASES ARE CONSISTENT !!!


-- TASK 1
{-
Given a database and a student id, we're looking for the list of
courses of this particular student.
-}
query1 :: DB -> Int -> [Int]
query1 = error "Your code"


-- TASK 2
{-
Given a database and a course, find all students
taking this course.
-}
query2 :: DB -> Int -> [String]
query2 = error "Your code"



-- TASK 3
{-
Given a database, sort the database (in non-decreasing order)
according to the name of students.
-}
sortDB :: DB -> DB
sortDB = error "Your code"

{-
Extension1:
Provide a function sortDB' which sorts the database according to the number of courses a student is taking

Extension2:
Provide a function sortDB'' which is parameterized in the actual comparing relation which determines the sorting order
For example:
 Given

cmpName :: Student -> Student -> Ordering
cmpName (n1, _, _) (n2, _, _) =
 if n1 < n2
 then LT
 else if n1 == n2
      then GT
      else EQ

Then you can define

 sortDB = sortDB'' cmpName

-}


-- TASK 4
{-
Given two databases, merge them to obtain one (consistent) database
 Example:

 merge [("Jane", 112, [141, 353])] [("Jane", 112, [141, 252])]
    => [("Jane", 112, [141, 252, 353])]

-}

merge :: DB -> DB -> DB
merge = error "Your code"

Sample solution(s)

Sample solution WS20/21 (developed in class, incomplete)


-- this is a one line comment
-- isEven :: Integer -> Bool
{-  multiple line comment

-}

--------------------
-- Type inference in Haskell


-- Convention: Provide the types of top-level functions
--
-- isEven :: Integer -> Bool
--
-- the above is a valid type but not the most general type that can be given to isEven.
--
-- The below is the most general type.
-- If you don't believe, just comment out the type declaration.
isEven :: Integral a => a -> Bool
isEven n
  | n `mod` 2 == 0  = True
  | otherwise       = False


-- Terminology.
--
--    isEven :: Integer -> Bool
--
-- The above is a type declaration. Other names used are
-- type annotation, type signature, ...

-- Another example.

plus :: Integer -> Integer -> Integer
-- plus :: Num a => a -> a -> a
plus x y = x + y


--------------------
-- DB exercise

type Student = (String, Int, [Int])
type DB = [Student]


incons1 :: DB
incons1 = [("Jack", 111, [141, 252, 141])]


incons2 :: [([Char], Int, [Int])]
incons2 = [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252])]

student1 :: Student
student1 = ("Jack", 111, [141, 252, 141])

{-

DB is consistent iff

(1)   student ids are distinct, and

(2)  for each student, course ids are distinct

=> need to check if there are duplicates in a list

=>

(1) Need to extract all student ids, and then apply checkDuplicates

-}


-- True = DB is consistent

valid :: DB -> Bool
valid db =   valid1 db && valid2 db

-- Check for consistency condition 1
valid1 :: DB -> Bool
valid1 db = not (checkDuplicates [ x  | (_,x,_) <- db ])

-- Check for consistency condition 2
valid2 :: DB -> Bool
valid2 db = not (or [ checkDuplicates cs | (_,_,cs) <- db ])


{-
myOr [] = False
myOr (True:_) = True
myOr (False:xs) = myOr xs

-}

{-
-- yet to be written function
-- True = there are duplicates
checkDuplicates :: [Int] -> Bool
checkDuplicates [] = False
checkDuplicates (x:xs)
  | elem x xs = True
  | otherwise =  checkDuplicates xs

-}

checkDuplicates :: [Int] -> Bool
checkDuplicates [] = False
checkDuplicates (x:xs) = not (null (filter (\y -> y == x) xs)) || checkDuplicates xs

{-
checkDuplicates (x:xs)
  | not (null (filter (\y -> y == x) xs)) = True
  | otherwise                             = checkDuplicates xs
-}

-- checkDuplicates (x:xs) = not (null (filter (==x) xs)) || checkDuplicates xs


-- myElem x xs = not ( null (filter (\y -> y == x) xs))

Sample solution WS19/20 (developed in class, incomplete)


type Student = (String, Int, [Int])
type DB = [Student]

exDB1 :: DB
exDB1 = [("Helmut", 1112, [141, 252]),
         ("Emma", 1234, [252, 353]),
         ("Emma", 1235, [252, 353])]

-- examples for inconsistencies

-- student ids must be distinct!
exDB2 :: DB
exDB2 = [("Helmut", 1112, [141, 252]),
         ("Emma", 1234, [252, 353]),
         ("Emma", 1112, [252, 353])]


-- course ids must be distinct (per student)!
exDB3 :: DB
exDB3 = [("Helmut", 1112, [141, 252, 141]),
         ("Emma", 1234, [252, 353]),
         ("Emma", 1235, [252, 353])]


-- subtask: check if there are duplicate elements in a list
-- idea: for each element check if the element is part of the tail.

duplicate [] = False
duplicate [x] = False
duplicate (x:xs) = if b
                   then True
                   else duplicate xs
                    where b = elem x xs
{-
duplicate (x:xs) = if elem x xs
                   then True
                   else duplicate xs
-}
{-
  | elem x xs = True
  | otherwise = duplicate xs
-}


valid1 :: DB -> Bool
valid1 db = not (duplicate (map (\(_,i,_) -> i) db))


valid2 :: DB -> Bool
valid2 db =   and [ not (duplicate cs) | (_,_,cs) <- db ]

valid :: DB -> Bool
valid db = valid1 db && valid2 db

query1 :: DB -> Int -> [Int]
query1 [] _ = []
query1 ((_,i,cs):db) j
  | i == j    = cs
  | otherwise = query1 db j


flatten :: [[a]] -> [a]
flatten xss = [ x | xs <- xss, x <- xs ]

query1b :: DB -> Int -> [Int]
query1b db j =   flatten (map (\(_,_,cs) -> cs) (filter (\(_,i,_) -> i == j) db))

query1c :: DB -> Int -> [Int]
query1c db j = flatten [cs | (_,i,cs) <- db, i == j]

query2 :: DB -> Int -> [String]
query2 db c = [n | (n,_,cs) <- db, elem c cs]

query2b :: DB -> Int -> [String]
query2b db c = map (\(n,_,_) -> n) (filter (\(_,_,cs) -> elem c cs) db)

query2c :: DB -> Int -> [String]
query2c [] _ = []
query2c ((n,_,cs):db) c
  | elem c cs = n : query2c db c
  | otherwise = query2c db c

Sample solution A (developed in class, incomplete)

type StudentID = Int
type Student = (String, StudentID, [Int])
type DB = [Student]

{-
We say the database is consistent if
(a) there're no multiple entries of students, and

         => Student ID must be distinct

(b) no multiple entries of courses per students

         => Course ID must be distinct

Insight:

  Need a function to check that if a list of values
  there are no duplicates

duplicateCheck Algorithm

Given a list of n elements [x1,...,xn]

For each element xi check if xi appears in x_i+1 ... xn

-}


duplicates :: Eq t => [t] -> Bool
duplicates [] = False
-- duplicates [x] = False
-- duplicates [x,y] = x == y
duplicates (x:xs) =
    -- filter out all elements in the tail xs equal to x
    let ys = filter (==x) xs
             -- filter (\y -> y == x) xs
    in if ys /= []
       then True
       else duplicates xs


valid :: DB -> Bool
valid db =
   -- Subcheck (a)
   -- not (duplicates (map (\(x,y,z) -> y) db))
   let check1 = not (duplicates ([y | (x,y,z) <- db]))

       check2 = [] ==
                (filter (==True)
                       (map duplicates
                            (map (\(_,_,courses) -> courses) db)))

       -- more efficient, we stop once we discover the first inconsistency
       check2b = not (or
                       (map duplicates
                            (map (\(_,_,courses) -> courses) db)))


   in check1 && check2

-- Find all students with invalid course lists
invalidStudent :: DB -> [Int]
invalidStudent db =
              map (\(id,_) -> id)
              (filter (\(_,x) -> x == True)
              (map (\(_,id,courses) -> (id, duplicates courses)) db))

-- find all the names of students with an invalid course list,
-- make use of invalidStudent
--
-- for each id in (invalidStudent db)
--  find (id',name,_) in db such that id' == id
query db = [ name | id <- invalidStudent db, (name,id2,_) <- db, id == id2 ]


-- accumulate all courses


accumulate [] = []
accumulate ((_,_,courses):db) = courses : accumulate db



-- test cases
incons2 :: DB
incons2 = [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252])]


incons3 :: DB
incons3 = [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252]),
           ("Joe", 113, [141, 252, 141])]

incons3b :: DB
incons3b = [("Joe", 113, [141, 252, 141]),
            ("Jane", 112, [141, 252]),
            ("Jane", 112, [141, 252])]

incons3c :: DB
incons3c = [("Joe", 113, [141, 252, 141]),
            ("Jane", 112, [141, 252]),
            ("Jane", 112, [141, 252]),
            ("Jack", 115, [252, 252])]

cons4 :: DB
cons4 = [("Jane", 112, [141, 252]), ("Jane", 114, [141, 252]),
           ("Joe", 113, [141, 252, 147])]

Sample solution B



-------------------------------------------------------
-- Extended Haskell programming exercise
-- Topic: functions over lists
-- Author: Martin Sulzmann
-- YOUR TASK: complete the following exercises
-------------------------------------------------------

import Data.List(sortBy,sort)

-- We're give a database of students and implement a
-- number of query operations


-- A student is represented by her name, student id and
-- a list of the courses the student is taking


type Student = (String, Int, [Int])
type DB = [Student]



-- TASK 0
-- We have to ensure that the database is consistent.
-- We say the database is consistent if there're no multiple entries
-- of students, and no multiple entries of courses per student
-- For example,
--   [("Jack", 111, [141, 252, 141])],  and
--   [("Jane", 112, [141, 252]), ("Jane", 112, [141, 252])]
-- are inconsistent


-- The function valid returns True if the database is valid, otherwise
-- we return False
valid :: DB -> Bool

-- sample solution

valid db = valid1 db && valid2 db
-- valid1 checks for multiple student entries
valid1 db =
        -- 1. Extract the names of all students
        -- 2. Iterate over all student names and check if there are duplicate entries
        let students = map (\(s,_,_) -> s) db  -- 1
        in  and (map (\s -> [s] == (filter (\s' -> s == s') students))  -- 2
                     students)

-- From the above we can derive a general check for 'no duplicates'
noDuplicates xs =
  and (map (\x -> [x] == (filter (\x' -> x == x') xs))
           xs)

-- valid2 checks for multiple course entries
valid2 db =
       and (map (\(_,_,courses) -> noDuplicates courses) db)



-- EXTENSION TO TASK 0
-- In addition to the two consistency criteras above, we require that there are no
-- duplicate student id's
-- For example,
--      [("Jane", 111, [141]), ("Jack", 111, [141, 252])
-- is inconsistent
-- Write a function valid' which takes all three consistency criteras into account



-- FROM HERE ON, WE WILL ASSUME THAT ALL DATABASES ARE CONSISTENT !!!


-- TASK 1
-- Given a database and a student id, we're looking for the list of courses of
-- this particular student

query1 :: DB -> Int -> [Int]

-- sample solution
query1 [] _ = []
query1 ((_, id, crs):rest) stdid
   | id == stdid = crs
   | otherwise   = query1 rest stdid

-- some other solutions
query1' db stdid = head [ crs |  (_, id, crs) <- db, id == stdid]

query1'' db stdid = head (map (\(_,_,crs) -> crs)
                              (filter (\(_,id,_) -> id == stdid) db))


-- TASK 2
-- Given a database and a course, find all students who're taking this course

query2 :: DB -> Int -> [String]

-- sample solution
query2 [] _ = []
query2 ((name, _, crs):rest) course
   | elem course crs = name:(query2 rest course)
   | otherwise       = query2 rest course


-- another solution
query2' db course = query2Acc db course []
     where
      query2Acc [] course students = students
      query2Acc ((name, _, crs):rest) course students
           | elem course crs = query2Acc rest course (name:students)
           | otherwise       = query2Acc rest course students

-- What's the difference between query2 and query2'?


-- TASK 3
-- Given a database, sort the database (in non-decreasing order)
-- according to the name of students
sortDB :: DB -> DB
sortDB db = sortBy (\(s1,_,_) -> \(s2,_,_) ->
                    if s1 == s2 then EQ
                    else if s1 < s2 then LT
                         else GT)
                   db
-- in the above we make use of the sortBy function which takes as its first argument
-- a function to compare elements

-- EXTENSION TO TASK 3
-- provide a function sortDB' which sorts the database according to the number of courses
-- a student is taking
-- MORE DIFFICULT EXTENSION:
-- provide a function sortDB'' which is parameterized in the actual comparing relation
-- which determines the sorting order
-- For example:
-- Given
--  cmpName :: Student -> Student -> Bool
--  cmpName (n1, _, _) (n2, _, _) = n1 <= n2
-- Then you can define
-- sortDB = sortDB'' cmpName



-- TASK 4
-- Given two databases, merge them to obtain one (consistent) database
-- Example:
-- merge [("Jane", 112, [141, 353])] [("Jane", 112, [141, 252])]
--    =>  [("Jane", 112, [141, 252, 353])]
-- merge [("Jane", 112, [141, 353])] [("Jane", 113, [141, 252])]
--    => Error: Inconsistent ID's for student Jane

merge :: DB -> DB -> DB
merge db1 db2 = merge' (sortDB db1) (sortDB db2)
                where
                  merge' [] db = db
                  merge' db [] = db
                  merge' ((n1,i1,c1):db1) ((n2,i2,c2):db2)
                    | n1 == n2   = (n1,i1,rmDups (c1++c2)) : merge' db1 db2
                    | n1 < n2    = (n1,i1,c1) : merge' db1 ((n2,i2,c2):db2)
                    | n1 > n2    = (n2,i2,c2) : merge' ((n1,i1,c1):db1) db2


rmDups xs = rmDups' [] xs
rmDups' acc []     = acc
rmDups' acc (x:xs)
  | elem x acc = rmDups' acc xs
  | otherwise  = rmDups' (x:acc) xs