Overview

Evaluation order
- Call-by value, call-by name, …
- Haskell applies “lazy” evaluation (implementation method for call-by name)
Higher-order functions
- How to compile higher-order functions
- From higher-order to first-order

Evaluation order and parameter passing

Expression evaluation
- Innermost first
- Outermost first
Calling strategies (parameter evaluation)
- Call-by value (strict evaluation)
- Call-by reference
- Call-by name (copy)
As we will later see, Haskell is lazy (non-strict)

Evaluation order examples (innermost)

square x = x * x
inc x = x + 1
main = square (inc 1)

Look for a redex innermost, leftmost
Also called Applicative Order Reduction (AOR)
Redex = reducible expression of the form (\x-> ...) e, arithmetic term such as 1 + 3 etc

square (inc 1) => square (1 + 1)
               => square 2
               => 2 * 2
               => 4

Call-by value (CBV) = AOR with the exception that we don’t evaluate under “lambda” (i.e. inside function bodies)

Evaluation order examples (outermost)

square x = x * x
inc x = x + 1
main = square (inc 1)

Outermost, leftmost, also called Normal Order Reduction (NOR)

square (inc 1) => (inc 1) * (inc 1)
               => (1 + 1) * (inc 1)
               => 2 * (inc 1)
               => 2 * (1 + 1)
               => 2 * 2
               => 4

Call-by name (CBN) = NOR with the exception that we don’t evaluate under “lambda” (i.e. inside function bodies)

Evaluation order summary

CBN seems rather inefficient
But has plenty of applications
- More code reuse
- More declarative code
- Infinite, circular data structures
- …

CBN - More code reuse

Given

map :: (a -> b) -> [a] -> [b]
or :: [Bool] -> Bool

Check if any element in a list satisfies a predicate

any :: (a -> Bool) -> [a] -> Bool
any p = or . map p

FP princple = Write functions by composing existing functions

Compare CBN and CBV for

any (\x -> x > 0) [1,2,3,4]

Recall that . stands for function composition

CBN - More code reuse (2)

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
or :: [Bool] -> Bool
or [] = False
or (True:xs) = True
or (x:xs) = or xs
any :: (a -> Bool) -> [a] -> Bool
any p = or . map p

CBN evaluation does only the necessary work

any (\x -> x > 0) [1,2,3,4]
 => or . map (\x -> x > 0) [1,2,3,4]
 => or (((\x -> x > 0) 1) : map (\x -> x > 0) [2,3,4])
 => or (True : map (\x -> x > 0) [2,3,4])
 => True

CBN - More code reuse (3)

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
or :: [Bool] -> Bool
or [] = False
or (True:xs) = True
or (x:xs) = or xs
any :: (a -> Bool) -> [a] -> Bool
any p = or . map p

CBV evaluation processes entire list

any (\x -> x > 0) [1,2,3,4]
 => or . map (\x -> x > 0) [1,2,3,4]
 => or (map (\x -> x > 0) [1,2,3,4])
 => or ( (1 > 0) : map (\x -> x > 0) [2,3,4])
 => or ( True : map (\x -> x > 0) [2,3,4])
 =>* or [True, True, True, True]
 => True

`=>* denotes a sequence of evaluation steps

CBN - More code reuse (4)

CBN is more efficient and supports code reuse
Under CBV we we might do some unnecessary work!
To obtain a more efficient version under CBV, we will have to inline the code (no reuse)

any :: (a -> Bool) -> [a] -> Bool
any p [] = False
any p (y:ys) = y || any p ps

CBN - More code reuse (5)

Here’s another example.

myElem :: Eq a => a -> [a] -> Bool
myElem x [] = False
myElem x (y:ys)
  | x == y    = True
  | otherwise = myElem x ys


myElem2 :: Eq a => a -> [a] -> Bool
myElem2 x xs =  (not . null) (filter (==x) xs)

Under CBN, myElem and myElem2 are equally efficient!

myElem2 reuses existing code/functions.

CBN - More declarative code

Suppose we are given a list with one million entries
Retrieve the smallest ten entries with doing as little work as possible
The ‘brute-force’ method is to sort the entire list and then select the first 10 entries
Under CBN, the solution is trivial

sortN n xs = take n (sort xs)

CBN - Infinite structures

ones = 1 : ones

nats = nats' [1] ones
nats' _ [x] = [x]
nats' prev (x:xs) = sum prev : nats' (prev++[x]) xs

General principle

Generate
Select

CBN allows for highly declarative control structures

Streams


-- Haskell supports lazy evaluation.
-- Lots of cool applications but can also bite us!

---------------------------------------------------------------------------------------------
-- Stream, infinite sequence of data (numbers, temperature reading, internet traffic, ...)

-- Simplified stream example.
data Stream a = Cons a (Stream a) deriving Show
-- Compare this to lists.
data List a = ConsL a (List a) | Null
-- Observation:
-- Every value of type List a is finite!
-- How to emulate the stream data type say in a Java like language?
data JavaStr a = ConsS a (() -> JavaStr a)
--                       ^^^^^^^^^^^^^
--                  suspended computation
--                  () -> Str a   refers to a parameter-less function that yields a stream
--                  thus we can suspend the computation (and emulate lazy evaluation in a strict language)

headS :: Stream a -> a
headS (Cons x _) = x

headStr :: JavaStr a -> a
headStr (ConsS x _) = x

tailS :: Stream a -> Stream a
tailS (Cons x xs) = xs

tailStr :: JavaStr a -> JavaStr a
tailStr (ConsS _ f) = f ()   -- force the computation, i.e. take one step

succN :: Integer -> Stream Integer
succN n = Cons n (succN (n+1))

-- Natural numbers: 0, 1, ...
nats :: Stream Integer
nats = succN 0

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Cons x xs) = Cons (f x) (mapS f xs)

-- Negative numbers: -1, -2, ...
negs :: Stream Integer
negs = mapS (\x -> -x) (tailS nats)

-- interleaveS applied to
--     ... x0, x1, ...
--     ... y0, y1, ...
-- yields
--     ... x0,y0,x1,y1, ...
interleaveS :: Stream a -> Stream a -> Stream a
interleaveS (Cons x xs) ys = Cons x (interleaveS ys xs)

-- All positive, negative numbers.
ints :: Stream Integer
ints = interleaveS nats negs

firstN :: Int -> Stream Integer -> [Integer]
firstN n (Cons x xs)
 | n < 0     = error "must be positive"
 | n == 0    = []
 | otherwise = x : firstN (n-1) xs

first5NegNumbers = firstN 5 negs

Lazy evaluation

Lazy evaluation = Call-by need (CBN)

Technique for implementing CBN more efficiently:

A redex is evaluated only if needed.
Sharing to avoid duplicating redexes.
Once evaluated, a redex is updated with the result to avoid evaluating it more than once.

As a result, under lazy evaluation, any one redex is evaluated at most once.

Lazy evaluation example

Recall

square x = x * x
inc x = x + 1
main = square (inc 1)

There is no need to evaluate square (inc 1) on its own.

3 + square (inc 1) => 3 + ((inc 1) * (inc 1))
                   => 3 + ((1 + 1) * (inc 1))
                   => 3 + (2 * (inc 1))
                   => 3 + (2 * 2)               -- Sharing
                   => 3 + 4
                   => 7

Lazy evaluation in a strict language

Requires manual effort.
Need to suspend a computation and resume the computation (manually).
We can use “lambdas” to suspend the computation, e.g. f = \x -> comp.
We “force” a computation via f ().

Lazy evaluation - space/memory leaks

msum [] = 0
msum (x:xs) = x + msum xs
main = do
 print (msum [1..10000000])

Try out the above

> ghc Sum.hs
> ./Sum
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

So possibly due to the recursive call the stack size is too small?

NOTE

There’s no longer a stack space overflow on today’s big machines. But the strict version (coming up) is considerably faster.

Lazy evaluation - space/memory leaks (2)

“Iterative” version

sumAcc acc [] = acc
sumAcc acc (x:xs) = sumAcc (acc+x) xs
main = do
 print (sumAcc 0 [1..10000000])

Try out the above

> ghc Sum.hs
> ./Sum
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

Oops again!
Haskell will translate msum and sumAcc to a similar C program which uses a for loop instead of a recursive call (“tail call elimination”)
So where’s the problem?

Lazy evaluation - space/memory leaks (3)

sumAcc acc [] = acc
sumAcc acc (x:xs) = sumAcc (acc+x) xs

sumAcc 0 [1,2,3,4] => sumAcc (0+1) [2,3,4]
                   => sumAcc (0+1+2) [3,4]
                   => sumAcc (0+1+2+3) [4]
                   => sumAcc (0+1+2+3+4) []
                   => 0+1+2+3+4

Aha!
The argument is evaluated lazily
Must force its evaluation by inserting a strictness annotation

sumAccStrict acc [] = acc
sumAccStrict acc (x:xs) =
  let y = acc + x
  in y `seq` sumAccStrict y xs

Complete source code

-- There's no longer a stack space overflow on today's big machines.
-- But the strict version is considerably faster.

xs = take (10^8) [1..] -- [1..100000000000]

msum [] = 0
msum (x:xs) = x + msum xs
-- main = print (msum xs)


sumAccStrict acc [] = acc
sumAccStrict acc (x:xs) =
  let y = acc + x
  in y `seq` sumAccStrict y xs

main = print $ sumAccStrict 0 xs

Lazy evaluation summary

The good
- Declarative code and reuse
- Infinite structures, …
The evil and ugly
- Space leaks, performance problems due to unevaluated “thunks”
  - See Space leak zoo
- Strictness analyis helps, see below where argument n is strict
- Often programmer must provide explicit strictness annotation to tell Haskell when an evaluation shall be enforced
Further reading: More points for lazy evaluation

Higher-order functions

Lambda abstractions
Currying
Partial function application
Continuation-passing style
From higher-order to first-order programs

Higher-order functions (lambda abstraction)

Recall:

Function definitions consist of a left-hand side and a right-hand side.
Typically of the form

f x1 ... xn = rightHandSideExpression

where x1, …, xn are the formal parameters. The above can also be written as follows

f = \x1-> ... \xn -> rightHandSideExpression

relying on the lambda abstraction syntax \x -> ... (see annonymous functions).

The following function definitions are equivalent

plus x y = x + y

plus2 = \x -> \y -> x + y

Higher-order functions (currying)

In imperative languages, we collectively supply the arguments to a function. In Haskell, we can achieve the same by passing as an argument a pair/tuple.

plus x y = x + y
plus2 = \x -> \y -> x + y
plus3 (x,y) = x + y
plus4 = \(x,y) -> x + y

Both styles are equivalent and its possible to transform a function with a pair argument into a function which expects a chain of arguments (and vice versa of course).

This transformation is known as Currying

The following transformation functions are part of the Prelude.

curry :: ((a, b) -> c) -> a -> b -> c
curry f = \x -> \y -> f (x,y)
uncurry :: (a -> b -> c) -> (a, b) -> c
uncurry f = \(x,y) -> f x y

Express the various “plus” functions in terms of each other using curry and uncurry.

Higher-order functions (partial application)

Functions expect a certain number of arguments
Functions are expressions.
So we can obtain new functions/expression by providing a partial set of a function’s arguments.
Consider the following list of examples

plus x y = x + y

inc x = plus 1 x

inc2 = plus 1

inc3 = (+1)

incList = map (+1)

What are the types of the above functions?

Continuation-passing style (CPS)

CPS is a particular style of writing functional programs
A program is in CPS if no function ever returns
- For Haskell this means, the only expression appearing on the right-hand side is another function call
Instead every function takes an extra argument (a function called “the continuation”) that it calls with the result
Next
- We first review direct style programs
- Then we consider CPS

Direct style example

Factorial function written in direct style (i.e. using “return”)

fac n
 | n <= 1 = 1
 | otherwise = n * fac (n-1)

Sample execution

fac 5 = 5 * fac 4
      = 5 * (4 * fac 3)
      = 5 * (4 * (3 * fac 2))
      = 5 * (4 * (3 * (2 * fac 1)))
      = 5 * (4 * (3 * (2 * 1)))
      = 5 * (4 * (3 * 2))
      = 5 * (4 * 6)
      = 5 * 24
      = 120

Variable n stored in the call stack.
Only once the call fac (n-1) returns, we can compute the result

Iterative version

facI n m
 | n <= 1    = m
 | otherwise = facI (n-1) (n * m)

No call stack required
Strictly continues
Accumulates result along the way
Sample evaluation (observe we’re applying strict evaluation)

facI 5 1 = facI 4 5
         = facI 3 20
         = facI 2 60
         = facI 1 120
         = 120

CPS version

facC n k
 | n <= 1 = k 1
 | otherwise = facC (n-1) (\v -> k (n * v))

fac n = facC n (\x -> x)

Generalization of the iterative version
Factorial takes as an extra argument a continuation k
Pass results to continuation k
Sample execution

facC 5 k = facC 4 (\v4 -> k (5 * v4))
         = facC 3 (\v3 -> (\v4 -> k (5 * v4)) (4 * v3))
         = facC 2 (\v2 -> (\v3 -> (\v4 -> k (5 * v4)) (4 * v3)) (3 * v2))
         = facC 1 (\v1 -> (\v2 -> (\v3 -> (\v4 -> k (5 * v4)) (4 * v3)) (3 * v2)) (2 * v1))
         = (\v1 -> (\v2 -> (\v3 -> (\v4 -> k (5 * v4)) (4 * v3)) (3 * v2)) (2 * v1)) 1
         = (\v2 -> (\v3 -> (\v4 -> k (5 * v4)) (4 * v3)) (3 * v2)) 2
         = (\v3 -> (\v4 -> k (5 * v4)) (4 * v3)) 6
         = (\v4 -> k (5 * v4)) 24
         = k 120

CPS summary

CPS not meant to be written by humans
Automatic transformation from direct style to CPS
CPS advantage
- No call-stack required
- Every call is a tail call
- Easy to suspend and resume a computation (think of multi-threading)
CPS applies to other languages, e.g. C, JavaScript, …

From higher-order to first-order

Ultimately, we must transform a higher-order language into some simpler language which can be executed by the computer hardware.

There are several methods:

Defunctionalization
- Whole program transformation to eliminate higher-order functions by replacing them via a single first-order apply function
Closures
- a code pointer, pointing to a fixed piece of code computing the function result;
- an environment: a record providing values for the free variables of the function

Defunctionalization

Defunctionalization is a global program transformation to turn higher-order programs into first-order order ones.

The basic idea is to

replace every lambda abstraction with its own data constructor that will carry what environment is needed, and
replace every application of the higher-order function with an apply function that will interpret the data structure.

Next, we will explain the basic idea via an example. In practice, we may need to apply additional steps, e.g. closure conversion (lambda lifting).

Defunctionalization Example

empty = \ x1 -> False
insert = \x2 -> \x3 -> \x4 -> (x2==x4) || (x3 x4)
res = insert 1 empty 2

translates to

apply :: Eq a1 => Arrow a1 a2 -> a1 -> a2
apply = \ f -> \ arg ->
  case f of
    X1 -> False
    X2 -> let x = arg in X3 x
    (X3 x2) -> let x3 = arg in X4 x2 x3
    (X4 x2 x3) -> let x4=arg in (x2==x4) || (apply x3 x4)

emptyDF = X1
insertDF = X2
resDF = apply (apply (apply insertDF 1) emptyDF) 2

where we assume

data Arrow :: * -> * -> * where
 X1 :: Arrow a1 Bool
 X2 :: Arrow a1 (Arrow (Arrow a1 Bool) (Arrow a1 Bool))
 X3 :: a -> Arrow (Arrow a Bool) (Arrow a Bool)
 X4 :: a1 -> Arrow a1 Bool -> Arrow a1 Bool

Defunctionalization Example (2)

apply :: Eq a1 => Arrow a1 a2 -> a1 -> a2
apply = \ f -> \ arg ->
  case f of
    X1 -> False
    X2 -> let x = arg in X3 x
    (X3 x2) -> let x3 = arg in X4 x2 x3
    (X4 x2 x3) -> let x4=arg in (x2==x4) || (apply x3 x4)

emptyDF = X1
insertDF = X2
resDF = apply (apply (apply insertDF 1) emptyDF) 2

Evaluate the defunctionalized program

apply (apply (apply insertDF 1) emptyDF) 2
-- apply insertDF 1 = X3 1
= apply (apply (X3 1) emptyDF) 2
-- apply (X3 1) emptyDF = X4 1 X1
= apply (X4 1 X1) 2
= 1 == 2 || (apply X1 2)
= apply X1 2
= False

Summary

Lazy evaluation (one = 1 : one)
Higher-order functions
- Lambda abstractions (\x -> ...)
- Currying (f x y versus f (x,y))
- Partial function application (map (+1))
- Continuation-passing style (never return)
- From higher-order to first-order programs

Haskell: Evaluation Order and Higher-order Functions