Lambda calculus

The lambda calculus is a minimal core language consisting solely of functions.

Functions are first-class (can appear as arguments and as return values). Think of first-class objects!

Lambda calculus: Syntax and semantics

Terms (expressions/programs) in the lambda calculus are defined by the following (abstract) EBNF syntax.

$t \ \ ::= \ \ x \ \ \mid \ \ \lambda x.t \ \ \mid \ \ t \ t$

where

$x$ refers to a variable
$\lambda x.t$ refers to a lambda abstraction
$t \ t$ refers to a function application

The above EBNF is ambiguous. For example, consider the expression $\lambda x.x \ x$. Based on the above EBNF rules there are two possible interpretations:

A lambda function with formal parameter $x$ where $x$ is applied to itself in the function body.
The identity function $\lambda x.x$ applied to $x$ .

The first interpretation can be explained via the following (parse tree) derivation $t \rightarrow \lambda x.t \rightarrow \lambda x. t \ t \rightarrow \lambda x. x \ t \rightarrow \lambda x.x \ x$

whereas for the second derivation we find $t \rightarrow t \ t \rightarrow (\lambda x. t) \ t \rightarrow (\lambda x.x) \ t \rightarrow (\lambda x.x) \ x$

In the above derivations, parentheses "(..)" are meta-symbols to highlight the difference between the two derivations.

In examples, we therefore may make use of parentheses to avoid ambiguities, for example $(\lambda x.x) \ x$.
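The two readings can be made concrete by representing terms as trees. The following Python sketch is one possible encoding; the class names `Var`, `Lam`, `App` and the helper `show` are illustrative choices, not part of the calculus. Each class mirrors one alternative of the grammar.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:            # x
    name: str

@dataclass(frozen=True)
class Lam:            # λx.t
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:            # t t
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def show(t: Term) -> str:
    # Render a term fully parenthesized, so that no ambiguity remains.
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Lam):
        return f"(λ{t.param}. {show(t.body)})"
    return f"({show(t.fun)} {show(t.arg)})"

# The two interpretations of  λx.x x  are different trees.
self_app = Lam("x", App(Var("x"), Var("x")))       # x applied to itself in the body
id_applied = App(Lam("x", Var("x")), Var("x"))     # identity function applied to x

print(show(self_app))
print(show(id_applied))
```

Because the tree fixes the structure, parentheses are only needed when a term is written down as text.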

The operational semantics of the lambda calculus is defined via a single rule, commonly referred to as $\beta$ -reduction.

$(\lambda x. t_1) \ t_2 \rightarrow [t_2/x]t_1$

In $[t_2/x]t_1$, we apply the substitution $[t_2/x]$: In the body $t_1$, replace each free occurrence of the formal parameter $x$ by the argument $t_2$.

Commonly, we refer to $\lambda x.t$ as a lambda-abstraction and to $\lambda x$ as the lambda-binding.

For example, $(\lambda x. (\lambda y. y)) \ z$ reduces to $\lambda y.y$ .

As is common in programming languages, we can reuse the same variable names and apply the usual rule that each use of a variable name refers to the closest binding site. For example, consider $\lambda x. (\lambda x. x)$ where the innermost occurrence of $x$ refers to the inner lambda-binding.

Via some simple renaming we can always guarantee that bound variables are distinct. For example, $\lambda x. (\lambda x. x)$ can be renamed to $\lambda x. (\lambda y. y)$.

We generally assume that lambda-bindings always introduce fresh, distinct variables.

Details

Syntactic sugar

Function application is left associative. Hence, the term $(\lambda u.u) \ z \ x$ is a short-hand for $(((\lambda u.u) \ z) \ x)$ .

The body of a lambda abstraction extends to the right as far as possible. Hence, the term $\lambda z. (\lambda u.u) \ z \ x$ is a short-hand for $\lambda z. (((\lambda u.u) \ z) \ x)$.

Free variables

$fv(t)$ computes all free variables in $t$ , i.e. those variables which are not bound by a lambda abstraction.

The definition by induction over the structure of $t$ is as follows:

$fv(x) = \{ x \}$
$fv(\lambda x.t) = fv(t) - \{ x \}$
$fv(t_1 \ t_2) = fv(t_1) \cup fv(t_2)$
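The three equations translate almost literally into a recursive function. The sketch below assumes a tree representation of terms with the illustrative class names `Var`, `Lam` and `App`; Python sets stand in for the sets of variable names.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def fv(t: Term) -> set:
    if isinstance(t, Var):
        return {t.name}                 # fv(x) = {x}
    if isinstance(t, Lam):
        return fv(t.body) - {t.param}   # fv(λx.t) = fv(t) - {x}
    return fv(t.fun) | fv(t.arg)        # fv(t1 t2) = fv(t1) ∪ fv(t2)

# λy.(λx.x) y x : the last x is free, every other occurrence is bound.
example = Lam("y", App(App(Lam("x", Var("x")), Var("y")), Var("x")))
free = fv(example)
```

For the example term, only the final occurrence of `x` lies outside every binding of `x`, so `free` contains just that name.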

In case of name clashes we must rename bound variables. Consider $[y \ (\lambda x.x)/x] \lambda y.(\lambda x.x) \ y \ x$ :

Renaming yields: $[y \ (\lambda v.v)/x] \lambda z. (\lambda u.u) \ z \ x$
Substitution without name clashes yields: $\lambda z. (\lambda u.u) \ z \ (y \ (\lambda v.v))$

Substitution

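The substitution $[e_1/x]$ is defined by induction over the structure of the term it is applied to (where $e$ refers to a term/expression):

$[ e_1/x ] x = e_1$
$[ e_1/x ] y = y$ if $y \neq x$
$[ e_1 / x ] (e_2 \ e_3) = ([ e_1 /x ] e_2) \ ([ e_1 /x ] e_3)$
$[ e_1 / x ] \lambda y.e_2 = \lambda y. [ e_1 /x ] e_2$ if $y \neq x$ and $y\not\in fv(e_1)$

If the side condition of the last rule does not hold, we first rename the bound variable $y$.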
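As a sketch of how these rules can be turned into a program, the following Python function performs substitution and renames a bound variable whenever the side condition $y \not\in fv(e_1)$ would be violated. The class names, the helper `fresh`, and its naming scheme `v0`, `v1`, ... are assumptions made for illustration. The case where the bound variable equals $x$ is handled by leaving the abstraction unchanged, since $x$ then has no free occurrence in it.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def fv(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return fv(t.body) - {t.param}
    return fv(t.fun) | fv(t.arg)

def fresh(avoid):
    # Hypothetical naming scheme: the first of v0, v1, ... not in `avoid`.
    return next(f"v{i}" for i in count() if f"v{i}" not in avoid)

def subst(e1, x, t):
    # Computes [e1/x]t.
    if isinstance(t, Var):
        return e1 if t.name == x else t
    if isinstance(t, App):
        return App(subst(e1, x, t.fun), subst(e1, x, t.arg))
    if t.param == x:
        return t                      # x is rebound here: nothing to replace
    if t.param in fv(e1):
        # Name clash: rename the bound variable before substituting.
        new = fresh(fv(e1) | fv(t.body) | {x})
        renamed = subst(Var(new), t.param, t.body)
        return Lam(new, subst(e1, x, renamed))
    return Lam(t.param, subst(e1, x, t.body))

# [y (λx.x) / x] λy.(λx.x) y x
identity = Lam("x", Var("x"))
e1 = App(Var("y"), identity)
t = Lam("y", App(App(identity, Var("y")), Var("x")))
result = subst(e1, "x", t)
```

Here the bound `y` is renamed to `v0`, so the free `y` in the argument is not captured. The result corresponds to $\lambda z. (\lambda u.u) \ z \ (y \ (\lambda v.v))$ up to the choice of bound names.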

Evaluation strategies

There are two principal ways to evaluate terms:

Innermost first
Outermost first

Innermost

Look for a redex innermost, leftmost
Also called Applicative Order Reduction (AOR)
Redex = reducible expression of the form $(\lambda x. t_1) \ t_2$

Call-by-value (CBV) = AOR with the exception that we don't evaluate under "lambda" (i.e. inside function bodies)

Outermost

Outermost, leftmost, also called Normal Order Reduction (NOR)

Call-by-name (CBN) = NOR with the exception that we don't evaluate under "lambda" (i.e. inside function bodies)
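The difference between the two strategies can be sketched as two one-step functions over term trees. This is a minimal illustration under stated assumptions: the names `step_cbn`, `step_cbv` and `evaluate` are made up for this sketch, values are taken to be lambda abstractions, and the substitution is naive, which is only safe because the arguments used here are closed terms.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

Term = Union[Var, Lam, App]

def subst(e, x, t):
    # [e/x]t, naive version: assumes e is closed, so no capture can occur.
    if isinstance(t, Var):
        return e if t.name == x else t
    if isinstance(t, Lam):
        return t if t.param == x else Lam(t.param, subst(e, x, t.body))
    return App(subst(e, x, t.fun), subst(e, x, t.arg))

def step_cbn(t):
    # One call-by-name step, or None. Never looks under a lambda.
    if isinstance(t, App):
        if isinstance(t.fun, Lam):
            return subst(t.arg, t.fun.param, t.fun.body)   # argument passed unevaluated
        f = step_cbn(t.fun)
        if f is not None:
            return App(f, t.arg)
    return None

def step_cbv(t):
    # One call-by-value step, or None. Never looks under a lambda.
    if isinstance(t, App):
        f = step_cbv(t.fun)
        if f is not None:
            return App(f, t.arg)
        a = step_cbv(t.arg)
        if a is not None:
            return App(t.fun, a)
        if isinstance(t.fun, Lam) and isinstance(t.arg, Lam):
            return subst(t.arg, t.fun.param, t.fun.body)   # argument is a value
    return None

def evaluate(step, t):
    # Apply `step` until it no longer applies; also count the steps.
    steps = 0
    nxt = step(t)
    while nxt is not None:
        t, steps = nxt, steps + 1
        nxt = step(t)
    return t, steps

I = Lam("x", Var("x"))
twice = Lam("f", Lam("x", App(Var("f"), App(Var("f"), Var("x")))))
term = App(twice, App(I, I))

cbn_result, cbn_steps = evaluate(step_cbn, term)
cbv_result, cbv_steps = evaluate(step_cbv, term)
```

Under these definitions CBN stops after one step with a lambda whose body still contains two copies of the unevaluated argument, whereas CBV takes two steps because it first reduces the argument to a value.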

Examples

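We consider evaluation of $(\lambda f.\lambda x. f \ (f \ x)) ((\lambda x.x) (\lambda x.x))$ where $\rightarrow$ denotes an evaluation step. To reach the normal form $\lambda x.x$, the steps below continue to reduce inside the remaining lambda body; strictly speaking, CBN and CBV stop as soon as the term is a lambda abstraction.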

Call-by-name (CBN):

$(\lambda f.\lambda x. f \ (f \ x)) ((\lambda x.x) (\lambda x.x))$

$\rightarrow \lambda x.(\lambda x.x)(\lambda x.x)((\lambda x.x)(\lambda x.x)x)$

$\rightarrow \lambda x.(\lambda x.x)((\lambda x.x)(\lambda x.x)x)$

$\rightarrow \lambda x.(\lambda x.x)(\lambda x.x)x$

$\rightarrow \lambda x.(\lambda x.x)x$

$\rightarrow \lambda x.x$

Call-by-value (CBV):

$(\lambda f.\lambda x. f \ (f \ x)) ((\lambda x.x) (\lambda x.x))$

$\rightarrow (\lambda f.\lambda x. f \ (f \ x)) (\lambda x.x)$

$\rightarrow \lambda x.(\lambda x.x) ((\lambda x.x) x)$

$\rightarrow \lambda x.(\lambda x.x)x$

$\rightarrow \lambda x.x$
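The extra work under CBN comes from copying the unevaluated argument into every use of $f$. This effect can be imitated in Python, which itself evaluates arguments call-by-value, by passing a zero-argument function (a thunk) instead of a value. The names `arg`, `twice_cbv`, `twice_cbn` and the list `evaluations` are illustrative assumptions.

```python
evaluations = []

def arg():
    # Stands for the argument (λx.x) (λx.x); records every evaluation.
    evaluations.append("arg")
    return (lambda x: x)(lambda x: x)

# CBV: the caller evaluates the argument once and passes the value.
twice_cbv = lambda f: lambda x: f(f(x))

# CBN (simulated): the caller passes the thunk itself,
# and the body forces it at each use of f.
twice_cbn = lambda f: lambda x: f()(f()(x))

r_cbv = twice_cbv(arg())("z")
cbv_count = len(evaluations)

evaluations.clear()
r_cbn = twice_cbn(arg)("z")
cbn_count = len(evaluations)
```

Both calls return the same result, but the argument is evaluated once in the CBV variant and twice in the simulated CBN variant, matching the two copies of $(\lambda x.x)(\lambda x.x)$ in the CBN derivation.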

(Non)Termination

There are lambda terms whose evaluation will not terminate:

$(\lambda x. (x \ x)) (\lambda x. (x \ x))$

$\rightarrow (\lambda x. (x \ x)) (\lambda x. (x \ x))$

$\rightarrow ...$

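CBN terminates at least as often as CBV: whenever CBV evaluation of a closed term reaches a lambda abstraction, so does CBN evaluation, but the converse does not hold.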

Let's abbreviate $(\lambda x. (x \ x)) (\lambda x. (x \ x))$ by $\Omega$ . Then, the lambda term $(\lambda x. \lambda y. x) \Omega$ terminates under CBN but not under CBV.
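The same contrast can be observed in Python, again simulating CBN with a thunk. In this sketch the function `omega` plays the role of $\Omega$; since Python has a finite call stack, non-termination shows up as a `RecursionError` rather than an endless loop. All names are illustrative.

```python
def omega():
    # (λx. x x) (λx. x x): self-application that never reaches a value.
    return (lambda x: x(x))(lambda x: x(x))

k = lambda x: lambda y: x      # λx.λy.x

# CBN (simulated): the argument is wrapped in a thunk that is never forced.
cbn_result = k(lambda: omega())

# CBV: Python evaluates omega() before calling k, which never finishes.
try:
    k(omega())
    cbv_diverged = False
except RecursionError:
    cbv_diverged = True
```

The thunked call returns immediately with a function, while the direct call exhausts the stack before `k` is ever entered.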

Short summary

CBN seems rather inefficient
But has plenty of applications
- More code reuse
- More declarative code
- Infinite, circular data structures
- ...

More details in some upcoming exercises.

Further examples

Consider $(\lambda x. x \ y) \ (\lambda x. x) \ (\lambda x. x)$ .

Function application is left-associative, therefore, the meaning of the above lambda term is $((\lambda x. x \ y) \ (\lambda x. x)) \ (\lambda x. x)$ .

Let's carry out the evaluation steps.

$((\lambda x. x \ y) \ (\lambda x. x)) \ (\lambda x. x)$

$=_{Renaming} \ \ ((\lambda x_1. x_1 \ y) \ (\lambda x_2. x_2)) \ (\lambda x_3. x_3)$

$\rightarrow \ \ ( (\lambda x_2. x_2) \ y) (\lambda x_3. x_3)$ because $(\lambda x_1. x_1 \ y) \ (\lambda x_2. x_2) \ \rightarrow \ [(\lambda x_2. x_2) / x_1 ] (x_1 \ y) \ = \ (\lambda x_2. x_2) \ y$

$\rightarrow \ \ y \ (\lambda x_3. x_3)$ because $(\lambda x_2. x_2) \ y \ \rightarrow \ [y / x_2] x_2 \ = \ y$

On the other hand, the lambda term $(\lambda x. x \ y) \ ((\lambda x. x) \ (\lambda x. x))$ evaluates to $y$ . Briefly, the evaluation steps are

$(\lambda x. x \ y) \ ((\lambda x. x) \ (\lambda x. x))$

$=_{Renaming} \ (\lambda x_1. x_1 \ y) \ ((\lambda x_2. x_2) \ (\lambda x_3. x_3))$

$\rightarrow \ (\lambda x_1. x_1 \ y) \ (\lambda x_3. x_3)$

$\rightarrow \ (\lambda x_3. x_3) \ y$

$\rightarrow \ y$

Expressiveness/Church Encoding

How expressive is the lambda calculus? As expressive as a Turing machine?

Yes! But how? The lambda calculus looks rather simple.

Indeed, but as we will show next, we can encode data types (booleans, ...), conditional expressions, and recursion.

The idea of the Church encoding (named after Alonzo Church):

Encode behavior not structure

Boolean operations

The idea:

not x = if x then false else true
x or y = if x then true else y
x and y = if x then y else false

The actual encoding:

true = $\lambda x. \lambda y.x$
false = $\lambda x. \lambda y. y$
if $e_1$ then $e_2$ else $e_3$ = $e_1 \ e_2 \ e_3$

which written in 'lambda notation' is

ite = $\lambda e_1. \lambda e_2. \lambda e_3. e_1 \ e_2 \ e_3$

Example: if true then $e_2$ else $e_3$ equals $(\lambda x. \lambda y. x) \ e_2 \ e_3$

$(\lambda x. \lambda y. x) \ e_2 \ e_3$

$\rightarrow e_2$

In detail:

$(\lambda x. \lambda y. x) \ e_2 \ e_3$

= $((\lambda x. \lambda y. x) \ e_2) \ e_3$

by adding parentheses (recall that function application is left associative)

$\rightarrow (\lambda y.e_2) \ e_3$

$\rightarrow e_2$
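Since Python has first-class functions, the encoding can be tried out directly with nested one-argument lambdas. The sketch below follows the definitions above; the trailing underscores in `not_`, `or_`, `and_` avoid Python keywords, and `to_bool` is a helper added here only to inspect results.

```python
true = lambda x: lambda y: x                        # λx.λy.x
false = lambda x: lambda y: y                       # λx.λy.y
ite = lambda e1: lambda e2: lambda e3: e1(e2)(e3)   # λe1.λe2.λe3. e1 e2 e3

# not x   = if x then false else true
not_ = lambda x: ite(x)(false)(true)
# x or y  = if x then true else y
or_ = lambda x: lambda y: ite(x)(true)(y)
# x and y = if x then y else false
and_ = lambda x: lambda y: ite(x)(y)(false)

def to_bool(b):
    # Helper for inspection only: map a Church boolean to a Python bool.
    return b(True)(False)

picked = ite(true)("e2")("e3")
```

A Church boolean is nothing but a selector: `true` returns its first argument and `false` its second, so `picked` is the then-branch. Note that Python evaluates both branches before selecting one, which is harmless here because the branches are plain values.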

Boolean operations example

We evaluate $ite \ true \ x \ y$. The expected result is $x$. Let's see!

Replace short-hands by actual definitions.

$ite \ true \ x \ y = (\lambda x. \lambda y. \lambda z. x \ y \ z) \ (\lambda x. \lambda y. x) \ x \ y$

Rename variables such that all bound variables are distinct

$(\lambda x_1. \lambda y_1. \lambda z_1. x_1 \ y_1 \ z_1) \ (\lambda x_2. \lambda y_2. x_2) \ x \ y$

Introduce parentheses

$(((\lambda x_1. \lambda y_1. \lambda z_1. (x_1 \ y_1) \ z_1) \ (\lambda x_2. \lambda y_2. x_2)) \ x) \ y$

Carry out evaluation steps

$(((\lambda x_1. \lambda y_1. \lambda z_1. (x_1 \ y_1) \ z_1) \ (\lambda x_2. \lambda y_2. x_2)) \ x) \ y$

$\rightarrow (([(\lambda x_2. \lambda y_2. x_2) / x_1] (\lambda y_1. \lambda z_1. (x_1 \ y_1) \ z_1)) \ x) \ y$

$= (((\lambda y_1. \lambda z_1. ((\lambda x_2. \lambda y_2. x_2) \ y_1) \ z_1)) \ x) \ y$

There are several redexes. We choose the outermost redex

$\rightarrow ([x / y_1] (\lambda z_1. ((\lambda x_2. \lambda y_2. x_2) \ y_1) \ z_1)) \ y$

$= (\lambda z_1. ((\lambda x_2. \lambda y_2. x_2) \ x) \ z_1) \ y$

$\rightarrow [y / z_1] (((\lambda x_2. \lambda y_2. x_2) \ x) \ z_1)$

$= ((\lambda x_2. \lambda y_2. x_2) \ x) \ y$

$\rightarrow ([x / x_2] (\lambda y_2. x_2)) \ y$

$= (\lambda y_2. x) \ y$

$\rightarrow [y / y_2] x$

$= x$

Recursion

What about recursion?

     factorial n = if n == 0 then 1
                   else n * (factorial (n-1))

Idea:

Fixpoint combinator
- $Y = \lambda F.(\lambda y.F \ (y \ y)) \ (\lambda x.F \ (x \ x))$
- We find that $Y \ F = F \ (Y \ F)$, where both sides are equal up to $\beta$-reduction
$Fac = \lambda fac. \lambda n. if \ (n==0) \ \ then \ 1 \ else \ n*(fac \ (n-1))$
$factorial = Y \ Fac$

This works!

$factorial \ 1$

$\rightarrow (def) \ (Y \ Fac) \ 1$

$\rightarrow (fix) \ (Fac \ (Y \ Fac)) \ 1$

$\rightarrow (def) \ ((\lambda fac. \lambda n. if \ (n == 0) \ then \ 1 \ else \ n*(fac \ (n-1))) \ (Y \ Fac)) \ 1$

$\rightarrow \ (\lambda n. if \ (n==0) \ then \ 1 \ else \ n*((Y \ Fac) \ (n-1))) \ 1$

$\rightarrow \ 1 * ((Y \ Fac) \ 0)$

$\rightarrow (fix) \ 1 * ((Fac \ (Y \ Fac)) \ 0)$

$\rightarrow \ 1 * ((\lambda n. if \ (n == 0) \ then \ 1 \ else \ n*((Y \ Fac) \ (n-1))) \ 0)$

$\rightarrow \ 1 * 1$

$\rightarrow \ 1$
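The derivation relies on unfolding $Y \ Fac$ only when needed, as under CBN. In a call-by-value language such as Python, $Y$ as defined above would keep unfolding forever, so the following sketch swaps in the eta-expanded variant commonly called the Z combinator. That substitution, and the use of Python's built-in numbers and conditional instead of Church encodings, are assumptions of this sketch.

```python
# Z = λF. (λx. F (λv. x x v)) (λx. F (λv. x x v))
# The extra λv delays the self-application x x until an argument arrives.
Z = lambda F: (lambda x: F(lambda v: x(x)(v)))(lambda x: F(lambda v: x(x)(v)))

# Fac = λfac. λn. if (n == 0) then 1 else n * (fac (n - 1))
Fac = lambda fac: lambda n: 1 if n == 0 else n * fac(n - 1)

factorial = Z(Fac)

results = [factorial(n) for n in range(6)]
```

`Fac` never refers to itself; the recursive call is supplied entirely by the fixpoint combinator through the parameter `fac`.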

More encodings (pairs, natural numbers) are possible. See Church encoding.
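As a brief sketch of what such encodings can look like, here are the standard Church pairs and Church numerals written as Python lambdas. The helper `to_int` and all Python names are illustrative choices; a numeral $n$ is represented as the function that applies its first argument $n$ times to its second.

```python
# Pairs: a pair is a function that hands both components to a selector.
pair = lambda a: lambda b: lambda s: s(a)(b)
fst = lambda p: p(lambda a: lambda b: a)
snd = lambda p: p(lambda a: lambda b: b)

# Natural numbers: n = λf.λx. f (f (... (f x)))  with n applications of f.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # Helper for inspection only: count the applications of f.
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
five = to_int(add(two)(three))
```

As with the booleans, the data is represented by what can be done with it: a pair knows how to feed its components to a consumer, and a number knows how to iterate a function.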