Martin Sulzmann
Need a machine-independent description of the meaning of programs. We distinguish between syntax and semantics.
The valid syntax of programs described in terms of EBNF. Not every syntactically correct program is meaningful.
We distinguish among the static and dynamic semantics of programs.
Static = Compile-time.
Dynamic = Run-time.
Static semantics = Describe the meaning of programs without actually evaluating the program (at compile-time).
Proper use of variables.
Types are compatible etc.
Dynamic semantics = Describe the run-time meaning of programs. Ignore low-level details (register allocation, etc). Sort of a reference implementation.
Various styles of formalization. Interpreter-style etc.
Predict the run-time behavior of a program.
Types classify values.
Many applications:
Compiler-optimizations
Reject invalid programs as early as possible
...
We consider Integer and Boolean expressions of the following form where we make use of the EBNF notation to describe the syntax of expressions.
N ::= ... | -1 | 0 | 1 | ...
B ::= true | false
E ::= N | B | (E) | E + E | E * E | E && E | E || E
We give an interpreter-style semantics (also know as big-step semantics) where we evaluate left and right operands to obtain the resulting value. Values are either Integer or Boolean constants
V ::= N | B
We give a formal specification of the interpreter in terms of evaluation rules of the form
E => V
The above rule reads: Expression E evaluates to the value V.
For each pattern of an expression as described by the above syntax, we define an evaluation rule.
N => N
B => B
Integer and Boolean constants evaluate to itself.
Consider addition.
We write E1+E2 to denote an expression that involves addition where E1 refers to the left operand and E2 refers to the right operand.
Evalution of E1+E2 is specified as follows.
E1 => n1
E2 => n2
n = n1 + n2
-----------------
E1 + E2 => n
Above to be read as "if-then" rule. The part above the --- is the precondition (premise) and the part below is the postcondition (conclusion).
If
E1 evaluates to some Integer n1
E2 evaluates to some Integer n2
n is the result of suming up n1 and n2
then
E1+E2 evaluates to nConsider an example where we evaluate (1+2)+(2+3).
1 => 1 2 => 2
2 => 2 3 => 3
----------- -----------
1+2 => 3 2+3 => 5
-------------------------------------
(1+2)+(2+3) => 8
We obtain an evaluation tree. Evaluation rules are syntax-directed. We traverse the structure of expressions and apply the respective rules. Leaf nodes correspond to evaluation of constants and intermediate nodes correspond to evaluation of compound expressions.
Here are some of the further evaluation rules.
E1 => false
-----------------
E1 && E2 => false
E1 => true
E2 => false
-----------------
E1 && E2 => false
E1 => true
E2 => true
-----------------
E1 && E2 => true
Points to note:
We apply short-circuit evaluation.
Evaluation may get stuck in case of ill-typed expressions.
Consider false && 0. Immediately evaluates to false.
Consider true && 0. Evaluation gets stuck!
We wish to distinguish between well-typed and ill-typed expressions. For this purpose, we employ typing rules. Typing rules are syntax-directed but unlike evaluation rules there is no need to execute programs. We approximate the run-time behavior of programs in terms of types.
In our case, we find the following language of types
T ::= Int | Bool
We write
E |- T
to denote that expression E is of type T.
For each pattern of an expression as described by the above syntax, we define a typing rule.
N |- Int
B |- Bool
E1 |- Int
E2 |- Int
-----------
E1+E2 |- Int
E1 |- Bool
E2 |- Bool
-----------
E1&&E2 |- Bool
and so on.
We implement evaluation and typing of our language of Integer and Boolean expressions.
package main
import "fmt"
import "strconv"
// Simple expression language
// Interface
type Exp interface {
pretty() string
eval() Val
infer() Type
}
// Values
type Kind int
const (
ValueInt Kind = 0
ValueBool Kind = 1
Undefined Kind = 2
)
type Val struct {
flag Kind
valI int
valB bool
}
func mkInt(x int) Val {
return Val{flag: ValueInt, valI: x}
}
func mkBool(x bool) Val {
return Val{flag: ValueBool, valB: x}
}
func mkUndefined() Val {
return Val{flag: Undefined}
}
func showVal(v Val) string {
var s string
switch {
case v.flag == ValueInt:
s = Num(v.valI).pretty()
case v.flag == ValueBool:
s = Bool(v.valB).pretty()
case v.flag == Undefined:
s = "Undefined"
}
return s
}
// Types
type Type int
const (
TyIllTyped Type = 0
TyInt Type = 1
TyBool Type = 2
)
func showType(t Type) string {
var s string
switch {
case t == TyInt:
s = "Int"
case t == TyBool:
s = "Bool"
case t == TyIllTyped:
s = "Illtyped"
}
return s
}
// Cases
type Bool bool
type Num int
type Mult [2]Exp
type Plus [2]Exp
type And [2]Exp
type Or [2]Exp
// pretty print
func (x Bool) pretty() string {
if x {
return "true"
} else {
return "false"
}
}
func (x Num) pretty() string {
return strconv.Itoa(int(x))
}
func (e Mult) pretty() string {
var x string
x = "("
x += e[0].pretty()
x += "*"
x += e[1].pretty()
x += ")"
return x
}
func (e Plus) pretty() string {
var x string
x = "("
x += e[0].pretty()
x += "+"
x += e[1].pretty()
x += ")"
return x
}
func (e And) pretty() string {
var x string
x = "("
x += e[0].pretty()
x += "&&"
x += e[1].pretty()
x += ")"
return x
}
func (e Or) pretty() string {
var x string
x = "("
x += e[0].pretty()
x += "||"
x += e[1].pretty()
x += ")"
return x
}
// Evaluator
func (x Bool) eval() Val {
return mkBool((bool)(x))
}
func (x Num) eval() Val {
return mkInt((int)(x))
}
func (e Mult) eval() Val {
n1 := e[0].eval()
n2 := e[1].eval()
if n1.flag == ValueInt && n2.flag == ValueInt {
return mkInt(n1.valI * n2.valI)
}
return mkUndefined()
}
func (e Plus) eval() Val {
n1 := e[0].eval()
n2 := e[1].eval()
if n1.flag == ValueInt && n2.flag == ValueInt {
return mkInt(n1.valI + n2.valI)
}
return mkUndefined()
}
func (e And) eval() Val {
b1 := e[0].eval()
b2 := e[1].eval()
switch {
case b1.flag == ValueBool && b1.valB == false:
return mkBool(false)
case b1.flag == ValueBool && b2.flag == ValueBool:
return mkBool(b1.valB && b2.valB)
}
return mkUndefined()
}
func (e Or) eval() Val {
b1 := e[0].eval()
b2 := e[1].eval()
switch {
case b1.flag == ValueBool && b1.valB == true:
return mkBool(true)
case b1.flag == ValueBool && b2.flag == ValueBool:
return mkBool(b1.valB || b2.valB)
}
return mkUndefined()
}
// Type inferencer/checker
func (x Bool) infer() Type {
return TyBool
}
func (x Num) infer() Type {
return TyInt
}
func (e Mult) infer() Type {
t1 := e[0].infer()
t2 := e[1].infer()
if t1 == TyInt && t2 == TyInt {
return TyInt
}
return TyIllTyped
}
func (e Plus) infer() Type {
t1 := e[0].infer()
t2 := e[1].infer()
if t1 == TyInt && t2 == TyInt {
return TyInt
}
return TyIllTyped
}
func (e And) infer() Type {
t1 := e[0].infer()
t2 := e[1].infer()
if t1 == TyBool && t2 == TyBool {
return TyBool
}
return TyIllTyped
}
func (e Or) infer() Type {
t1 := e[0].infer()
t2 := e[1].infer()
if t1 == TyBool && t2 == TyBool {
return TyBool
}
return TyIllTyped
}
// Helper functions to build ASTs by hand
func number(x int) Exp {
return Num(x)
}
func boolean(x bool) Exp {
return Bool(x)
}
func plus(x, y Exp) Exp {
return (Plus)([2]Exp{x, y})
// The type Plus is defined as the two element array consisting of Exp elements.
// Plus and [2]Exp are isomorphic but different types.
// We first build the AST value [2]Exp{x,y}.
// Then cast this value (of type [2]Exp) into a value of type Plus.
}
func mult(x, y Exp) Exp {
return (Mult)([2]Exp{x, y})
}
func and(x, y Exp) Exp {
return (And)([2]Exp{x, y})
}
func or(x, y Exp) Exp {
return (Or)([2]Exp{x, y})
}
// Examples
func run(e Exp) {
fmt.Printf("\n ******* ")
fmt.Printf("\n %s", e.pretty())
fmt.Printf("\n %s", showVal(e.eval()))
fmt.Printf("\n %s", showType(e.infer()))
}
func ex1() {
ast := plus(mult(number(1), number(2)), number(0))
run(ast)
}
func ex2() {
ast := and(boolean(false), number(0))
run(ast)
}
func ex3() {
ast := or(boolean(false), number(0))
run(ast)
}
func main() {
fmt.Printf("\n")
ex1()
ex2()
ex3()
}e ::= x | \x->e | e e AST of Lambda expressions
Implicitly typed.
We describe the dynamic semantics in terms of a big-step operational semantics.
We write e1 => e2 to denote that the lambda term e1 evaluates to e2.
x => x
\x -> e => \x -> e
e1 => \x -> e e2 => e3 [x |-> e3]e => e4
--------------------------------------------------
e1 e2 => e4
e ::= x | \x->e | e e AST of Lambda expressions
t ::= t -> t Function types
| Int Primitives
| Bool
G ::= G cup G | { x:t } Variable environment
(Var) (x:t) in G
------------
G |- x : t
(App) G |- e1 : t1 -> t2 G |- e2 : t1
--------------------------------
G |- e1 e2 : t2
(Abs) G cup { x: t1 } |- e : t2
-------------------------
G |- \x->e : t1 -> t2
G |- e : t states that under the environment G, the expression e has type t.
The dashed line separates the premise from the conclusion. If we can establish the premise, the conclusion follows.
The operator cup stands for set union.
G may contain primitives such as plus : Int -> Int -> Int etc.
Well-typed expression are guaranteed not to go wrong.
That is, there is no exceptional behavior besides cases such as div 1 0.
Such a guarantee does not apply to for example C. Why?
Well, in C and other languages we can easily bend the typing rules via type casts (either explicitly or implicitly)
Typing example \y -> \x -> x y
Here is the typing tree. It looks like an (up-side down) tree with the conclusion as the root at the bottom. Nodes are premises of typing rules. Leaves correspond to the (Var) base case.
(Var) (x : Int->Bool) in {y:Int,x:Int->Bool} (Var) (y:Int) in {y:Int,x:Int->Bool}
------------------------------------- --------------------------------
(App) {y:Int,x:Int->Bool} |- x : Int->Bool {y:Int,x:Int->Bool} |- y : Int
------------------------------------
(Abs) {y:Int,x:Int->Bool} |- x y : Bool
------------------------------------
(Abs) {y: Int} |- \x -> x y : (Int->Bool)->Bool
-----------------------------
{} |- \y -> \x -> x y : Int->((Int->Bool)->Bool)
Other typings possible?
Further example. What about {} |- \y -> \x -> y x : Int -> Int ?
Most general type? See parametric polymorphism
e ::= x | \x:t->e | e e
(Abs-Exp) G cup { x: t1 } |- e : t2
-------------------------
G |- \x:t1->e : t1 -> t2
If {} |- e : t and e => e' then {} |- e' : t
{} |- \x -> x : t -> t
where t = Int
t = Bool
t = Int -> Int
...
Let e be an explictely typed lambda expression.
Suppose we have that G |- e : t1 and G |- e : t2.
Then, t1 = t2.