Programming language semantics: Dynamic versus static semantics

Motivation

Need a machine-independent description of the meaning of programs. We distinguish between syntax and semantics.

The valid syntax of programs described in terms of EBNF. Not every syntactically correct program is meaningful.

We distinguish among the static and dynamic semantics of programs.

Static = Compile-time.

Dynamic = Run-time.

Static semantics = Describe the meaning of programs without actually evaluating the program (at compile-time).

Proper use of variables.
Types are compatible etc.

Dynamic semantics = Describe the run-time meaning of programs. Ignore low-level details (register allocation, etc). Sort of a reference implementation.

Execution rules to specify dynamic behavior

Various styles of formalization. Interpreter-style etc.

Typing rules to specify the static behavior

Predict the run-time behavior of a program.

Types classify values.

Many applications:

Compiler-optimizations
Reject invalid programs as early as possible
...

Simple expression language

We consider Integer and Boolean expressions of the following form where we make use of the EBNF notation to describe the syntax of expressions.

 N ::=  ... | -1 | 0 | 1 | ...

 B ::= true | false

 E ::= N | B | (E) | E + E | E * E | E && E | E || E

Evaluation rules

We give an interpreter-style semantics (also know as big-step semantics) where we evaluate left and right operands to obtain the resulting value. Values are either Integer or Boolean constants

 V ::= N | B

We give a formal specification of the interpreter in terms of evaluation rules of the form

 E => V

The above rule reads: Expression E evaluates to the value V.

For each pattern of an expression as described by the above syntax, we define an evaluation rule.

Integer and Boolean constants

 N => N

 B => B

Integer and Boolean constants evaluate to itself.

Compound expressions

Consider addition.

We write E1+E2 to denote an expression that involves addition where E1 refers to the left operand and E2 refers to the right operand.

Evalution of E1+E2 is specified as follows.

     E1 => n1
     E2 => n2
     n = n1 + n2
  -----------------
     E1 + E2 => n

Above to be read as "if-then" rule. The part above the --- is the precondition (premise) and the part below is the postcondition (conclusion).

E1 evaluates to some Integer n1
E2 evaluates to some Integer n2
n is the result of suming up n1 and n2

then

E1+E2 evaluates to n

Consider an example where we evaluate (1+2)+(2+3).


  1 => 1             2 => 2
  2 => 2             3 => 3
  -----------        -----------
  1+2 => 3           2+3 => 5

  -------------------------------------
  (1+2)+(2+3) => 8

We obtain an evaluation tree. Evaluation rules are syntax-directed. We traverse the structure of expressions and apply the respective rules. Leaf nodes correspond to evaluation of constants and intermediate nodes correspond to evaluation of compound expressions.

Here are some of the further evaluation rules.


   E1 => false
   -----------------
   E1 && E2 => false

   E1 => true
   E2 => false
   -----------------
   E1 && E2 => false

   E1 => true
   E2 => true
   -----------------
   E1 && E2 => true

Points to note:

We apply short-circuit evaluation.
Evaluation may get stuck in case of ill-typed expressions.

Consider false && 0. Immediately evaluates to false.

Consider true && 0. Evaluation gets stuck!

Typing rules

We wish to distinguish between well-typed and ill-typed expressions. For this purpose, we employ typing rules. Typing rules are syntax-directed but unlike evaluation rules there is no need to execute programs. We approximate the run-time behavior of programs in terms of types.

In our case, we find the following language of types

T ::= Int | Bool

We write

  E |- T

to denote that expression E is of type T.

For each pattern of an expression as described by the above syntax, we define a typing rule.

N |- Int

B |- Bool

E1 |- Int
E2 |- Int
-----------
E1+E2 |- Int

E1 |- Bool
E2 |- Bool
-----------
E1&&E2 |- Bool

and so on.

Semantics of Exp

We implement evaluation and typing of our language of Integer and Boolean expressions.

package main

import "fmt"
import "strconv"

// Simple expression language

// Interface

type Exp interface {
    pretty() string
    eval() Val
    infer() Type
}

// Values

type Kind int

const (
    ValueInt  Kind = 0
    ValueBool Kind = 1
    Undefined Kind = 2
)

type Val struct {
    flag Kind
    valI int
    valB bool
}

func mkInt(x int) Val {
    return Val{flag: ValueInt, valI: x}
}
func mkBool(x bool) Val {
    return Val{flag: ValueBool, valB: x}
}
func mkUndefined() Val {
    return Val{flag: Undefined}
}

func showVal(v Val) string {
    var s string
    switch {
    case v.flag == ValueInt:
        s = Num(v.valI).pretty()
    case v.flag == ValueBool:
        s = Bool(v.valB).pretty()
    case v.flag == Undefined:
        s = "Undefined"
    }
    return s
}

// Types

type Type int

const (
    TyIllTyped Type = 0
    TyInt      Type = 1
    TyBool     Type = 2
)

func showType(t Type) string {
    var s string
    switch {
    case t == TyInt:
        s = "Int"
    case t == TyBool:
        s = "Bool"
    case t == TyIllTyped:
        s = "Illtyped"
    }
    return s
}

// Cases

type Bool bool
type Num int
type Mult [2]Exp
type Plus [2]Exp
type And [2]Exp
type Or [2]Exp

// pretty print

func (x Bool) pretty() string {
    if x {
        return "true"
    } else {
        return "false"
    }

}

func (x Num) pretty() string {
    return strconv.Itoa(int(x))
}

func (e Mult) pretty() string {

    var x string
    x = "("
    x += e[0].pretty()
    x += "*"
    x += e[1].pretty()
    x += ")"

    return x
}

func (e Plus) pretty() string {

    var x string
    x = "("
    x += e[0].pretty()
    x += "+"
    x += e[1].pretty()
    x += ")"

    return x
}

func (e And) pretty() string {

    var x string
    x = "("
    x += e[0].pretty()
    x += "&&"
    x += e[1].pretty()
    x += ")"

    return x
}

func (e Or) pretty() string {

    var x string
    x = "("
    x += e[0].pretty()
    x += "||"
    x += e[1].pretty()
    x += ")"

    return x
}

// Evaluator

func (x Bool) eval() Val {
    return mkBool((bool)(x))
}

func (x Num) eval() Val {
    return mkInt((int)(x))
}

func (e Mult) eval() Val {
    n1 := e[0].eval()
    n2 := e[1].eval()
    if n1.flag == ValueInt && n2.flag == ValueInt {
        return mkInt(n1.valI * n2.valI)
    }
    return mkUndefined()
}

func (e Plus) eval() Val {
    n1 := e[0].eval()
    n2 := e[1].eval()
    if n1.flag == ValueInt && n2.flag == ValueInt {
        return mkInt(n1.valI + n2.valI)
    }
    return mkUndefined()
}

func (e And) eval() Val {
    b1 := e[0].eval()
    b2 := e[1].eval()
    switch {
    case b1.flag == ValueBool && b1.valB == false:
        return mkBool(false)
    case b1.flag == ValueBool && b2.flag == ValueBool:
        return mkBool(b1.valB && b2.valB)
    }
    return mkUndefined()
}

func (e Or) eval() Val {
    b1 := e[0].eval()
    b2 := e[1].eval()
    switch {
    case b1.flag == ValueBool && b1.valB == true:
        return mkBool(true)
    case b1.flag == ValueBool && b2.flag == ValueBool:
        return mkBool(b1.valB || b2.valB)
    }
    return mkUndefined()
}

// Type inferencer/checker

func (x Bool) infer() Type {
    return TyBool
}

func (x Num) infer() Type {
    return TyInt
}

func (e Mult) infer() Type {
    t1 := e[0].infer()
    t2 := e[1].infer()
    if t1 == TyInt && t2 == TyInt {
        return TyInt
    }
    return TyIllTyped
}

func (e Plus) infer() Type {
    t1 := e[0].infer()
    t2 := e[1].infer()
    if t1 == TyInt && t2 == TyInt {
        return TyInt
    }
    return TyIllTyped
}

func (e And) infer() Type {
    t1 := e[0].infer()
    t2 := e[1].infer()
    if t1 == TyBool && t2 == TyBool {
        return TyBool
    }
    return TyIllTyped
}

func (e Or) infer() Type {
    t1 := e[0].infer()
    t2 := e[1].infer()
    if t1 == TyBool && t2 == TyBool {
        return TyBool
    }
    return TyIllTyped
}

// Helper functions to build ASTs by hand

func number(x int) Exp {
    return Num(x)
}

func boolean(x bool) Exp {
    return Bool(x)
}

func plus(x, y Exp) Exp {
    return (Plus)([2]Exp{x, y})

    // The type Plus is defined as the two element array consisting of Exp elements.
    // Plus and [2]Exp are isomorphic but different types.
    // We first build the AST value [2]Exp{x,y}.
    // Then cast this value (of type [2]Exp) into a value of type Plus.

}

func mult(x, y Exp) Exp {
    return (Mult)([2]Exp{x, y})
}

func and(x, y Exp) Exp {
    return (And)([2]Exp{x, y})
}

func or(x, y Exp) Exp {
    return (Or)([2]Exp{x, y})
}

// Examples

func run(e Exp) {
    fmt.Printf("\n ******* ")
    fmt.Printf("\n %s", e.pretty())
    fmt.Printf("\n %s", showVal(e.eval()))
    fmt.Printf("\n %s", showType(e.infer()))
}

func ex1() {
    ast := plus(mult(number(1), number(2)), number(0))

    run(ast)
}

func ex2() {
    ast := and(boolean(false), number(0))
    run(ast)
}

func ex3() {
    ast := or(boolean(false), number(0))
    run(ast)
}

func main() {

    fmt.Printf("\n")

    ex1()
    ex2()
    ex3()
}

Simply-Typed Lambda Calculus

e ::= x  | \x->e | e e     AST of Lambda expressions

Implicitly typed.

Dynamic semantics

We describe the dynamic semantics in terms of a big-step operational semantics.

We write e1 => e2 to denote that the lambda term e1 evaluates to e2.

x => x

\x -> e => \x -> e


e1 => \x -> e      e2 => e3     [x |-> e3]e => e4
--------------------------------------------------
e1 e2 => e4

Typing rules (Static semantics)

 e ::= x  | \x->e | e e     AST of Lambda expressions
 t ::= t -> t               Function types
    |  Int                  Primitives
    |  Bool
 G ::= G cup G | { x:t }    Variable environment


(Var)   (x:t) in G
       ------------
        G |- x : t

(App)   G |- e1 : t1 -> t2  G |- e2 : t1
        --------------------------------
        G |- e1 e2 : t2

(Abs)   G cup { x: t1 } |- e : t2
        -------------------------
        G |- \x->e : t1 -> t2

G |- e : t states that under the environment G, the expression e has type t.
The dashed line separates the premise from the conclusion. If we can establish the premise, the conclusion follows.
The operator cup stands for set union.
Technicality, we assume that each lambda abstraction introduces a distinct variable.
- Thus, we avoid name clashes in the environment.
- The alternative would be to "overwrite" earlier definitions but it's simpler to assume distinct lambda variables and use a simple set for the environment.
G may contain primitives such as plus : Int -> Int -> Int etc.
Well-typed expression are guaranteed not to go wrong.
That is, there is no exceptional behavior besides cases such as div 1 0.
Such a guarantee does not apply to for example C. Why?
Well, in C and other languages we can easily bend the typing rules via type casts (either explicitly or implicitly)
Typing example \y -> \x -> x y

Here is the typing tree. It looks like an (up-side down) tree with the conclusion as the root at the bottom. Nodes are premises of typing rules. Leaves correspond to the (Var) base case.


   (Var)  (x : Int->Bool) in {y:Int,x:Int->Bool}    (Var) (y:Int) in {y:Int,x:Int->Bool}
         -------------------------------------       --------------------------------
  (App)  {y:Int,x:Int->Bool} |- x : Int->Bool        {y:Int,x:Int->Bool} |- y : Int
        ------------------------------------
 (Abs)  {y:Int,x:Int->Bool} |- x y : Bool
        ------------------------------------
 (Abs)  {y: Int} |- \x -> x y : (Int->Bool)->Bool
       -----------------------------
       {} |- \y -> \x -> x y : Int->((Int->Bool)->Bool)

Other typings possible?
Further example. What about {} |- \y -> \x -> y x : Int -> Int ?
Most general type? See parametric polymorphism

Explicitly typed variant

e ::= x  | \x:t->e | e e

(Abs-Exp)   G cup { x: t1 } |- e : t2
            -------------------------
            G |- \x:t1->e : t1 -> t2

Results

Type preservation:

If {} |- e : t  and e => e'  then {} |- e' : t

Infinitely many types:

{} |- \x -> x : t -> t

where t = Int
      t = Bool
      t = Int -> Int
      ...

Unique types:

Let e be an explictely typed lambda expression.
Suppose we have that G |- e : t1 and G |- e : t2.
Then, t1 = t2.