Martin Sulzmann
Rust the better C?
Statically typed/Simple form of type inference
Higher-order functions
Data types and pattern matching
User-controllable overloading
Deterministic memory management
Stack allocated by default
Heap allocations must be be explicit
Static ownership type system to guarantee there are no memory errors
fn inc2(x : i32) -> i32 {
let y : i32 = x; // Local variables
return y+1;
}
fn inc3(x : i32) -> i32 {
let y = x; // Type inference for local variables
return y+1;
}
fn inc4(x : i32) -> i32 {
let y = x; // Variables are immutable by default
y = y + 1; // Yields compiler error "cannot assign twice to immutable variable `y`"
return y;
}
fn inc5(x : i32) -> i32 {
let mut y = x;
y = y + 1; // Mutable variables must be declared explicitly
return y;
}
fn lambda1() {
let one = 1;
let inc = |x : i32| -> i32 { let y = x+one; return y; };
println!("\n{}",inc(1));
}
fn lambda2() {
let add = |x : i32, y : i32| -> i32 { return x + y; };
println!("\n{}",add(1,2));
// Multiple arguments. Type inference for return types.
let plus = |x : i32| { return move |y : i32| { return x + y; }; };
println!("\n{}",plus(1)(2));
}
Outer lambda “owns” x
Inner lambda “borrows” x
May call inner lambda multiple times (partial function application)
Can lead to “tricky” errors, see lambda1b
Rust has call-by-value semantics like C++. For managing heap allocated memory, Rust uses a C++ style copy/move semantics. The difference to C++ is that the Rust compiler checks that there cannot be any memory violations.
Strings can grow and shrink (=> strings have a dynamic size)
When calling a function, we pass a reference to the string (not the string itself)
Variable s
holds the reference to the
string
When calling str
, parameter x
obtains
the reference to the string
fn test2() {
let s = String::from("12345");
str(s);
str(s); // Compiler error "use of moved value: `s`"
}
The first call str(s)
copies the reference to the
string
From C++ we know that this can be dangerous (see “double delete” problem)
Rust performs a move like C++
The difference to C++ is that the Rust compiler (type system)
checks that (after the move) we no longer access variable
s
Rust uses a refined type system to carry out ownership checks.
fn test2() {
let s = String::from("12345"); // s : String_Owner
str(s); // s : String_Not_Owner
str(s); // Type error invalid access of s
}
Like in C++, we can explicitly apply the copy semantics by using the clone method.
fn test3() {
let s = String::from("12345"); // s : String_Owner
str(s.clone());
let s2 = s.clone(); // s2 : String_Owner
str(s2);
println!("{}",s);
}
Always cloning an object seems overkill. In particular, if only
require read access like in case of function
str
.
There is a way out here.
fn strB(x : &String) { // "borrow" Annotation
println!("{}",x);
}
fn test4() {
let s = String::from("12345");
strB(&s); // "borrow" Annotation
strB(&s);
}
Borrow access to the string
Requires explicit annotations
Borrow annotations guarantee that the object can only be read.
fn strB2(x : &String) {
x.push('a'); // Yields compiler error
println!("{}",x);
}
fn str3(x : String) {
let mut y = x;
y.push('a'); // Okay
println!("{}",y);
}
Java, Go, … automatic garbage collection
RAII in C++ (double delete issue remains)
C++11 copy/move semantics solve some of the problem (but we may still encounter run-time failure in case we access a “moved” object)
Rust uses a static ownership type system (checked by the compiler)
Rust guarantees that no memory violations occur at run-time
We show how to implement a stack-based virtual machine (VM) to support a simple expression language.
We first consider the virtual machine. We make use of Rust enums to represent VM code.
Rust enums can take arguments (identified via a label/attribute name).
We write an interpreter where we make use of Rust vectors to (a) represent a sequence of VM instructions (codes) and (b) represent the stack to carry out the computation of VM instructions.
fn is_some<T>(x : Option<T>) -> T {
match x {
Some(v) => { return v; }
_ => { panic!(); }
}
}
fn run(op_codes : &Vec<OpCode>) -> i32 {
let mut st = Vec::new();
for op in op_codes.iter() {
match op {
OpCode::PUSH { val } => { st.push(*val); }
OpCode::PLUS => {
match st.pop() {
Some(v_right) => {
match st.pop() {
Some(v_left) => { st.push(v_left+v_right); }
_ => { panic!(); }
}
}
_ => { panic!(); }
}
}
OpCode::MULT => {
// Shorter code using the helper function.
let v_right = is_some(st.pop());
let v_left = is_some(st.pop());
st.push(v_left*v_right);
}
}
}
let res = is_some(st.pop());
return res;
}
The vector data type acts like a stack by using operations
push
and pop
.
A pop
operation either yields some result or none.
We can observe the various shapes a data type can have by using a form of pattern matching.
Consider the helper function
where the option type is defined as follows
The pattern cases connected to a match
expression allows
us to observe the various shapes of values.
For example, the pattern case
applies if there is “some” value. Pattern variable v
refers to the underlying value.
The pattern case _
refers to a “don’t care” pattern and
always applies.
Keep in mind that pattern cases are tried in textual order (from top to bottom). Hence, the don’t care case always comes last.
fn test_vm() {
let mut os = Vec::new();
os.push(OpCode::PUSH{ val : 1 });
os.push(OpCode::PUSH{ val : 2 });
os.push(OpCode::PUSH{ val : 3 });
os.push(OpCode::MULT);
os.push(OpCode::PLUS);
println!("{}", run(&os));
}
Next, we consider expressions that are composed of integers, addition and multiplication. We again make use of Rust enums to represent the abstract syntax of our expression language. What abstract means will be explained shortly.
pub enum Exp {
Int {
val: i32
},
Plus {
left: Box<Exp>,
right: Box<Exp>
},
Mult{
left: Box<Exp>,
right: Box<Exp>
},
}
Box
required
Otherwise, recursive definition (without indirection)
Rust enums are a well-known concept and commonly referred to as algebraic data types.
For example, the expression 1 * (2 + 3)
can be
represented as follows
let e2 = Exp::Mult{left : Box::new(Exp::Int { val : 1 }),
right : Box::new(Exp::Plus{left : Box::new(Exp::Int { val : 2 }),
right : Box::new(Exp::Int { val : 3})})};
In terms of some concrete syntax
written 1 * (2 + 3)
where where parentheses are necessary
to capture the precedence among (sub)expressions.
In the abstract syntax
representation we use Rust enum constructors Int
,
Plus
and Mult
and parentheses are omitted.
We make use of pattern matching to implement an evaluator for our expressions.
fn eval(e : &Exp) -> i32 {
match e {
Exp::Int { val } => return *val,
Exp::Plus { left, right } => return eval(left) + eval(right),
Exp::Mult { left, right } => return eval(left) * eval(right),
}
}
The second case Exp::Plus { left, right }
refers to an
expression that is composed using +
where (pattern)
variables left
and right
refer to the
operands.
fn test_exp() {
// 1 + (2 * 3)
let e = Exp::Plus{left : Box::new(Exp::Int { val : 1 }),
right : Box::new(Exp::Mult{left : Box::new(Exp::Int { val : 2 }),
right : Box::new(Exp::Int { val : 3})})};
println!("{}", eval(&e));
// 1 * (2 + 3)
let e2 = Exp::Mult{left : Box::new(Exp::Int { val : 1 }),
right : Box::new(Exp::Plus{left : Box::new(Exp::Int { val : 2 }),
right : Box::new(Exp::Int { val : 3})})};
println!("{}", eval(&e2));
}
We write a compiler from Exp
to VM instructions
(Vec<OpCode>
).
fn compile(e : &Exp) -> Vec<OpCode> {
match e {
Exp::Int { val } => {
let mut os = Vec::new();
os.push(OpCode::PUSH { val : *val });
return os;
}
Exp::Plus { left, right } => {
let mut o1 = compile(left);
let mut o2 = compile(right);
o1.append(&mut o2);
let mut o3 = Vec::new();
o3.push(OpCode::PLUS);
o1.append(&mut o3);
return o1;
}
Exp::Mult { left, right } => {
let mut o1 = compile(left);
let mut o2 = compile(right);
o1.append(&mut o2);
let mut o3 = Vec::new();
o3.push(OpCode::MULT);
o1.append(&mut o3);
return o1;
}
}
}
append
“moves” the to be appended list
This requires the addition of &mut
in several
places.
The complete source can be found here.
Traits are a systematic method to support user-definable overloading.
Traits are not classes and they only vaguely resemble interfaces.
#[derive(Clone)]
struct Square {
x : i32
}
fn area_sq(s : Square) -> i32 {
return s.x * s.x;
}
fn sq_test() {
let s = Square{ x : 2 };
let s2 = s.clone();
println!("{}", area_sq(s2));
println!("{}", area_sq(s));
}
clone
is an overloaded method
We can automatically derive an instance for
Square
#[derive(Clone,Copy)]
struct Rectangle {
x : i32,
y : i32
}
fn area_rec(r : Rectangle) -> i32 {
return r.x * r.y;
}
fn rec_test() {
let r = Rectangle{ x : 2, y : 4 };
let r2 = r;
println!("{}", area_rec(r2));
println!("{}", area_rec(r));
}
copy
applied in case of assignment and function
calls is another overloaded method
We can automatically derive copy and clone instances for
Rectangle
The program text let r2 = r;
translates to
let r2 = r.clone();
Traits describe a collection of methods that share the same type.
Via some clever sugared syntax, Rust makes them look like “objects”.
trait Shape {
fn area(s : &Self) -> i32;
}
impl Shape for Rectangle {
fn area(r : &Rectangle) -> i32 {
return r.x * r.y;
}
}
impl Shape for Square {
fn area(s : &Square) -> i32 {
return s.x * s.x;
}
}
Self
refers to the shared type
Self
is an implicit parameter of the trait
Shape
In essence, the trait Shape
represents a predicate
Shape(Self)
We use borrowing here to avoid copying
Summing up the area of two geometric objects.
fn sum_area<A:Shape,B:Shape>(x : &A, y : &B) -> i32 {
return Shape::area(x) + Shape::area(y);
}
fn test_area() {
let r = Rectangle{x : 1, y : 2};
let s = Square{x : 3};
println!("{}",sum_area(&r,&s));
}
A:Shape
is a type constraint and states that
x
implements the trait Shape
In standard logic notation written Shape(A)
Method calls are written Shape::area(x)
Rust introduces some syntactic sugar to make look method calls “nicer”.
Here’s the above example written using Rust’s “method” notation.
trait Shape2 {
fn area2(&self) -> i32;
}
impl Shape2 for Rectangle {
fn area2(&self) -> i32 {
return self.x * self.y;
}
}
impl Shape2 for Square {
fn area2(&self) -> i32 {
return self.x * self.x;
}
}
fn sum_area2<A:Shape2,B:Shape2>(x : &A, y : &B) -> i32 {
return x.area2() + y.area2();
}
fn test_area2() {
let r = Rectangle{x : 1, y : 2};
let s = Square{x : 3};
println!("{}",sum_area2(&r,&s));
}
Traits may seem like interfaces but they are not.
Interfaces
Interfaces are types
A value of an interface type implements the methods described by the interface
Traits
Traits describe a collection of methods that share the same type
Traits are type constraints
In Rust, we can represent interfaces via dynamic traits.
fn sum_area3(x : Box<dyn Shape2>, y : Box<dyn Shape2>) -> i32 {
return x.area2() + y.area2();
}
fn test_area3() {
let r = Box::new(Rectangle{x : 1, y : 2});
let s = Box::new(Square{x : 3});
println!("{}",sum_area3(r,s));
}
dyn Shape2
effectively represents an
interface
In logical notation, dyn Shape2
corresponds to
exists A. Shape2(A) => A
What does Box
mean?
The memory space of an interface cannot be statically computed
It can either be a Rectangle or a Square (the size of the memory space occupied is different)
Hence, we need to allocate the (interface) value on the heap.
In Rust, this is reflected in the type via
Box
The complete source can be found here.
We have seen:
Ownership and borrowing
Data types and pattern matching
Overloading via traits
How to describe Rust?
Rust = Haskell with deterministic memory management
Learn Haskell!
Makes it easier to comprehend Rust (and other languages such as Go, …).