Martin Sulzmann
Try out the runnable Go code that implements roman numbers.
Two roman numerals (represented as strings) enter the arena. The larger numeral is the “Winner”. If they are equal, we have a “Draw”.
Your task is to implement the following function.
type Result int
const (
Failure Result = 0
Winner Result = 1
Draw Result = 2
)
func battle(a, b string) (string, Result)
To implement this function make use of parse
and
eval
. Case “Failure” arises if at least one of the strings
is ill-formed.
Given a list of roman numerals (represented as strings), return them sorted by their numeric value.
Your task is to implement the following function.
Failure (“false”) arises if at least one of the strings is ill-formed.
Hints.
Make sure that all strings are well-formed.
Use Go’s sort.Slice
function where we provide an
implementation of less
based on parse
+
eval
.
Implement some (stronger) well-formedness rules. For example, there
shall be no more than four I
s in a row. Modify
parse
to reject ill-formed inputs.
Your task is to implement the following function.
Hints.
Here some helper functions that you might consider useful.
// Check if p is a prefix of xs.
func preFixMatch[T comparable](xs []T, p []T) bool {
if len(xs) < len(p) {
return false
}
for i := 0; i < len(p); i++ {
if xs[i] != p[i] {
return false
}
}
return true
}
// Check if p is occurs in xs.
func hasSubPattern[T comparable](xs []T, p []T) bool {
// Empty pattern always matches.
if len(p) == 0 {
return true
}
for i := 0; i < len(xs); i++ {
// Start of pattern found
if xs[i] == p[0] {
if preFixMatch(xs[i:], p) {
return true
}
// No match, keep looking for start
}
}
return false
}
The helper functions make use of generics. The type abstraction
[T comparable]
states that we introduce a generic
(universally quantified) type parameter T
where each
instance of T
must be comparable
. See the use
of “!=” and “==” in the above program code.
Your task is to write a function that turns any integer between 0-3999 into a roman numeral.
A simple solution would be to represent integer n by repeating
I
n times.
A less naive method would (greedily) subtract the “largest” roman numeral repeatedly. For example, consider input n = 1900.
1. We start with the empty sequence of roman numerals.
2. From n = 1900 we subtract 1000 and append M.
3. From n = 900 = 1900-1000 we subtract 500 and append D.
4. From n = 400 = 900-500 we subtract 100 and append C.
5. From n = 300 = 400-100 we subtract 100 and append C.
6. From n = 200 = 300-100 we subtract 100 and append C.
6. From n = 100 = 200-100 we subtract 100 and append C.
We are done.
The resulting sequence of roman numerals is [M,D,C,C,C,C].
The issue is that there is a more “compact” representation for 1900. For example, eval([M,C,M]) => 1900.
Instead of subtracting the largest numeral, we subtract a sequence of numerals. For example, in step 3 we subtract 900 represented by [C,M].
We generalize this idea by building the following pairs consisting of a value and its associated sequence of roman numerals.
(1000, [M])
(900, [C,M])
(500, [D])
(400, [C,D])
(100, [C])
(90, [X,C])
(50, [L])
(40, [X,L])
(10, [X])
(9, [I,X])
(5, [V])
(4, [I,V])
(1, [I])
Consider any two consecutive pairs
(n1, p1)
(n2, p2)
where ni is a number and pi is sequence of roman letters such that eval(pi) => ni.
We find that n1 < n2 and there is no other pair (n,p) such that (a) n1 < n < n2 and (b) p consists of at most two roman letters.
Condition (b) is important. For example, the pair (950, [C,M,L]) fits in between (1000, [M]) and (900, [C,M]) but [C,M,L] consists of more than two roman letters.
We refine the greedy subtraction method as follows.
1. Given some n in between 0 and 3999.
2. Pick the largest pair (k,p) where k <= n.
3. Append p to the output, and set n = n - k.
4. Repeat until n = 0.
By using the above pairs we can guarantee that the resulting roman numeral is in canonical form. A roman numeral [x1,…,xn] in canonical canonical form there is no sub-sequence [xi,…,xi+3] such that xi, xi+1, xi+2, xi+3 denote the same roman letter. In words, the canonical form guarantees that there are at most three consecutive equal letters.
Implement the following function to generate a roman number in canonical form.
Hints. Below you find a sketch.
var numerals = []struct {
value int
sym []RomanLetter
}{
{1000, []RomanLetter{M}},
{900, []RomanLetter{C,M}},
{500, []RomanLetter{D}},
{400, []RomanLetter{C,D}},
{100, []RomanLetter{C}},
{90, []RomanLetter{X,C}},
{50, []RomanLetter{L}},
{40, []RomanLetter{X,L}},
{10, []RomanLetter{X}},
{9, []RomanLetter{I,X}},
{5, []RomanLetter{V}},
{4, []RomanLetter{I,V}},
{1, []RomanLetter{I}},
}
// There are more efficient ways to concatenate two strings.
func concat[T any](x, y []T) []T {
if len(y) == 0 {
return x
}
return concat(append(x,y[0]),y[1:])
}
func toRoman(n int) ([]RomanLetter,bool) {
if n <= 0 || n > 3999 {
return []RomanLetter{},false // undefined outside standard range
}
var ls []RomanLetter
for _, pair := range numerals {
for n >= pair.value {
// TODO
}
}
return ls,true
}
The helper function concat
is another example of a
generic function where we abstract over type variable T
.
The type bound any
is equivalent to the empty interface
(interface{}
). In Java, any
corresponds to
Object
. Clearly, any type T
satisfies the
any
type bound.
Extend the implementation with an additional roman letter
N
standing for ten thousand. Summarize which code parts are
affected.
Consider an extension of the interface-based roman numbers implementation. Are there any differences compared to the “other” extension?
The eval
function is implemented recursively. Can this
lead to issues?
Rewrite eval
iteratively (no recursion). We refer to the
resulting version as evalIter
. Think about how to check
that eval
and evalIter
behave the same.
Prove that evaluation of roman numbers always yields a positive number.
Lemma A: For any roman number [x1,...,xn] we find that [x1,...,xn] => n where n >= 0.
Proof. By induction on the sequence of roman numbers.
Case [x]:
[x] => val(x)
val(x) >= 0
Hence, the statement holds.
Case [x1,x2,...,xn]:
We assume that [x2,...,xn] => n.
(1) By induction we find that n >= 0.
Subcase x1 >= x2:
This means that val(x1) >= val(x2).
By application of the semantic rules we find that [x1,x2,...,xn] => val(x1)+n
We find that val(x1)+n >= 0 because val(x1) >= 0.
Subcase x1 < x2:
(2) This means that val(x1) < val(x2).
By application of the semantic rules we find that [x1,x2,...,xn] => n - val(x1)
We need to show that n - val(x1) >= 0.
At this point we are stuck.
We have that n >= 0 and val(x1) < val(x2).
But this is not enough to show that n - val(x1) >= 0.
Lemma B: Let x and y be any two roman letters where val(x) > val(y).
Then, we find that val(x) >= 2 * val(y).
Proof. By case analysis.
Consider y = I. Take x = V. We have that val(y) = 1 and val(x) = 5.
We find that val(x) >= 2 * val(y).
The result holds for x = X and so on.
Consider y = V. Take x = X. We have that val(y) = 5 and val(x) = 10.
We find that val(x) >= 2 * val(y).
The result holds for x = L and so on.
Consider y = X. Take x = L. We have that val(y) = 10 and val(x) = 50.
We find that val(x) >= 2 * val(y).
The result holds for x = C and so on.
Consider y = L. Take x = C. We have that val(y) = 50 and val(x) = 100.
We find that val(x) >= 2 * val(y).
The result holds for x = D and so on.
Consider y = C. Take x = D. We have that val(y) = 100 and val(x) = 500.
We find that val(x) >= 2 * val(y).
The result holds for x = M and so on.
Consider y = D. Take x = M. We have that val(y) = 500 and val(x) = 1000.
We find that val(x) >= 2 * val(y).
The result holds for x = M and so on.
The case y = M does not apply.
QED.
Lemma C: For any roman number [x1,...,xn] we find that [x1,...,xn] => n where n >= val(x1).
Proof. By induction on the sequence of roman numbers.
Case [x]:
[x] => val(x)
val(x) >= val(x)
Hence, the statement holds.
Case [x1,x2,...,xn]:
We assume that [x2,...,xn] => n.
(1) By induction we find that n >= val(x2).
Subcase x1 >= x2:
This means that val(x1) >= val(x2).
By application of the semantic rules we find that [x1,x2,...,xn] => val(x1)+n
We find that val(x1)+n >= val(x1) because n >= val(x2) >= 0 via (1).
Subcase x1 < x2:
(2) This means that val(x1) < val(x2).
By application of the semantic rules we find that [x1,x2,...,xn] => n - val(x1)
We need to show that (3) n - val(x1) >= val(x1) iff n >= 2 * val(x1).
By induction we have that n >= val(x2).
So, we need to show that (4) val(x2) >= 2 * val(x1).
In fact, (4) holds for all possible combinations that satisfy (2).
See Lemma B.
Hence, (3) holds and we are done.
QED.
Here’s the desired result.
Proposition: For any roman number [x1,...,xn] we find that [x1,...,xn] => n where n >= 0.
Proof:
We apply Lemma B and find that n >= val(x1).
For any roman letter we find that val(x1) > 0.
Hence, n >= 0.
QED.
Proposition. For every nonempty Roman sequence s (sequence of letters), eval(s) > 0.
(We can even say ≥ 1 since val(letter) ≥ 1 for every letter.)
Proof (by minimal counterexample).
Suppose the proposition is false. Then there exists a nonempty sequence s with eval(s) ≤ 0. Choose such an s of minimal length (so every shorter nonempty sequence evaluates to > 0).
Since every single-letter sequence evaluates to val(letter) ≥ 1, the minimal counterexample must have length ≥ 2. Write s = x1 x2 ... xn and let t = x2 ... xn (the tail). By minimality of s we have eval(t) > 0 (because t is shorter than s).
Now consider the first two symbols x1 and x2. Two cases:
1. Case val(x1) ≥ val(x2): then by the semantics
eval(s) = val(x1) + eval(t).
Both summands are strictly positive (val(x1) ≥ 1 and eval(t) > 0), so eval(s) > 0. This contradicts the assumption that s is a counterexample.
2. Case val(x1) < val(x2): then by the semantics
eval(s) = eval(t) - val(x1).
We already know eval(t) > 0. If eval(t) - val(x1) ≤ 0 (so s would be a counterexample), then
eval(t) ≤ val(x1) < val(x2).
Now look inside t. We have t = x2 u. Consider evaluating t. There are again two subcases for x2 vs x3:
If val(x2) ≥ val(x3), then eval(t) = val(x2) + eval(u) ≥ val(x2) ≥ 1. But we just deduced eval(t) ≤ val(x1) < val(x2), a contradiction with eval(t) ≥ val(x2). So this subcase cannot occur.
Therefore val(x2) < val(x3). (If t had x2 ≥ x3, we'd get a contradiction as just shown.) Then eval(t) = eval(u) - val(x2). Repeating the same reasoning: if eval(t) ≤ val(x1) < val(x2) then eval(u) must be ≤ val(x2) and comparing val(x3) and val(x4) yields similar reasoning.
Continue this argument scanning rightwards: each time you find val(x_k) < val(x_{k+1}), the tail's evaluation equals the evaluation of a strictly shorter suffix minus val(x_k). Since the sequence is finite, this process eventually reaches the last two elements x_{n-1}, x_n. At that final step you will encounter the situation eval([x_n]) = val(x_n) which is ≥ val(x_n), but the chain of inequalities you were forced to maintain would imply val(x_n) ≤ val(x_{n-1}) (because we were maintaining eval(suffix) ≤ val(x1) < val(x2) < ...), contradicting the fact that val(x_{n-1}) < val(x_n) at that step. Concretely, the chain cannot continue forever; at the last step you get the contradiction val(x_n) ≤ something < val(x_n).
This contradiction shows the assumption eval(t) ≤ val(x1) cannot hold, so eval(t) - val(x1) > 0. Hence eval(s) > 0, a contradiction.
In both cases we reach a contradiction. Therefore no counterexample s exists. Thus every nonempty sequence evaluates to a strictly positive integer.
QED.