Motivating example - counting words

-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count :: String -> Int
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip :: (Char -> Bool) -> String -> String
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord   = skip (/= ' ')
skipBlanks = skip (== ' ')

$ is pre-defined of type (a -> b) -> a -> b. Instead of writing count (skipBlanks cs) we can simply write count $ skipBlanks cs (and save some parentheses).

User tests

*Main> skipWord " abc 123"
" abc 123"
*Main> skipBlanks " abc 123"
"abc 123"
*Main> count " abc 123"
2

Seems to work!?

Systematic testing

-- Boilerplate code for writing tests
type Assertion = IO ()
assertEqual :: (Eq a, Show a) => String -> a -> a -> Assertion
assertEqual what expected given =
    if expected == given
    then putStrLn ("[ok] " ++ what)
    else fail
             (what ++ ": assertEqual failed. Expected: " ++ show expected ++
               ", given: " ++ show given)

Some test cases

test_count :: Assertion
test_count = do assertEqual "Test1" 0 (count "")
                assertEqual "Test2" 1 (count "Hallo  ")
                assertEqual "Test3" 2 (count "Hallo  123")
                assertEqual "Test4" 2 (count "Hallo  123 bcd ")

Observations

Writing tests by hand is painful!

How many test cases are enough?
Need to make sure that expected results are correct!

For our example, test case “Test4” yields failure because the expected result 2 is incorrect.

Complete source code


-- Tests/Assertions in Haskell

-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord   = skip (/= ' ')
skipBlanks = skip (== ' ')


-- Boilerplate code for writing tests
type Assertion = IO ()
assertEqual :: (Eq a, Show a) => String -> a -> a -> Assertion
assertEqual what expected given =
    if expected == given
    then putStrLn ("[ok] " ++ what)
    else fail
             (what ++ ": assertEqual failed. Expected: " ++ show expected ++
               ", given: " ++ show given)


test_count :: Assertion
test_count = do assertEqual "Test1" 0 (count "")
                assertEqual "Test2" 1 (count "Hallo  ")
                assertEqual "Test3" 2 (count "Hallo  123")
                assertEqual "Test4" 2 (count "Hallo  123 bcd ")

QuickCheck - The idea

quickCheck prop = go 100
   where go 0 = putStrLn "+++ Ok"
         go n = do x <- arbitrary                        -- Generate test input data x
                   if prop x                             -- Check if x satisfies property prop
                     then go (n-1)
                     else do putStrLn "*** Failed: "
                             print x

Generate test input data x
Check if x satisfies property prop

Properties

Properties are effectively predicates.

We may also call them invariants, assertions.

For the “count” example, here is a possible property.

prop1 :: String -> Bool
prop1 s = count s >= 0

The Number of words counted must be greater or equal zero
This must hold for any string

Test data generation

For the above property we wish to generate “arbitrary” strings. How?

Random?
Via some AI?

We introduce a clever library so we can quickly generate arbitrary strings (and possibly arbitrary values for any test input type).

Type-specific test data generation

In general, a test case consists of the following.

Some input value.
Some unit on which we want to apply the input.
Check that for the given input the result matches our expectations.

In case of a unit test, we need to

manually provide the input value
manually check that the result matches our expectations.

We wish to automate the process of generating test inputs.

Introduce generator functions = Clever functions to program arbitrary distribution of test data.
Support arbitrary test data = Use type class overloading to systematically extend the set of types that are supported.

Generating arbitrary values for specific types

In the following, we give a simplified account of the Test.QuickCheck Haskell library. We refer to our own version as “QuickCheck light”.

The Arbitrary type class

class Arbitrary a where
   arbitrary :: IO a

arbitrary yields a random value of type a
The (arbitrary) computation might rely on side effects. For example, a random number generator.
Hence, the computation of arbitrary values takes place within the IO monad.

Generators

Useful functions to arbitrarily generate values.

-- | Choose one of the elements.
elements :: [a] -> IO a
elements xs = do i <- randomIO :: IO Int
                 return (xs !! (i `mod` (length xs)))

-- | Generate a fixed number of arbitrary values.
vector ::  Arbitrary a => Int -> IO [a]
vector 0 = return []
vector n
    | n > 0  = do x <- arbitrary
                  xs <- vector (n-1)
                  return (x:xs)
    | otherwise = error "impossible"

Arbitrary instances

instance Arbitrary Char where
   arbitrary = do x <- elements [0..255]
                  return (chr x)

instance Arbitrary a => Arbitrary [a] where
   arbitrary = vector 5

instance (Arbitrary a, Arbitrary b) => Arbitrary (Either a b) where
   arbitrary = do b <- randomIO
                  if b
                    then do l <- arbitrary
                            return $ Left l
                    else do r <- arbitrary
                            return $ Right r

Based on such a library, a test engineer can automate the process of generating test inputs for specific types.

Application: Generating arbitrary strings

Consider

genStrings :: IO ()
genStrings = do xs <- arbitrary
                putStrLn xs

and some of the random strings that we might produce.

*Main> genStrings
]Ý
*Main> genStrings
ÃÀ½
*Main> genStrings
öë
*Main> genStrings
BQv«

Property-based testing

Recall.

A test case consists of the following.

Some input value.
Some unit on which we want to apply the input.
Check that for the given input the result matches our expectations.

We have automated the process of generating random inputs. See type-specific test data generation.

How can we automate the process of checking that the result matches our expectations? Idea. Identify some properties that must be satisfied. Think of a property as an assertion.

Here are some properties for the “count” example.

-- | Number of words counted must be greater or equal zero.
prop1 :: String -> Bool
prop1 s = count s >= 0

-- | Reversing the string yields the same number of words.
prop2 :: String -> Bool
prop2 s = count s == count (reverse s)

We can quickly check each property

by generating some (say 100) arbitrary test data, and
apply the property on each test data.

quickCheck :: (Show t, Arbitrary t) => (t -> Bool) -> IO ()
quickCheck prop = go 100
   where go 0 = putStrLn "+++ Ok"
         go n = do x <- arbitrary
                   if prop x
                     then go (n-1)
                     else do putStrLn "*** Failed: "
                             print x

We find that

*Main> quickCheck prop1
+++ Ok
*Main> quickCheck prop2
+++ Ok

Here’s another property.

-- | Concatenating the string doubles the number of words.
prop3 :: String -> Bool
prop3 s = 2 * count s == count (s ++ s)

We find that

*Main> quickCheck prop3
*** Failed:
"\129k\DC4p\130"

Summary

Properties may not hold in general. See prop3.
Satisfying all properties does not mean that our program is bug-free.

The following “count” variant satisfies for all inputs prop1, prop2 and prop3.

count2 :: [String] -> Int
count2 _ = 0

Type-specific test data generation and property-based testing are highly useful methods to catch bugs in your programs. They are not meant to substitute for unit tests. The more diverse testing methods we apply, the better are our changes that we catch bugs.

Complete source code

-- | QuickCheck light.
-- The following is re-implementation of the essential features of QuickCheck.

import System.Random (randomIO)
import Data.Char (chr)

class Arbitrary a where
   arbitrary :: IO a

-- Some generators

-- | Choose one of the elements.
elements :: [a] -> IO a
elements xs = do i <- randomIO :: IO Int
                 return (xs !! (i `mod` (length xs)))

-- | Generate a fixed number of arbitrary values.
vector ::  Arbitrary a => Int -> IO [a]
vector 0 = return []
vector n
    | n > 0  = do x <- arbitrary
                  xs <- vector (n-1)
                  return (x:xs)
    | otherwise = error "impossible"


-- Some Arbitrary instances

instance Arbitrary Char where
   arbitrary = do x <- elements [0..255]
                  return (chr x)

instance Arbitrary a => Arbitrary [a] where
   arbitrary = vector 5

instance (Arbitrary a, Arbitrary b) => Arbitrary (Either a b) where
   arbitrary = do b <- randomIO
                  if b
                    then do l <- arbitrary
                            return $ Left l
                    else do r <- arbitrary
                            return $ Right r


-- | Random generation of strings
genStrings :: IO ()
genStrings = do xs <- arbitrary
                putStrLn xs

-- | Quickly check a property by testing the property
-- against 100 arbitrarily generated test inputs.
quickCheck :: (Show t, Arbitrary t) => (t -> Bool) -> IO ()
quickCheck prop = go 100
   where go 0 = putStrLn "+++ Ok"
         go n = do x <- arbitrary
                   if prop x
                     then go (n-1)
                     else do putStrLn "*** Failed: "
                             print x



-- Property-based testing for count

-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count :: String -> Int
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip :: (Char -> Bool) -> String -> String
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord   = skip (/= ' ')
skipBlanks = skip (== ' ')

-- | Number of words counted must be greater or equal zero.
prop1 :: String -> Bool
prop1 s = count s >= 0

-- | Reversing the string yields the same number of words.
prop2 :: String -> Bool
prop2 s = count s == count (reverse s)


-- | Concatenating the string doubles the number of words.
-- NOTE: This property does not hold in general!
prop3 :: String -> Bool
prop3 s = 2 * count s == count (s ++ s)

Test.QuickCheck

We consider the “count” example where we make use of Test.QuickCheck. Test.QuickCheck is the official QuickCheck library and subsumes the functionality of “QuickCheck light”.

*Main> quickCheck prop1
+++ OK, passed 100 tests.
*Main> quickCheck prop2
+++ OK, passed 100 tests.
*Main> quickCheck prop3
*** Failed! Falsifiable (after 4 tests and 2 shrinks):
"a"

Test.QuickCheck provides many more features compared to our own simple prototype. One feature is to “shrink” failed cases. That is, in case of failure we seek for “smaller” counter-examples.

Instead of quickCheck we can use verboseCheck to let QuickCheck tell us about the test cases applied.

*Main> verboseCheck prop3
Passed:
""
Failed:
"\202"
*** Failed! Passed:
""
Failed:
"a"
Passed:
""
Falsifiable (after 2 tests and 1 shrink):
"a"

Complete source code

-- QuickCheck by example
import Test.QuickCheck


-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count :: String -> Int
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip :: (Char -> Bool) -> String -> String
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord   = skip (/= ' ')
skipBlanks = skip (== ' ')



-- | Number of words counted must be greater or equal zero.
prop1 :: String -> Bool
prop1 s = count s >= 0

-- | Reversing the string yields the same number of words.
prop2 :: String -> Bool
prop2 s = count s == count (reverse s)


-- | Concatenating the string doubles the number of words.
-- NOTE: This property does not hold in general!
prop3 :: String -> Bool
prop3 s = 2 * count s == count (s ++ s)

quickCheckN n = quickCheckWith $ stdArgs { maxSuccess = n }

verboseCheckN n = verboseCheckWith $ stdArgs { maxSuccess = n }

Test.QuickCheck - Customizing test data generation

Test.QuickCheck already provides Arbitrary instances for String.

For use cases such as “count”, we wish to customize generation of strings.

As we cannot override (or replace) existing Arbitrary instances we introduce an algebraic data type to represent “my strings”.

data MyString = MkStr String deriving (Show, Eq)

We now can define our own Arbitrary instances.

-- Maximum of 4 blanks.
-- Maximum of 10 words.
-- Words consist of upper/lower-case letters only.
instance Arbitrary MyString where
   arbitrary =
      do n <- elements [1..10]
         ws <- vectorOf n word
         ws2 <- mapM (\w -> do b1 <- elements [0..2]
                               b2 <- elements [0..2]
                               return $ blanks b1 ++ w ++ blanks b2) ws
         return $ MkStr $ concat ws2
         where
           blanks n = take n (repeat ' ')
           letter = elements $ ['a'..'z'] ++ ['A'..'Z']
           word = do n <- elements [1..5]
                     vectorOf n letter

From MyString to String and back

quickCheck generates random inputs based on the type of propositions provided. Hence, the type signatures of “count” and its properties needs to be rephrased in terms of MyString.

MyString-String conversions

toString :: MyString -> String
toString  (MkStr xs) = xs

fromString :: String -> MyString
fromString = MkStr

MyString wrappers

reverseMyString = fromString . reverse . toString

appendMyString x y = fromString $ (toString x) ++ (toString y)

countMyString :: MyString -> Int
countMyString = count . toString

Properties 1-3 stated in terms of MyString.

-- | Number of words counted must be greater or equal zero.
prop1_MyString :: MyString -> Bool
prop1_MyString s = countMyString s >= 0

-- | Reversing the string yields the same number of words.
prop2_MyString :: MyString -> Bool
prop2_MyString s = countMyString s == countMyString (reverseMyString s)


-- | Concatenating the string doubles the number of words.
prop3_MyString :: MyString -> Bool
prop3_MyString s = 2 * countMyString s == countMyString (s `appendMyString` s)

Complete source code


-- | Application.
-- Customize generation of inputs for the "count" example.

import Test.QuickCheck


-- | Counting words.
-- Word = Sequence of characters not separated by white spaces (blanks).
count :: String -> Int
count [] = 0
count (c:cs)
  | c == ' ' = count $ skipBlanks cs
  | otherwise = 1 + count (skipWord cs)

-- | Generic skip function.
skip :: (Char -> Bool) -> String -> String
skip p [] = []
skip p (c:cs)
 | p c       = skip p cs
 | otherwise = c:cs

skipWord   = skip (/= ' ')
skipBlanks = skip (== ' ')



-- | Number of words counted must be greater or equal zero.
prop1 :: String -> Bool
prop1 s = count s >= 0

-- | Reversing the string yields the same number of words.
prop2 :: String -> Bool
prop2 s = count s == count (reverse s)


-- | Concatenating the string doubles the number of words.
-- NOTE: This property does not hold in general!
prop3 :: String -> Bool
prop3 s = 2 * count s == count (s ++ s)

quickCheckN n = quickCheckWith $ stdArgs { maxSuccess = n }

verboseCheckN n = verboseCheckWith $ stdArgs { maxSuccess = n }

-- | Algebraic data type to represent "my strings".

data MyString = MkStr String deriving (Show, Eq)

-- Maximum of 4 blanks.
-- Maximum of 10 words.
-- Words consist of upper/lower-case letters only.
instance Arbitrary MyString where
   arbitrary =
      do n <- elements [1..10]
         ws <- vectorOf n word
         ws2 <- mapM (\w -> do b1 <- elements [0..2]
                               b2 <- elements [0..2]
                               return $ blanks b1 ++ w ++ blanks b2) ws
         return $ MkStr $ concat ws2
         where
           blanks n = take n (repeat ' ')
           letter = elements $ ['a'..'z'] ++ ['A'..'Z']
           word = do n <- elements [1..5]
                     vectorOf n letter

-- | Conversions to/from MyString-String.
toString :: MyString -> String
toString  (MkStr xs) = xs

fromString :: String -> MyString
fromString = MkStr

-- | MyString wrappers for list functions
reverseMyString = fromString . reverse . toString

appendMyString x y = fromString $ (toString x) ++ (toString y)

countMyString :: MyString -> Int
countMyString = count . toString

-- | Number of words counted must be greater or equal zero.
prop1_MyString :: MyString -> Bool
prop1_MyString s = countMyString s >= 0

-- | Reversing the string yields the same number of words.
prop2_MyString :: MyString -> Bool
prop2_MyString s = countMyString s == countMyString (reverseMyString s)


-- | Concatenating the string doubles the number of words.
-- NOTE: This property does not hold in general!
prop3_MyString :: MyString -> Bool
prop3_MyString s = 2 * countMyString s == countMyString (s `appendMyString` s)

Test.QuickCheck - Test data generation for trees

We consider another example where we use Test.QuickCheck for the purpose of generating test data. We consider the arbitrary generation of binary trees of values of type Int.


import Test.QuickCheck

-- QuickCheck for test data generation


-- Binary trees
data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Show, Eq)

-- Arbitrary instance
instance Arbitrary a => Arbitrary (Tree a) where
    arbitrary = do left <- arbitrary
                   right <- arbitrary
                   x <- arbitrary
                   r <- elements [Leaf x, Node left right]
                   return r


{-

 We generate trees of Ints.
 Hence, the type annotation for genTree.

 We make use of the following Test.QuickCheck functions

vectorOf :: Int -> Gen a -> Gen [a]

generate :: Gen a -> IO a

vectorOf generates arbitrary values for a certain number.

Test.QuickCheck runs in its own monad (Gen).

Via generate, we can receive the Test.QuickCheck generated value within the IO monad.

-}

genTrees :: Int -> IO ()
genTrees n = do ts <- generate $ vectorOf n genTree
                print ts
             where
                 genTree :: Gen (Tree Int)
                 genTree = arbitrary

Test.QuickCheck - Regular expresssions

Check that our simplifyer is correct.

prop :: (String, R) -> Bool
prop (xs,r) = membership xs r == membership xs (simp r)

Generate arbitrary regular expressions.

instance Arbitrary R where
  arbitrary =
    do l <- arbitrary
       r <- arbitrary
       x <- arbitrary
       r <- elements (nTimes 2 (Sym x)
                     ++ nTimes 1 (Epsilon)
                     ++ nTimes 1 (Star r)
                     ++ nTimes 3 (Conc r l)
                     ++ nTimes 3 (Alt r l))
       return r

    where
       nTimes n x = take 4 (repeat x)

Complete source code


import Test.QuickCheck

{-

EBNF Syntax of regular expressions and words

 R ::= 'x' | 'y' | ...              alphabet symbols
     | epsilon                      represents the empty string
     | phi                          represents the empty language
     | R + R                        alternatives
     | R . R                        concatenation
     | R*                           Kleene star
     | (R)                          Grouping via parentheses


 W ::= epsilon | 'x' | 'y' | ...
   | W W

-}

-- Data type to represent regular expressions.
data R = Phi
       | Epsilon
       | Sym Char
       | Alt R R
       | Conc R R
       | Star R deriving Eq  -- automatically derive equality among regular expressions

{-
-- The following instance can be derived automatically.
instance Eq R where
   (==) Phi Phi = True
   (==) Epsilon Epsilon = True
   (==) (Sym x) (Sym y) = x == y
   (==) (Alt r1 r2) (Alt s1 s2) = r1 == s1 && r2 == s2
   (==) (Conc r1 r2) (Conc s1 s2) = r1 == s1 && r2 == s2
   (==) (Star r) (Star s) = r == s
-}

-- Show instance for regular expressions
instance Show R where
  show Phi = "PHI"
  show Epsilon = ""
  show (Sym x) = [x]
  show (Alt r s) = "(" ++ show r ++  "+" ++ show s ++ ")"
  show (Conc r s) = "(" ++ show r ++  "." ++ show s ++ ")"
  show (Star s) = "(" ++ show s ++ ")" ++ "*"


-- some regular expressions plus short-hands
x = Sym 'x'
y = Sym 'y'

r1 = Star x
r2 = Star $ Alt x y
r3 = Conc y r2
r4 = Alt (Conc Epsilon r1) r1

-- Epsilon or r.
opt :: R -> R
opt r = Alt r Epsilon

-- One or more iterations.
kleenePlus :: R -> R
kleenePlus r = Conc r (Star r)


-----------------------------------------------------------------------
-- Writing functions that operate on data types via pattern matching.


-- Yield True if epsilon is part of the language, otherwise, we find False.
nullable :: R -> Bool
nullable Phi = False
nullable Epsilon = True
nullable Sym{} = False
-- spelled out as the following.
-- nullable (Sym _) = False
nullable (Alt r s) = nullable r || nullable s
nullable (Conc r s) = nullable r && nullable s
nullable Star{} = True

-- Simplifying expressions:
-- epsilon . R = R
-- R + R = R
--
-- We build a "fixpoint"!
-- We apply helper simp2 until no further simplifications can be applied.
-- Why is this necessary? Hint. Consider example r4.
simp :: R -> R
simp r = let s = simp2 r
         in if s == r then r
            else simp s
     where
        simp2 (Conc Epsilon r) = simp2 r
        simp2 (Conc r s) = Conc (simp2 r) (simp2 s)
        simp2 (Alt r s)
           | r == s    = simp2 r
           | otherwise = Alt (simp2 r) (simp2 s)
        simp2 (Star r) = Star $ simp2 r
        simp2 r = r


------------------
-- https://en.wikipedia.org/wiki/Brzozowski_derivative


deriv :: Char -> R -> R
deriv _ Epsilon = Phi
deriv _ Phi = Phi
deriv x (Sym y)
  | x == y    = Epsilon
  | otherwise = Phi
deriv x (Alt r s) = Alt (deriv x r) (deriv x s)
deriv x (Conc r s)
  | nullable r = Alt (Conc (deriv x r) s) (deriv x s)
  | otherwise =  Conc (deriv x r) s
deriv x (Star r) = Conc (deriv x r) (Star r)


membership :: String -> R -> Bool
membership xs r =
          go r xs
             where
               go r [] = nullable r
               go r (x:xs) = go (deriv x r) xs


xs = Star (Sym 'x')

test0 = membership "xxxx" xs

aOrBs = Star (Alt (Sym 'a') (Sym 'b'))


test1 = membership "abba" aOrBs

test2 = membership "abbca" aOrBs



-----------------------------
-- Arbitrary regular expresssions


instance Arbitrary R where
  arbitrary =
    do l <- arbitrary
       r <- arbitrary
       x <- arbitrary
       r <- elements (nTimes 2 (Sym x)
                     ++ nTimes 1 (Epsilon)
                     ++ nTimes 1 (Star r)
                     ++ nTimes 3 (Conc r l)
                     ++ nTimes 3 (Alt r l))
       return r

    where
        nTimes n x = take 4 (repeat x)


-----------------------------
-- Properties


prop :: (String, R) -> Bool
prop (xs,r) = membership xs r == membership xs (simp r)



test = quickCheck prop

QuickCheck: Automatic Testing

Overview

Motivating example - counting words

User tests

Systematic testing

Some test cases

Observations

Complete source code

QuickCheck - The idea

Properties

Test data generation

Type-specific test data generation

Generating arbitrary values for specific types

The Arbitrary type class

Generators

Arbitrary instances

Application: Generating arbitrary strings

Property-based testing

Summary

Complete source code

Test.QuickCheck

Complete source code

Test.QuickCheck - Customizing test data generation

From MyString to String and back

MyString-String conversions

MyString wrappers

Complete source code

Test.QuickCheck - Test data generation for trees

Test.QuickCheck - Regular expresssions

Complete source code

Summary