Software Transactions

Martin Sulzmann

The problem with locks

Consider the following code snippet.

transfer (from Account, to Account, amount int) {
    if (from.balance() >= amount && amount < Max) {
      from.withdraw(amount)
      to.deposit(amount)
    }
} 

There is a potential data race. Two concurrent transfers may lead to inconsistent data. We need to guarantee that withdraw and deposit are executed atomically.

Locks to the rescue

transfer (from Account, to Account, amount int) {
 lock()
    if (from.balance() >= amount && amount < Max) {
      from.withdraw(amount)
      to.deposit(amount)
    }
  unlock()
} 

All good? The issue is that any concurrent transfer operations will be executed sequentially, even if the accounts involved differ!

Fine-grained locking

Instead of a global lock for all accounts, we introduce a lock for each account.

transfer (from Account, to Account, amount int) {
    if (from.balance() >= amount && amount < Max) {
      from.lock()
      to.lock()
      from.withdraw(amount)
      to.deposit(amount)
      to.unlock() 
      from.unlock()      
    }
} 

So, two concurrent transfers operating on different accounts can possibly be executed in parallel. However, there's a subtle (deadlock) problem. Can you see the problem?

Consider the following situation.

go transfer(from,to,50) // T1

transfer(to,from,50) // T2

There's a cycle!

Locking order

We impose a strict order in which fine-grained locks are acquired.

transfer (from Account, to Account, amount int) {
    if (from.balance() >= amount && amount < Max) {
      if from.lock.order() > to.lock.order() {
           from.lock()
           to.lock()
           from.withdraw(amount)
           to.deposit(amount)
           to.unlock()       
           from.unlock()
      } else {
           to.lock()
           from.lock()
           from.withdraw(amount)
           to.deposit(amount)
           from.unlock()       
           to.unlock()
      }
    }
} 

Phew! Problem solved. But it gets really tricky. Too much to handle for mere mortals.

There's yet another issue.

Fine-grained locks are not composable.

Software transactional memory (STM)

Idea:

transfer (from Account, to Account, amount int) {
  atomically {
    if (from.balance() >= amount && amount < Max) {
      from.withdraw(amount)
      to.deposit(amount)
    }
  }
} 

The cool thing about STM is that it's damn easy to use. Like garbage collection (GC) is damn easy to use. It just works!

So, are all typicall concurrency problems solved by using STM?

STM implementations

Pessimistic Approach

"Stop the world" by simply using a global lock! See our bank transfer example.

In general this is bad because all concurrent transactions will be forced to execute sequentially.

Optimistic approach

  1. Use a transaction log to record reading and writing of shared variables.

    • Upon first read, a variable is copied into the transactional log. Any further reads/writes operate on that copy.
  2. Validate that reads are still the same.

    • Compare the copy in the transaction log with the copy in the store. If there are differences perform a rollback.
  3. Commit any writes from the transaction log to the memory store.

The transaction log guarantees that two concurrent transactions can operate in isolation. If there are no interferences both transactions can be executed in parallel.

For consistency, it's critical that the validation and commit step do not lead to interferences with other concurrent validates and commits. There is quite a bit of design space how this in detail is dealt with.

A possible (fairly optimistic) approach is

Possible improvements are:

Lazy verus eager versioning

There are two strategies to perform the 'commit' step.

STM short summary

// Example 1
atomically {
  ...
  fireMissile();
}

What happens in case of a rollback? All modifications to (transactional) variables are rolled back. This is possible because updates only took place in the transactional log. What about fireMissile()? Such side-effecting operations will be carried out immediately! In case of several rollbacks, we will call fireMissile() multiple times!

//thread 1
atomically {
 ...
 x = x + 1
 ...
}

// thread 2
y = x

// thread 3
x++

The questions here are as follows. Shall transactions be atomic with respect to other transactions and non-transactional code? Or shall transactions only be atomic with respect to other transactions? The first choice is referred to as strong atomicity whereas the second choice is referred to as weak atomicity.

Next, we will take a look at STM in Haskell. The good news is that by construction there cannot be any "conflicts" between transactional and non-transactional code.

STM in Haskell

In-place update in Haskell

C versus Haskell.

int x = 1;
x = x + 1;
x <- newIORef (1::Int)
v <- readIORef x
writeIORef x (v+1)

Coding in-place updates in Haskell feels like Assembler (three-address code).

True. It would be nice to have some better syntactic sugar (sometimes possible via EDSLs).

However, the advantage of the Haskell versus the C version is that we can make precise statements about the side effects that take place.

Here is the in-place update interface in Haskell.

newIORef :: a -> IO (IORef a)
readIORef :: IORef a -> IO a
writeIORef :: IORef a -> a -> IO ()

Haskell STM interface

module Control.Concurrent.STM
data STM a
data TVar a
newTVarIO  :: a -> IO (TVar a)
newTVar :: a -> STM (TVar s)
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
atomically :: STM a -> IO a
retry      :: STM a
orElse :: STM a-> STM a-> STM a 
fireMissile :: IO ()


someCode :: IO ()
 atomically $
    do ...
       fireMissile
       ...

Bank transfer revisited

transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer fromAcc toAcc amount =
 atomically $ do
   f <- readTVar fromAcc
   if f <= amount
     then return ()
     else do
       writeTVar fromAcc (f - amount)
       t <- readTVar toAcc
       writeTVar toAcc (t + amount)

STM Transactions are composable

Consider the following.

transferSTM :: TVar Int -> TVar Int -> Int -> STM ()
transferSTM fromAcc toAcc amount = do
   f <- readTVar fromAcc
   if f <= amount
     then return ()
     else do
       writeTVar fromAcc (f - amount)
       t <- readTVar toAcc
       writeTVar toAcc (t + amount)

transferAtomicFourAccounts :: TVar Int -> TVar Int -> Int 
                            -> TVar Int -> TVar Int -> Int -> IO ()
transferAtomicFourAccounts from1 to1 a1 from2 to2 a2 = do
 atomically $ do
   transferSTM from1 to1 a1
   transferSTM from2 to2 a2

STM retry and orElse

Bank account - retry

transfer :: TVar Int -> TVar Int -> Int -> IO ()
transfer fromAcc toAcc amount =
 atomically $ do
   f <- readTVar fromAcc
   if f <= amount
     then retry
     else do
       writeTVar fromAcc (f - amount)
       t <- readTVar toAcc
       writeTVar toAcc (t + amount)

Bank account - orElse

transfer2 :: TVar Int -> TVar Int -> Int -> STM ()
transfer2 fromAcc toAcc amount = 
 (do f <- readTVar fromAcc
     if f <= amount
      then retry
      else do
       writeTVar fromAcc (f - amount)
       t <- readTVar toAcc
       writeTVar toAcc (t + amount) )
 `orElse`
    transfer2 fromAcc toAcc (amount-50) 

Semaphore via STM

type Sem = TVar Int

newSem :: Int -> IO Sem
newSem n = newTVarIO n

p :: Sem -> STM ()
p sem = do n <- readTVar sem
           if n > 0
             then writeTVar sem (n-1)
             else retry

v :: Sem -> STM ()
v sem = do n <- readTVar sem
           writeTVar sem (n+1)