One of the things I recently came across in F# is the MailboxProcessor. It piqued my interest, and I wanted to take some time to blog about it here.
The Problem With Mutable State
Almost all of our programs and applications contain some type of state. In a multithreaded environment, sharing mutable state between threads can lead to problems.
For a simple example, let's count from 0 to 1000 in parallel.
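The listing itself isn't reproduced here, so what follows is a reconstruction rather than the original code. It is a minimal sketch that assumes a module-level mutable counter and an async named updateCounterAsync (the name used later in this post), laid out so the line references in the walkthrough should roughly line up.

```fsharp
let mutable counter = 0                  // shared mutable state
let updateCounterAsync =                 // one unprotected increment
    async {
        counter <- counter + 1           // read, add, write: not atomic
    }

// replicate the operation, run the copies in parallel, block until done
List.replicate 1000 updateCounterAsync
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

printfn "%d" counter
```

The printed value can differ from run to run, which is the point of the example.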
We initialize a counter (line 1), create an async operation to update that counter (lines 2-5), then replicate it 1000 times (line 8), run in parallel (line 9), wait for all to finish (line 10) then print the value (line 13).
Care to guess what the output is? If you’re familiar with concurrency issues, the answer is ???. I ran this code three times and got three different values: 894, 916, and 880.
Why is that? We have 1000 replications of updateCounterAsync. Since the counter update isn’t protected, some are reading counter while others are writing it. Two updates can read the same old value, and each then writes back that value plus one, so one of the updates gets overwritten.
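To make the lost update concrete, here is a minimal single-threaded sketch that replays one unlucky interleaving by hand. The names readA and readB are hypothetical; they stand in for the values two threads would each read before either one writes.

```fsharp
// Replays one bad interleaving of two unprotected increments, step by step.
let mutable counter = 0

let readA = counter        // "thread A" reads 0
let readB = counter        // "thread B" also reads 0, before A has written
counter <- readA + 1       // A writes 1
counter <- readB + 1       // B writes 1 as well, overwriting A's update

printfn "%d" counter       // prints 1, not 2
```

Two increments ran, but the counter only moved by one. Multiply that by a few hundred collisions and you get totals like the ones above.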
That’s a problem. Here are two ways to correct it.
One way to fix this is to introduce a lock, which is what I naturally gravitate towards.
Locks work by blocking access for everyone else, which ensures that only one thread at a time is updating counter.
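As a rough sketch of the locked version, assuming the same counter and updateCounterAsync as before: F#'s built-in lock function takes a reference-type object and a function to run while holding the lock. The lockObj name is hypothetical.

```fsharp
let lockObj = obj ()                     // dedicated object to lock on
let mutable counter = 0

let updateCounterAsync =
    async {
        // only one thread at a time can run the function passed to lock
        lock lockObj (fun () -> counter <- counter + 1)
    }

// replicate the operation, run the copies in parallel, block until done
List.replicate 1000 updateCounterAsync
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

printfn "%d" counter
```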
Now that the counter update is locked, running this always outputs 1000, which is the answer that we’re looking for.
Locking is great, but since this is an article about MailboxProcessor, let’s demonstrate that.
The MailboxProcessor handles concurrency a different way. It is a dedicated message queue running on its own logical thread of control.
MailboxProcessor deals with messages. Messages that are sent to it are queued until MailboxProcessor is ready to process them. Because they are processed one at a time, in the order they arrive, they won’t overwrite each other.
Here is what a counter would look like using the MailboxProcessor.
We have two different messages, Incr and Fetch. The MailboxProcessor runs a function recursively, and the first thing it does when it enters the function is receive a message. If it receives an Incr message, it adds one to n and calls itself again. If it receives a Fetch message, it sends back the current value of the counter.
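The message loop described above can be sketched roughly like this. The type name CounterMessage and the value name counterAgent are hypothetical, and I'm assuming the loop keeps running after a Fetch, which the description doesn't spell out.

```fsharp
// The two messages; Fetch carries a channel for sending the reply back.
type CounterMessage =
    | Incr
    | Fetch of AsyncReplyChannel<int>

let counterAgent =
    MailboxProcessor<CounterMessage>.Start(fun inbox ->
        // n is the current count, passed along on each recursive call
        let rec loop n =
            async {
                let! msg = inbox.Receive()      // wait for the next message
                match msg with
                | Incr -> return! loop (n + 1)
                | Fetch reply ->
                    reply.Reply n               // send back the current value
                    return! loop n              // assumption: keep serving
            }
        loop 0)

// Messages are handled in the order they were posted.
counterAgent.Post Incr
counterAgent.Post Incr
let value = counterAgent.PostAndReply(fun reply -> Fetch reply)
printfn "%d" value                              // prints 2
```

Notice there is no mutable variable anywhere. The state lives in the argument n of the recursive function.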
We want to hide the internal details, so interaction with our MailboxProcessor is controlled by two methods, one for each message. Notice there’s a timeout on Fetch(), so it’s not guaranteed to always return the value. That adds complexity, and it’s one reason not to use MailboxProcessor.
Since we’re dealing with messages now, we also need some slight updates to our calling code.
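Putting the wrapper and the calling code together, a sketch might look like the following. Several things here are my assumptions rather than the original listing: the type names CounterMessage and Counter, the method name Incr(), the one-second timeout, and the choice of TryPostAndReply, which returns None on timeout instead of throwing.

```fsharp
type CounterMessage =
    | Incr
    | Fetch of AsyncReplyChannel<int>

type Counter() =
    // the agent is private, so callers only see Incr() and Fetch()
    let agent =
        MailboxProcessor<CounterMessage>.Start(fun inbox ->
            let rec loop n =
                async {
                    let! msg = inbox.Receive()
                    match msg with
                    | Incr -> return! loop (n + 1)
                    | Fetch reply ->
                        reply.Reply n
                        return! loop n
                }
            loop 0)

    // fire and forget: queue an increment
    member this.Incr() = agent.Post Incr

    // ask for the value; None if no reply arrives within 1000 ms (assumed)
    member this.Fetch() : int option =
        agent.TryPostAndReply((fun reply -> Fetch reply), timeout = 1000)

let counter = Counter()
let updateCounterAsync = async { counter.Incr() }

// replicate the operation, run the copies in parallel, block until done
List.replicate 1000 updateCounterAsync
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

let result = counter.Fetch()
match result with
| Some n -> printfn "%d" n
| None -> printfn "Fetch timed out"
```

The caller now has to handle the case where Fetch() comes back empty, which is the extra complexity mentioned above.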
Which technique is better? Personally, I would choose the locking approach, mainly because of my experience with other languages. This is my first time exploring MailboxProcessor, so maybe my mind will change in the future if I continue to use it.