Haskell/Understanding monads/State

If you have programmed in any other language before, you likely wrote some functions that "kept state". For those new to the concept, a state is one or more variables that are required to perform some computation but are not among the arguments of the relevant function. Object-oriented languages like C++ make extensive use of state variables (in the form of member variables inside classes and objects). Procedural languages like C on the other hand typically use global variables declared outside the current scope or static variables in the functions to keep track of state.

In Haskell, however, such techniques are not as straightforward to apply. Doing so will require mutable variables which would mean that functions will have hidden dependencies, which is at odds with Haskell's functional purity. Fortunately, often it is possible to keep track of state in a functionally pure way. We do so by passing the state information from one function to the next, thus making the hidden dependencies explicit.

The State type is designed to simplify this process of threading state through functions. In this chapter, we will see how it can assist us in some typical problems involving state: modelling a state machine and generating pseudo-random numbers.

State Machine
We will model a simple finite-state machine based on a coin-operated turnstile. Our model will be enhanced so that, in any state, it will create an output (in addition to a state transition) for each input.

Turnstile Example
The finite-state model of our turnstile is shown in this state-transition diagram:



The turnstile has two states: Locked and Unlocked. (It starts in a Locked state). There are two types of input: Coin (corresponding to someone putting a coin in the slot) and Push (corresponding to someone pushing the arm). Each input causes an output (Thank, Open or Tut) and a transition to a (new, or maybe the same) state. If someone puts a coin in when the turnstile is locked, they are thanked (yes, it can talk!) and the turnstile becomes unlocked. If they add more coins, they are thanked more but get no benefit (the turnstile simply remains unlocked with no memory of how many additional coins have been added). If someone pushes the arm when the turnstile is unlocked, the arm will open to let them through, then become locked to prevent anyone else going through. If someone pushes the arm when the turnstile is locked, it will politely tut at them but not let them through and remain locked.

Basic Model in Haskell
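We will represent the states and outputs as two simple data types: TurnstileState, with constructors Locked and Unlocked, and TurnstileOutput, with constructors Thank, Open and Tut.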

But what about the inputs? We can model them as functions. Here's a first attempt:
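A minimal sketch of such a first attempt, assuming the type names TurnstileState and TurnstileOutput and the function names coin and push for the states, outputs and inputs described above:

```haskell
-- The two data types described above; Eq and Show are derived so that
-- values can be compared and printed.
data TurnstileState = Locked | Unlocked
  deriving (Eq, Show)

data TurnstileOutput = Thank | Open | Tut
  deriving (Eq, Show)

-- First attempt: each input maps the current state to an output only.
coin, push :: TurnstileState -> TurnstileOutput

-- A coin is always acknowledged, whatever the current state.
coin _ = Thank

-- A push opens the arm only if the turnstile is unlocked.
push Unlocked = Open
push Locked   = Tut
```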

These correctly return the output for each input in any state, but don't give any indication of the new state. (In an imperative program, these "functions" might also update a variable to indicate the new state, but that is not an option in Haskell, nor, we claim, desirable). The answer is easy and obvious: return the new state along with the output:
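Under the same assumed names, returning the new state alongside the output could look like this:

```haskell
data TurnstileState = Locked | Unlocked
  deriving (Eq, Show)

data TurnstileOutput = Thank | Open | Tut
  deriving (Eq, Show)

-- Each input now returns the output paired with the new state.
coin, push :: TurnstileState -> (TurnstileOutput, TurnstileState)

-- A coin always unlocks the turnstile (or leaves it unlocked).
coin _ = (Thank, Unlocked)

-- A push always leaves the turnstile locked; only the output differs.
push Locked   = (Tut,  Locked)
push Unlocked = (Open, Locked)
```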

Sequencing Steps
How can we use this? One way is to list the set of outputs resulting from a sequence of inputs:
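As an illustration, the sketch below threads the state by hand through five steps. The name monday and the particular sequence of inputs are assumptions made for the example:

```haskell
data TurnstileState = Locked | Unlocked
  deriving (Eq, Show)

data TurnstileOutput = Thank | Open | Tut
  deriving (Eq, Show)

coin, push :: TurnstileState -> (TurnstileOutput, TurnstileState)
coin _ = (Thank, Unlocked)
push Locked   = (Tut,  Locked)
push Unlocked = (Open, Locked)

-- Five inputs in a row; each step receives the state left by the previous one.
monday :: TurnstileState -> ([TurnstileOutput], TurnstileState)
monday s0 =
  let (a1, s1) = coin s0
      (a2, s2) = push s1
      (a3, s3) = push s2
      (a4, s4) = coin s3
      (a5, s5) = push s4
  in  ([a1, a2, a3, a4, a5], s5)
```

Starting from a locked turnstile, monday Locked evaluates to ([Thank,Open,Tut,Thank,Open],Locked).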

Note that, like coin and push, this monday function takes an initial state as a parameter and returns the final state alongside the list of outputs.

From the examples it can be seen that:
 * the state (of some fixed type) is always passed from step to step, and (usually) included in the input to and output from a function: having the state as input to and output from functions allows us to chain them together as steps in bigger functions, in the same way the steps within these smaller functions are chained together;
 * the (non-state) output of a step may or may not be used in deciding subsequent steps: a function such as hastyPerson (push first, and insert a coin only if that push was refused) uses the output of the first push to determine whether they need to insert a coin, but regularPerson (insert a coin, then push) always does the same two steps regardless of their outputs;
 * the (non-state) output of a step may or may not be used in the final (non-state) return value from a function: the return value from monday uses the output from each step, but the return value from luckyPair (which reports only whether the second of two people gets through) doesn't depend on the output from the first step;
 * a function may take parameters in addition to the initial state: luckyPair takes a Bool parameter;
 * the types of the (non-state) return values can be different for different functions: coin returns TurnstileOutput, monday returns [TurnstileOutput] and luckyPair returns Bool.

But all of this code is cumbersome, tedious to write and error prone. It would be ideal if we could automate the extraction of the second member of the tuple (i.e. the new state) and feed it to the next step, whilst still allowing the function to use the (non-state) values to decide on further steps and to build its own (non-state) result. This is where State comes into the picture.

Introducing State
The Haskell type State describes functions that consume a state and produce both a result and an updated state, which are given back in a tuple.

The state function is wrapped by a data type definition which comes along with a runState accessor so that pattern matching becomes unnecessary. For our current purposes, the State type might be defined as:
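A definition along the following lines matches that description; tick is a hypothetical processor, added only to show runState in use:

```haskell
-- A state processor: a wrapped function from a state to a result and a new state.
newtype State s a = State { runState :: s -> (a, s) }

-- Hypothetical example: return the current counter and increment it.
tick :: State Int Int
tick = State (\n -> (n, n + 1))
```

For instance, runState tick 5 unwraps the function and applies it to 5, giving (5,6).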

Here, s is the type of the state, and a the type of the produced result. Calling the type State is arguably a bit of a misnomer because the wrapped value is not the state itself but a state processor.

newtype
Note that we defined the data type with the newtype keyword, rather than the usual data. newtype can be used only for types with just one constructor and just one field. It ensures that the trivial wrapping and unwrapping of the single field is eliminated by the compiler. For that reason, simple wrapper types such as State are usually defined with newtype. Would defining a synonym with type be enough in such cases? Not really, because type does not allow us to define instances for the new data type, which is what we are about to do...

Where did the constructor go?
In the wild, the State type is provided by Control.Monad.Trans.State in transformers (and is also reexported by Control.Monad.State in mtl). When you use it, you will quickly notice there is no State constructor available. The transformers package implements the State type in a different way. The differences do not affect how we use or understand State, except that instead of a State constructor, Control.Monad.Trans.State exports a state function, of type (s -> (a, s)) -> State s a, which does the same job. As for why the actual implementation is not the obvious one we presented above, we will get back to that a few chapters down the road.

Instantiating the Monad
So far, all we have done is wrap a function type and give it a name. There is another ingredient, however: for every type s, State s can be made a Monad instance, giving us very handy ways of using it.

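To define a Monad instance, there must also be instances for Functor and Applicative. As we explained previously, these superclass instances can be derived from the Monad instance that we are about to define in more detail: fmap can be defined as liftM, pure as return, and (<*>) as ap (liftM and ap come from Control.Monad).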

In a later section we will discuss the implications of State also being a Functor and an Applicative in more detail. You will also get a chance to reimplement the above explicitly based on their behaviour, without simply deferring to the Monad instance.

So let's define this instance.

Note the instance is State s, and not just State on its own; State can't be made an instance of Monad, as it takes two type parameters, rather than one. That means there are actually many different State monads, one for each possible type of state: State String, State Int, State SomeLargeDataStructure, and so forth. However, we only need to write one implementation of return and (>>=); the methods will be able to deal with all choices of s.

The return function is implemented as return x = state (\s -> (x, s)). Giving a value x to return produces a function which takes a state s and returns it unchanged, together with the value we want to be returned. As a finishing step, the function is wrapped up with the state function.

As for binding, it can be defined like this:

We wrote the definition above in a quite verbose way, to make the steps involved easier to pinpoint. A more compact way of writing it would be:
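Both forms can be sketched in one self-contained module. The sketch assumes the State newtype described earlier, with a local state function standing in for the library one. Because return defaults to pure in current GHC, the unit definition is placed in the Applicative instance, and the compact form is given as a separate function under the hypothetical name bindCompact so that the two can be compared:

```haskell
import Control.Monad (ap, liftM)

newtype State s a = State { runState :: s -> (a, s) }

-- Local stand-in for the state function exported by the real libraries.
state :: (s -> (a, s)) -> State s a
state = State

-- Superclass instances, derived from the Monad instance below.
instance Functor (State s) where
  fmap = liftM

instance Applicative (State s) where
  pure x = state (\s -> (x, s))   -- leave the state unchanged, return x
  (<*>)  = ap

instance Monad (State s) where
  -- Verbose version, naming every intermediate piece.
  p >>= k = q
    where
      p' = runState p             -- p' :: s -> (a, s)
      k' = runState . k           -- k' :: a -> s -> (b, s)
      q' s0 = (y, s2)             -- q' :: s -> (b, s)
        where
          (x, s1) = p' s0         -- run the first processor on the initial state
          (y, s2) = k' x s1       -- run the second one on the intermediate state
      q = state q'

-- Compact version with the same behaviour, as a standalone function.
bindCompact :: State s a -> (a -> State s b) -> State s b
bindCompact p k = state $ \s0 ->
  let (x, s1) = runState p s0     -- running the first processor on s0
  in  runState (k x) s1           -- running the second processor on s1
```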

(>>=) is given a state processor and a function k that is used to create another processor from the result of the first one. The two processors are combined into a function that takes the initial state and returns the second result and the third state (i.e. the output of the second processor). Overall, (>>=) here allows us to run two state processors in sequence, while allowing the result of the first stage to influence what happens in the second one.



One detail in the implementation is how runState is used to undo the State wrapping, so that we can reach the function that will be applied to the states. The type of runState p, for instance, is s -> (a, s).

Understanding the Bind Operator
Another way to understand this derivation of the bind operator (>>=) is to consider once more the explicit but cumbersome way to simulate a stateful function of type a -> b by using functions of type (a, s) -> (b, s), or, said another way: a -> s -> (b, s) = a -> (s -> (b, s)). These classes of functions pass the state on from function to function. Note that this last signature already suggests the right-hand side type in a bind operation where the abstract type is S b = (s -> (b, s)).

Now that we have seen how the types seem to suggest the monadic signatures, let's consider a much more concrete question: given two functions f :: s -> (a, s) and g :: a -> (s -> (b, s)), how do we chain them to produce a new function that passes on the intermediate state?

This question does not require thinking about monads: one option is to simply use function composition. It helps our exposition if we just write it down explicitly as a lambda expression:
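A sketch of that lambda expression, with compose as an assumed name:

```haskell
-- Chain two state-passing functions, threading the intermediate state.
compose :: (s -> (a, s))        -- first function
        -> (a -> s -> (b, s))   -- second function, may depend on the first result
        -> (s -> (b, s))        -- composed function
compose f g = \s0 ->
  let (a1, s1) = f s0           -- run the first step on the initial state
  in  g a1 s1                   -- feed its result and state to the second step
```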

Now, if in addition to chaining the input functions, we find that the functions of signature s -> (a, s) were all wrapped in an abstract datatype Wrapped a, and that therefore we need to call some other provided functions wrap :: (s -> (a, s)) -> Wrapped a and unwrap :: Wrapped a -> (s -> (a, s)) in order to get to the inner function, then the code changes slightly:
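A sketch of the wrapped variant follows. Wrapped, wrap, unwrap and composeWrapped are illustrative names, and the state type s is written as an explicit type parameter so that the snippet compiles on its own:

```haskell
-- Illustrative wrapper around a state-passing function.
newtype Wrapped s a = Wrapped (s -> (a, s))

wrap :: (s -> (a, s)) -> Wrapped s a
wrap = Wrapped

unwrap :: Wrapped s a -> (s -> (a, s))
unwrap (Wrapped f) = f

-- The same composition as before, with unwrapping on the way in
-- and wrapping on the way out.
composeWrapped :: Wrapped s a -> (a -> Wrapped s b) -> Wrapped s b
composeWrapped wrappedF g = wrap $ \s0 ->
  let (a1, s1) = unwrap wrappedF s0
  in  unwrap (g a1) s1
```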

This code is the implementation of (>>=) shown above, with wrap = state and unwrap = runState, so we can now see how the definition of bind given earlier is the standard function composition for this special kind of stateful function.

This explanation does not yet address where the original functions of type Wrapped a and a -> Wrapped b come from in the first place, but it does explain what you can do with them once you have them.

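Turnstile using State
We now look at how the State type can help with the turnstile example. Firstly, by comparing the type of coin :: TurnstileState -> (TurnstileOutput, TurnstileState) with the wrapped function type s -> (a, s) inside State s a, we can see that, by replacing s with TurnstileState and a with TurnstileOutput, we can define: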
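A sketch of these definitions, assuming the names coinS and pushS and the state function exported by Control.Monad.State (mtl package):

```haskell
import Control.Monad.State (State, runState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

coin, push :: TurnstileState -> (TurnstileOutput, TurnstileState)
coin _ = (Thank, Unlocked)
push Locked   = (Tut,  Locked)
push Unlocked = (Open, Locked)

-- The same functions, wrapped as State processors.
coinS, pushS :: State TurnstileState TurnstileOutput
coinS = state coin
pushS = state push

-- Unwrapping with runState gives back the original behaviour.
example :: (TurnstileOutput, TurnstileState)
example = runState coinS Locked
```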

We can then use runState to extract the underlying functions and apply them to a state; for example, runState coinS Locked gives (Thank,Unlocked).

Using the Turnstile monad
Not yet too exciting, but now that coinS and pushS are monadic (they are values, or if you like functions of zero parameters, whose type is an instance of Monad), we can do monadic stuff with them, including using do notation:
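A sketch in do notation, assuming the name mondayS for the monadic counterpart of the hand-threaded sequence (coin, push, push, coin, push):

```haskell
import Control.Monad.State (State, runState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

-- The wrapped inputs, repeated so that this snippet compiles on its own.
coinS, pushS :: State TurnstileState TurnstileOutput
coinS = state (\_ -> (Thank, Unlocked))
pushS = state (\s -> (if s == Unlocked then Open else Tut, Locked))

-- No state variables appear here: the monad threads the state for us.
mondayS :: State TurnstileState [TurnstileOutput]
mondayS = do
  a1 <- coinS
  a2 <- pushS
  a3 <- pushS
  a4 <- coinS
  a5 <- pushS
  return [a1, a2, a3, a4, a5]

-- Running it from a locked turnstile.
example :: ([TurnstileOutput], TurnstileState)
example = runState mondayS Locked
```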

Note that we're no longer writing all the code to thread the output state from each step into the next: the State monad is doing that for us. A lot of the tedious and error-prone work has been removed. How? Remember that do is simply syntactic sugar for the bind operator (>>=), so the above is equivalent to:
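Written out with explicit (>>=) and lambdas, and under the same assumed names, the desugared form might read:

```haskell
import Control.Monad.State (State, runState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

coinS, pushS :: State TurnstileState TurnstileOutput
coinS = state (\_ -> (Thank, Unlocked))
pushS = state (\s -> (if s == Unlocked then Open else Tut, Locked))

-- Each lambda binds the output of one step and continues with the rest.
mondayS :: State TurnstileState [TurnstileOutput]
mondayS =
  coinS >>= \a1 ->
  pushS >>= \a2 ->
  pushS >>= \a3 ->
  coinS >>= \a4 ->
  pushS >>= \a5 ->
  return [a1, a2, a3, a4, a5]

example :: ([TurnstileOutput], TurnstileState)
example = runState mondayS Locked
```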

This uses the (>>=) operator we defined for State above: it unwraps each state-processing function from its State wrapper, passes the output state from it as an argument into the next step, and wraps the result back in a State wrapper. The sequence of (>>=) operators, along with return, combines all the steps into a single state-processing function wrapped in a State wrapper, which we can access and run with runState.

A monad is sometimes described as providing a value in a context. An IO monad can provide values from the real world when we ask it to. A Maybe monad can provide a value if it's there, or not otherwise. What about the State monad? It can provide a value when we execute a step of a state processor. (And the monad "automatically" ensures that state changes are passed from step to step without us having to worry about it.)

In this example, some tedium remains in obtaining the list of outputs from each step and combining them into a list. Can we do better? Yes we can:
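One way to do so, sketched here under the same assumed names, is to hand the list of steps to sequence:

```haskell
import Control.Monad.State (State, runState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

coinS, pushS :: State TurnstileState TurnstileOutput
coinS = state (\_ -> (Thank, Unlocked))
pushS = state (\s -> (if s == Unlocked then Open else Tut, Locked))

-- sequence runs the steps in order and collects their outputs in a list.
mondayS :: State TurnstileState [TurnstileOutput]
mondayS = sequence [coinS, pushS, pushS, coinS, pushS]

example :: ([TurnstileOutput], TurnstileState)
example = runState mondayS Locked
```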

We met sequence in the section on IO Monads. It creates a single action (in this case a state processing function) which, when executed, runs through each of the actions (in this case state processing steps) in turn, executing them and combining the results into a list.

evalState and execState
We have seen how runState accesses the state processing function so that we can do, for example, runState pushS Locked. (We also used it in the definition of (>>=).)

Other functions which are used in similar ways are evalState and execState. Given a State a b and an initial state, the function evalState will give back only the result value of the state processing, whereas execState will give back just the new state.
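Both can be written in terms of runState. The sketch below defines them locally (Control.Monad.State exports functions with the same names and behaviour); tick is a hypothetical processor used only to exercise them:

```haskell
import Control.Monad.State (State, runState, state)

-- Keep only the result of running the processor.
evalState :: State s a -> s -> a
evalState p s = fst (runState p s)

-- Keep only the final state of running the processor.
execState :: State s a -> s -> s
execState p s = snd (runState p s)

-- Hypothetical processor: the result is double the state, the new state is one more.
tick :: State Int Int
tick = state (\n -> (n * 2, n + 1))
```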

OK, they're not much. But they're not nothing, and they allow us to write, for example, evalState mondayS Locked if we only want to see the output sequence, and not the final state.

Setting the State
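What if we had a turnstile engineer who wanted to test the locking mechanism by forcing the turnstile into the Locked state, pushing the arm, forcing it into the Unlocked state, pushing again, and then checking that the two outputs were Tut and Open? That needs some way of setting the state from inside a State computation.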

This handy function comes to the rescue: put, of type s -> State s ().

put is a monadic function that can be bound with (>>=) operators or fit in do constructs in sequence with other actions. It takes a state parameter (the one we want to introduce) and generates a state processor which ignores whatever state it receives and gives back the new state we introduced as the next state. Since we don't care about the result of this processor (all we want to do is to replace the state), the first element of the tuple will be (), the universal placeholder value.

Let's see how it helps the engineer:
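A sketch of such a test, assuming the name testTurnstile and using put from Control.Monad.State:

```haskell
import Control.Monad.State (State, evalState, put, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

pushS :: State TurnstileState TurnstileOutput
pushS = state (\s -> (if s == Unlocked then Open else Tut, Locked))

-- Force the state before each push and check both outputs.
testTurnstile :: State TurnstileState Bool
testTurnstile = do
  put Locked
  check1 <- pushS
  put Unlocked
  check2 <- pushS
  put Locked            -- leave the turnstile locked after the test
  return (check1 == Tut && check2 == Open)
```

Because every push is preceded by a put, the result does not depend on the initial state passed to evalState.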

Accessing the State
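In the definition of pushS above we made use of the existing push function. What if we wanted to write it without such a pre-existing function? Obviously we could hand state a function that does a case analysis on the incoming state and returns the output paired with the new state. But could we write it all using a do construct? Yes, with the help of another function: get, of type State s s.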

get is also monadic and creates a state processor that gives back the state s it is given both as a result and as the next state. That means the state will remain unchanged, and that a copy of it will be made available for us to use.

We could use get like this:
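The sketch below shows both styles side by side; the names pushS1 and pushS2 are used here only to keep the two versions apart:

```haskell
import Control.Monad.State (State, get, put, runState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)

-- Version 1: hand state a function that inspects the incoming state.
pushS1 :: State TurnstileState TurnstileOutput
pushS1 = state $ \s -> case s of
  Locked   -> (Tut,  Locked)
  Unlocked -> (Open, Locked)

-- Version 2: read the state with get, replace it with put, then pick the output.
pushS2 :: State TurnstileState TurnstileOutput
pushS2 = do
  s <- get
  put Locked
  case s of
    Locked   -> return Tut
    Unlocked -> return Open
```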

Monadic Control Structures
The second version of mondayS above shows another benefit of using the monad, in addition to the hiding of the state threading and the ability to use do notation and the like: we are also able to use great functions like sequence. In this section we look at some more of these functions. (You will need to import Control.Monad, or do :m Control.Monad in GHCi, to ensure all of these are in scope.)

First, here's replicateM: evalState (replicateM 6 pushS) Unlocked gives [Open,Tut,Tut,Tut,Tut,Tut], which is pretty self-explanatory.

Before we look at any more, we need a slightly different (arguably better) implementation of the turnstile finite-state machine, using an input type and a single processing function:
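A sketch of that alternative implementation, assuming the names TurnstileInput (with constructors Coin and Push) and turnS:

```haskell
import Control.Monad.State (State, evalState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)
data TurnstileInput  = Coin | Push        deriving (Eq, Show)

-- A single processing function: the input selects the transition.
turnS :: TurnstileInput -> State TurnstileState TurnstileOutput
turnS = state . turn
  where
    turn Coin _        = (Thank, Unlocked)
    turn Push Unlocked = (Open,  Locked)
    turn Push Locked   = (Tut,   Locked)

-- Feed a whole list of inputs through the machine, starting locked.
runInputs :: [TurnstileInput] -> [TurnstileOutput]
runInputs inputs = evalState (mapM turnS inputs) Locked
```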

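We can now use mapM, like this: evalState (mapM turnS [Coin, Push, Push, Coin, Push]) Locked, which gives [Thank,Open,Tut,Thank,Open].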

This very nicely illustrates how the finite-state machine is a transducer: it converts an ordered sequence of inputs to an ordered sequence of outputs, maintaining the state as it goes along.

Now we'll look at filterM:
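As a sketch, filterM can keep just those inputs that resulted in someone getting through. The predicate name getsThroughS and the helper whoGotThrough are assumptions made for the example:

```haskell
import Control.Monad (filterM)
import Control.Monad.State (State, evalState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)
data TurnstileInput  = Coin | Push        deriving (Eq, Show)

turnS :: TurnstileInput -> State TurnstileState TurnstileOutput
turnS = state . turn
  where
    turn Coin _        = (Thank, Unlocked)
    turn Push Unlocked = (Open,  Locked)
    turn Push Locked   = (Tut,   Locked)

-- A monadic predicate: process the input and report whether the arm opened.
getsThroughS :: TurnstileInput -> State TurnstileState Bool
getsThroughS input = do
  output <- turnS input
  return (output == Open)

-- The inputs that let someone through, starting from a locked turnstile.
whoGotThrough :: [TurnstileInput] -> [TurnstileInput]
whoGotThrough inputs = evalState (filterM getsThroughS inputs) Locked
```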

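With the inputs [Push, Coin, Coin, Push, Push, Coin, Push] and an initially locked turnstile, two people make it through (not surprisingly, when they pushed the arm). If we switch the order of the first two inputs, more people get through: three instead of two.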

Here's a different way of counting the number of openings using foldM:
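A sketch of the fold, with countOpens and incIfOpens as assumed names; the accumulator is the number of openings seen so far:

```haskell
import Control.Monad (foldM)
import Control.Monad.State (State, evalState, state)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)
data TurnstileInput  = Coin | Push        deriving (Eq, Show)

turnS :: TurnstileInput -> State TurnstileState TurnstileOutput
turnS = state . turn
  where
    turn Coin _        = (Thank, Unlocked)
    turn Push Unlocked = (Open,  Locked)
    turn Push Locked   = (Tut,   Locked)

getsThroughS :: TurnstileInput -> State TurnstileState Bool
getsThroughS input = do
  output <- turnS input
  return (output == Open)

-- Fold over the inputs, incrementing the count whenever the arm opens.
countOpens :: [TurnstileInput] -> State TurnstileState Int
countOpens = foldM incIfOpens 0
  where
    incIfOpens :: Int -> TurnstileInput -> State TurnstileState Int
    incIfOpens n input = do
      gotThrough <- getsThroughS input
      return (if gotThrough then n + 1 else n)
```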

Note that sequence, mapM and filterM always execute all of the actions in the input list, but foldM could skip some.

Pseudo-Random Numbers
Suppose we are coding a game in which at some point we need an element of chance. In real-life games that is often obtained by means of dice or similar. For a computer program we need something to emulate such an object, and most programming languages provide some concept of random numbers that can be used for this purpose.

Generating actual random numbers is hard. Computer programs almost always use pseudo-random numbers instead. They are "pseudo" because they are not actually random: they are determined in advance. Indeed, they are generated by algorithms (the pseudo-random number generators) which take an initial state (commonly called the seed) and produce from it a sequence of numbers that have the appearance of being random. Every time a pseudo-random number is requested, state somewhere must be updated, so that the generator is ready to produce a fresh, different number the next time. Sequences of pseudo-random numbers can be replicated exactly if the initial seed and the generating algorithm are known.

Haskell Global Pseudo-Random Number Generator
Producing a pseudo-random number in most programming languages is very simple: there is a function somewhere in the libraries that provides a pseudo-random value (and also updates an internal mutable state so that it produces a different value next time, although some implementations perhaps produce a truly random value). Haskell has a similar one in the System.Random module from the random package: randomIO, whose type is essentially Random a => IO a.

What is Random? It's the class of types that can have pseudo-random values generated by the System.Random module functions. Int, Integer, Bool and others are all instances of Random. You can "request" any of these by specifying the result type, as in randomIO :: IO Bool.

More interestingly, randomIO is an IO action. It couldn't be otherwise, as it makes use of mutable state, which is kept out of reach from our Haskell programs. Thanks to this hidden dependency, the pseudo-random values it gives back can be different every time.

However, we're here to study the State monad, so let's look at functions that take and return an explicit representation of the random number generator state.

Haskell Pseudo-Random Number Generator with Explicit State
Here's a slightly different function in the System.Random module: random :: (Random a, RandomGen g) => g -> (a, g).

Now there's no IO, and we should recognise the g -> (a, g) pattern as something we could put inside a State wrapper.

What is RandomGen? It is another class defined in the System.Random module. The module also provides a single instance, StdGen. There are a couple of ways to create values of this type. The one we will use first is mkStdGen :: Int -> StdGen, which creates a StdGen value from a given seed, as in mkStdGen 666.

Note that, given the same seed, you get the same StdGen. What is StdGen? The documentation calls it "the standard pseudo-random number generator", but it might be better to call it the state of the standard pseudo-random number generator. We can see that by binding s to mkStdGen 666 and then evaluating random s :: (Int, StdGen).

The first function returns an initial state, based on a given seed of 666. The second function takes the initial StdGen state and returns a pair: a random value (we've requested an Int) and a new StdGen state. How is the state represented internally? The System.Random module does it somehow, and we don't really care how. (We can see StdGen implements show, which displays two funny numbers. We could go and look at the source code if we really wanted to see how it works, but some clever person might go and change it one day anyway.) How does random calculate a new state? We also don't care; we can just be happy that it does.
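The two calls can be sketched as follows, assuming the random package is installed; the names s0 and firstInt are chosen for the example, and the actual numbers produced depend on the version of the package:

```haskell
import System.Random (StdGen, mkStdGen, random)

-- An initial generator state built from a fixed seed.
s0 :: StdGen
s0 = mkStdGen 666

-- One pseudo-random Int together with the next generator state.
firstInt :: (Int, StdGen)
firstInt = random s0
```

Since s0 is an ordinary value, evaluating random s0 again always gives the same pair; to get a different number, the new state in the second component has to be passed to the next call.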

Example: Rolling Dice
We are going to build a dice-throwing example. And for this, we'll use a slightly different function, randomR, called as randomR (1,6) s.

randomR takes a range (in this case 1 to 6) and returns a pseudo-random number in the range (we were lucky: we got a 6!).

Suppose we want a function that rolls two dice and returns a pair representing the result of each throw. Here's one way:
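A sketch of such a function, threading the generator state by hand; the name rollPair is an assumption, as is the seed used in the example value:

```haskell
import System.Random (StdGen, mkStdGen, randomR)

-- Roll two dice, passing the generator state from the first throw to the second.
rollPair :: StdGen -> ((Int, Int), StdGen)
rollPair s0 =
  let (r1, s1) = randomR (1, 6) s0
      (r2, s2) = randomR (1, 6) s1
  in  ((r1, r2), s2)

-- Example: the pair rolled from a fixed seed.
examplePair :: (Int, Int)
examplePair = fst (rollPair (mkStdGen 666))
```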

Doesn't this remind us of the tedious and error-prone approach we first tried in the turnstile example? Not convinced it's tedious? Try extending it by hand to six consecutive throws.

Dice with State
So, a better way, using State, is to define rollDieS :: State StdGen Int as state (randomR (1,6)).

This is very similar to the original versions of coinS and pushS: there was already a function of the form s -> (a, s), and we just wrapped it in a State wrapper. Now we have monadic power! We can write:
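A sketch of the monadic version, assuming the names rollDieS and rollPairS and the mtl and random packages:

```haskell
import Control.Monad.State (State, evalState, state)
import System.Random (StdGen, mkStdGen, randomR)

-- One throw of a die, wrapped as a State processor over the generator.
rollDieS :: State StdGen Int
rollDieS = state (randomR (1, 6))

-- Two throws; the generator state is threaded by the monad.
rollPairS :: State StdGen (Int, Int)
rollPairS = do
  r1 <- rollDieS
  r2 <- rollDieS
  return (r1, r2)

-- Example: the pair rolled from a fixed seed.
examplePair :: (Int, Int)
examplePair = evalState rollPairS (mkStdGen 666)
```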

And we avoid all the tedious threading of state from one step to the next.

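State is also a Functor and an Applicative
Here's another dice throwing function, rollDieDoubledS: in a do block, it rolls a die with rollDieS and returns twice the value thrown.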

Its behaviour should be clear. But it seems a bit verbose for such a simple function. Can we do better?

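As we noted previously (and saw above), State s (and all other monads) are also instances of Functor and Applicative. And in the prologue we applied fmap to a Maybe value (call it mx) to double the number it might hold.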

This leveraged the fact that Maybe is a Functor. The fmap converts a Just x to Just (2*x) (or Nothing to Nothing). If we think of mx as a value wrapped in a context, we can see that the fmap has kept the same context (it's still a Just, or still Nothing), but applied a conversion to the wrapped value. We can do the same with the State functor: rollDieDoubledS = fmap (*2) rollDieS.

The meaning of rollDieS is different to the meaning of mx (it's the output of a state-processing step, not a possibly-existing value), but we've applied the same conversion to the wrapped value. Now, when we unwrap the value from rollDieDoubledS we get double what we would have got had we unwrapped rollDieS.

Suppose we also wanted rollTwoSummedS :: State StdGen Int? In the prologue we combined two wrapped values with (<$>) and (<*>), and again we can do something similar:
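Both the Functor and the Applicative styles can be sketched together, assuming the names rollDieDoubledS and rollTwoSummedS:

```haskell
import Control.Monad.State (State, evalState, state)
import System.Random (StdGen, mkStdGen, randomR)

rollDieS :: State StdGen Int
rollDieS = state (randomR (1, 6))

-- Functor: transform the result of a single step.
rollDieDoubledS :: State StdGen Int
rollDieDoubledS = fmap (* 2) rollDieS

-- Applicative: run two steps in order and combine their results.
rollTwoSummedS :: State StdGen Int
rollTwoSummedS = (+) <$> rollDieS <*> rollDieS

-- Example: the sum of two throws from a fixed seed.
exampleSum :: Int
exampleSum = evalState rollTwoSummedS (mkStdGen 666)
```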

This code depends on State s being also an Applicative, but not on it being a Monad. It will ensure each of the rollDieS actions is executed in order, and chain the state correctly between them. It will then repackage the combination as a state-processing function wrapped in a State wrapper. The combined function will return the sum of the two successive throws (and also, but quietly, ensure the state is added as an input parameter and an output value).

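The Control.Applicative module provides a function liftA2, for which liftA2 f u v = f <$> u <*> v.

Using this, rollTwoSummedS could be defined as liftA2 (+) rollDieS rollDieS.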

More is said later on the relationship between Functor, Applicative and Monad, and choosing which one to use.

Pseudo-random values of different types
We saw that randomIO can return a value of a type other than Int. So can its IO-free equivalent, random.

Because State StdGen is "agnostic" in regard to the type of the pseudo-random value it produces, we can write a similarly "agnostic" function that provides a pseudo-random value of unspecified type (as long as it is an instance of Random): getRandomS :: Random a => State StdGen a, defined as state random.

Compared to rollDieS, this function does not specify the Int type in its signature and uses random instead of randomR; otherwise, it is just the same. getRandomS can be used for any instance of Random, for example evalState getRandomS (mkStdGen 0) :: Bool.

Indeed, it becomes quite easy to conjure all these at once:
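A sketch of how several types can be drawn in one go, assuming the names getRandomS, someTypes and allTypes:

```haskell
import Control.Applicative (liftA3)
import Control.Monad.State (State, evalState, state)
import System.Random (Random, StdGen, mkStdGen, random)

-- A pseudo-random value of any type that is an instance of Random.
getRandomS :: Random a => State StdGen a
getRandomS = state random

-- Three values of different types, combined with liftA3.
someTypes :: State StdGen (Int, Float, Char)
someTypes = liftA3 (,,) getRandomS getRandomS getRandomS

-- Seven values, combined with the tuple constructor and (<*>).
allTypes :: State StdGen (Int, Float, Char, Integer, Double, Bool, Int)
allTypes = (,,,,,,) <$> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS
                    <*> getRandomS

-- Example: all seven values from a fixed seed.
exampleAll :: (Int, Float, Char, Integer, Double, Bool, Int)
exampleAll = evalState allTypes (mkStdGen 0)
```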

For writing allTypes, there is no liftA7, and so we resort to plain old (<*>) instead. Using it, we can apply the tuple constructor to each of the seven random values in the State StdGen monadic context.

allTypes provides pseudo-random values for all default instances of Random; an additional Int is inserted at the end to prove that the generator is not the same, as the two Ints will be different.

(Probably) don't put or get
In the turnstile example, we used put to set the state and get to access it. Can we do the same here? Well, yes we can, but there's probably no need.

In that example, we had to code functions (like pushS) that used the current state to determine outputs and new states, or (like testTurnstile) that set required states as part of a processing sequence. With our random number examples, all generation, inspection and update of the StdGen state is done internally within the System.Random module, without us having to know how.

Now, in our first implementation of rollPair we were aware of the StdGen state: we took it as a parameter, threaded it through the successive steps and returned the final state. If we really wanted to use the value (perhaps we wanted to put it in a debugging message using trace) then we did have the opportunity. And, with our State monad we still do. The following shows one usage:
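A sketch of a die throw that handles the generator explicitly with get and put. The name rollDieExplicitS is hypothetical, and the direct definition is included only for comparison:

```haskell
import Control.Monad.State (State, evalState, get, put, state)
import System.Random (StdGen, mkStdGen, randomR)

-- Every step the monad would normally take for us, spelled out.
rollDieExplicitS :: State StdGen Int
rollDieExplicitS = do
  s0 <- get                            -- fetch the current generator
  let (value, s1) = randomR (1, 6) s0  -- throw the die by hand
  put s1                               -- store the new generator
  return value

-- The direct definition, for comparison.
rollDieS :: State StdGen Int
rollDieS = state (randomR (1, 6))
```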

This does spell out all of the steps the State monad takes for us, but it would be a rather perverse implementation, since the whole point of the monad is that we don't have to spell them out.

Better Random Numbers
Other than our initial use of randomIO, all of the above examples have used mkStdGen, and all with the same seed value 666. This would make for a pretty boring game, where exactly the same dice were rolled each time. (Though this might be useful, e.g. when testing your program.) How can we get better random numbers? Like this: evalState rollPairS <$> newStdGen.

newStdGen is (effectively) declared as newStdGen :: IO StdGen. It is an IO action that spawns a new random state from the same global random state used by randomIO. It also updates that global state, so that further uses of newStdGen give a different value.

So, aren't we back at square one, being dependent on IO? No, we're not. We have gained all the power of the State monad to build up chains of dice-rolling steps which we can assemble into bigger and bigger state-transformation functions. We can do all of that without IO. In the turnstile example, we didn't need IO at all (although we probably would if we wanted to put our code into some kind of application), and for some uses of random numbers, having the same numbers each time might be beneficial. We only needed IO to get "really random" numbers, and we may well need newStdGen only once in a program. Chances are that it would be alongside other IO actions, for example:
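A sketch of how that might look in a program; rollPairIO and reportRoll are hypothetical names, and the printed messages are invented for the example:

```haskell
import Control.Monad.State (State, evalState, state)
import System.Random (StdGen, newStdGen, randomR)

rollDieS :: State StdGen Int
rollDieS = state (randomR (1, 6))

rollPairS :: State StdGen (Int, Int)
rollPairS = do
  r1 <- rollDieS
  r2 <- rollDieS
  return (r1, r2)

-- IO is needed only to obtain a fresh generator; the rolling itself is pure.
rollPairIO :: IO (Int, Int)
rollPairIO = evalState rollPairS <$> newStdGen

-- A fragment of a larger program, mixing the roll with other IO actions.
reportRoll :: IO ()
reportRoll = do
  putStrLn "Rolling two dice..."
  (r1, r2) <- rollPairIO
  putStrLn ("You rolled " ++ show r1 ++ " and " ++ show r2)
```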

Handling Combined States
Suppose we wanted to create a random turnstile, where each visitor would be given a random turnstile input: either they insert a coin (but are not allowed through); or they get to push the arm (and go through if it opens, but are otherwise sent away).

Here's one useful bit of code:

This allows us to generate random TurnstileInput values. However, our random turnstile machine needs to track both the state of a random number generator and the state of the turnstile. We want to write a function of type State (StdGen, TurnstileState) TurnstileOutput; call it randomTurnS.

And this function needs to call both randomInputS (which is in the State StdGen monad) and turnS (which is in the State TurnstileState monad).
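One possible implementation, managing the combined state by hand, is sketched below. It assumes that randomInputS maps a random Bool to Coin or Push, and the other names follow the earlier examples:

```haskell
import Control.Monad.State (State, evalState, get, put, runState, state)
import System.Random (Random, StdGen, mkStdGen, random)

data TurnstileState  = Locked | Unlocked  deriving (Eq, Show)
data TurnstileOutput = Thank | Open | Tut deriving (Eq, Show)
data TurnstileInput  = Coin | Push        deriving (Eq, Show)

turnS :: TurnstileInput -> State TurnstileState TurnstileOutput
turnS = state . turn
  where
    turn Coin _        = (Thank, Unlocked)
    turn Push Unlocked = (Open,  Locked)
    turn Push Locked   = (Tut,   Locked)

getRandomS :: Random a => State StdGen a
getRandomS = state random

-- A random input: a coin if the random Bool is True, a push otherwise.
randomInputS :: State StdGen TurnstileInput
randomInputS = do
  b <- getRandomS
  return (if b then Coin else Push)

-- One random turn, with the combined state unpacked and repacked by hand.
randomTurnS :: State (StdGen, TurnstileState) TurnstileOutput
randomTurnS = do
  (g, t) <- get
  let (i, g') = runState randomInputS g   -- advance the generator
      (o, t') = runState (turnS i) t      -- advance the turnstile
  put (g', t')
  return o

-- Example: ten random turns from a fixed seed and a locked turnstile.
exampleTurns :: [TurnstileOutput]
exampleTurns = evalState (sequence (replicate 10 randomTurnS)) (mkStdGen 1, Locked)
```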

Much of the code in randomTurnS deals with managing the state: accessing the combined state, unpacking subcomponents, forwarding them to the individual State monads, recombining them and putting the combined state back. The state management code is not too bad in this case, but could easily become cumbersome in a more complex function. And it is something we wanted the State monad to hide from us.

State-Processing a Subcomponent
Ideally we'd want some utility function(s) that allow us to invoke a State StdGen monad function (or State TurnstileState monad function) from within a State (StdGen, TurnstileState) monad function. These function(s) should take care of the state management for us, ensuring that the right subcomponent of the combined state is updated.

Here's one such function that works for any combined state represented as a pair, and performs the state update on the fst of the pair:
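A sketch of such a utility, assuming the name processingFst; tick is a hypothetical processor used only to try it out:

```haskell
import Control.Monad.State (State, get, put, runState, state)

-- Run a processor on the first component of a pair, leaving the second untouched.
processingFst :: State a o -> State (a, b) o
processingFst m = do
  (s1, s2) <- get
  let (o, s1') = runState m s1
  put (s1', s2)
  return o

-- Hypothetical processor: return the counter and increment it.
tick :: State Int Int
tick = state (\n -> (n, n + 1))
```

For example, runState (processingFst tick) (5, "unchanged") gives (5,(6,"unchanged")).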

Note the type, State a o -> State (a,b) o: processingFst "converts" a State monad (in this case with state type a) to another State monad (in this case with state type (a,b), where b can be any type, even a TurnstileState).

Note how randomTurnS, once rewritten with such utilities, is no longer directly involved in the details of the state management, and its business logic is much more apparent.

Generic Subcomponent Processing
We can see that processingFst and its counterpart for the snd of the pair, processingSnd, are very similar. They both extract a subcomponent of a combined state, runState on that subcomponent, then update the combined state with the new value of the subcomponent.

Let's combine them into a single generic subcomponent processing function. To do this, we could pass in separate parameters, one of type (cmb -> sub) (a function that extracts a subcomponent from a combined state value), and another of type (cmb -> sub -> cmb) (a function that, given a combined value and a new value for a subcomponent, returns the revised combined value with the updated subcomponent). However, it's a bit neater to package these two functions together in a type which we'll call Lens:

We can provide specific lenses onto the fst and snd elements in a pair:
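A sketch of the Lens type and the two pair lenses; the field names view and set and the lens names fstL and sndL are assumptions made for the example:

```haskell
-- A getter and a setter for one subcomponent of a combined value.
data Lens cmb sub = Lens
  { view :: cmb -> sub          -- extract the subcomponent
  , set  :: cmb -> sub -> cmb   -- replace the subcomponent
  }

-- Lens onto the first element of a pair.
fstL :: Lens (a, b) a
fstL = Lens fst (\(_, y) x -> (x, y))

-- Lens onto the second element of a pair.
sndL :: Lens (a, b) b
sndL = Lens snd (\(x, _) y -> (x, y))
```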

So now view fstL ("fst", "snd") gives "fst", and set fstL ("fst", "snd") "new" gives ("new","snd").

We can now replace processingFst and processingSnd with our generic function.
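A sketch of the generic function, assuming the name processing and the Lens type with fields view and set; tick is again a hypothetical processor:

```haskell
import Control.Monad.State (State, get, put, runState, state)

data Lens cmb sub = Lens
  { view :: cmb -> sub
  , set  :: cmb -> sub -> cmb
  }

fstL :: Lens (a, b) a
fstL = Lens fst (\(_, y) x -> (x, y))

sndL :: Lens (a, b) b
sndL = Lens snd (\(x, _) y -> (x, y))

-- Run a processor on the subcomponent selected by the lens.
processing :: Lens cmb sub -> State sub o -> State cmb o
processing lens m = do
  cmb <- get
  let (o, sub') = runState m (view lens cmb)   -- run on the subcomponent
  put (set lens cmb sub')                      -- write the updated subcomponent back
  return o

-- Hypothetical processor: return the counter and increment it.
tick :: State Int Int
tick = state (\n -> (n, n + 1))
```

With it, randomTurnS reduces to drawing an input with processing fstL randomInputS and passing that input to processing sndL applied to turnS.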

Our final random turnstile code is neater, with three separate logical functions segregated:
 * state management (now in a single processing utility function, which can be reused elsewhere);
 * subcomponent accessing and update (using Lens, which can also be reused elsewhere); and
 * the "business logic" of the turnstile, which is now very apparent.

In our first implementation, all three of these were muddled together.

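Let's give it a go: in GHCi, after g <- newStdGen, evaluating evalState (replicateM 10 randomTurnS) (g, Locked) shows the outputs of ten random visitors.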

I'm not sure we'll sell many of them, though.