Haskell/Monoids

In earlier parts of the book, we have made a few passing allusions to monoids and the Monoid type class (most notably when discussing MonadPlus). Here we'll give them a more detailed look and show what makes them useful.

What is a monoid?
The operation of adding numbers has a handful of properties which are so elementary we don't even think about them when summing numbers up. One of them is associativity: when adding three or more numbers it doesn't matter how we group the terms.

Another one is that it has an identity element, which can be added to any other number without changing its value. That element is the number zero: for any number x, both x + 0 and 0 + x are equal to x.
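Both properties of addition can be written down as small Haskell checks. This is only a sketch: the function names are made up for the illustration, and Integer is used so that floating-point rounding does not get in the way.

```haskell
-- Associativity: the grouping of the terms does not matter.
addIsAssociative :: Integer -> Integer -> Integer -> Bool
addIsAssociative x y z = (x + y) + z == x + (y + z)

-- Identity: adding zero on either side changes nothing.
zeroIsIdentity :: Integer -> Bool
zeroIsIdentity x = x + 0 == x && 0 + x == x
```

For instance, addIsAssociative 5 6 10 compares (5 + 6) + 10 with 5 + (6 + 10), both of which are 21.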

Addition is not the only binary operation which is associative and has an identity element. Multiplication does too, albeit with a different identity.

We needn't restrict ourselves to arithmetic either. (++), the appending operation for Haskell lists, is another example. It has the empty list as its identity element.
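The same two properties can be spot-checked for multiplication and for list appending. As before, this is a sketch with hypothetical function names; a check that returns True for some arguments is evidence, not a proof.

```haskell
-- Multiplication is associative, with 1 as its identity.
mulIsAssociative :: Integer -> Integer -> Integer -> Bool
mulIsAssociative x y z = (x * y) * z == x * (y * z)

oneIsIdentity :: Integer -> Bool
oneIsIdentity x = x * 1 == x && 1 * x == x

-- List appending is associative, with [] as its identity.
appendIsAssociative :: Eq a => [a] -> [a] -> [a] -> Bool
appendIsAssociative xs ys zs = (xs ++ ys) ++ zs == xs ++ (ys ++ zs)

emptyIsIdentity :: Eq a => [a] -> Bool
emptyIsIdentity xs = xs ++ [] == xs && [] ++ xs == xs
```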

It turns out there are a great many associative binary operations with an identity. All of them, by definition, give us examples of monoids. We say, for instance, that the integer numbers form a monoid under addition with 0 as identity element.

The Monoid class
Monoids show up very often in Haskell, and so it is not surprising to find there is a type class for them in the core libraries: the Monoid class, which has three methods.
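A simplified sketch of the class, matching the description in this chapter, is given below. It hides the Prelude's own Monoid only so that the snippet compiles on its own. Note that in current versions of base the real class additionally has Semigroup as a superclass, with (<>) as the Semigroup method, so this should not be read as the exact library source.

```haskell
import Prelude hiding (Monoid (..))

-- Simplified version of the class, for illustration only.
class Monoid a where
    mempty  :: a             -- the identity element
    mappend :: a -> a -> a   -- the associative binary operation

    mconcat :: [a] -> a      -- combine a whole list of values
    mconcat = foldr mappend mempty
```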

The mappend method is the binary operation, and mempty is its identity. The third method, mconcat, is provided as a bonus; it runs down a list and mappends its elements together in order.

"mappend" is a somewhat long and unwieldy name for a binary function so general, even more so for one which is often used infix. Fortunately, Data.Monoid provides (<>), a convenient operator synonym for mappend. In what follows, we will use mappend and (<>) interchangeably.

As an example, in the monoid instance for lists mempty is the empty list and mappend is (++).

Note that, in this case, mconcat is equivalent to concat, which explains the name of the method.
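Since base already provides the instance for lists, we cannot redefine it in our own code. The sketch below therefore uses a hypothetical wrapper type, List, to show what such an instance looks like, assuming a version of base in which Semigroup is a superclass of Monoid (GHC 8.4 or later). It also includes a small check that mconcat and concat agree on ordinary lists.

```haskell
-- A hypothetical wrapper, so the instance does not clash with the one in base.
newtype List a = List [a]
    deriving (Show, Eq)

-- The binary operation is list appending.
instance Semigroup (List a) where
    List xs <> List ys = List (xs ++ ys)

-- The identity is the empty list; mappend defaults to (<>).
instance Monoid (List a) where
    mempty = List []

-- For ordinary lists, mconcat and concat give the same result.
mconcatMatchesConcat :: Eq a => [[a]] -> Bool
mconcatMatchesConcat xss = mconcat xss == concat xss
```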

It is legitimate to think of monoids as types which support appending in some sense, though a dose of poetic licence is required. The Monoid definition is extremely general and not at all limited to data structures, so "appending" will be just a metaphor at times.

As we suggested earlier on, numbers (i.e. instances of Num) form monoids under both addition and multiplication. That leads to the awkward question of which one to choose when writing the instance. In situations like this one, in which there is no good reason to choose one possibility over the other, the dilemma is averted by creating one newtype for each instance: Sum for addition and Product for multiplication.
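The two wrappers can be sketched as follows. These are local definitions written for illustration, modelled on the Sum and Product types exported by Data.Monoid; the real ones come with many more instances. The sketch assumes a version of base in which Semigroup is a superclass of Monoid (GHC 8.4 or later).

```haskell
-- Numbers as a monoid under addition.
newtype Sum a = Sum { getSum :: a }
    deriving (Show, Eq)

-- Numbers as a monoid under multiplication.
newtype Product a = Product { getProduct :: a }
    deriving (Show, Eq)

instance Num a => Semigroup (Sum a) where
    Sum x <> Sum y = Sum (x + y)

instance Num a => Monoid (Sum a) where
    mempty = Sum 0

instance Num a => Semigroup (Product a) where
    Product x <> Product y = Product (x * y)

instance Num a => Monoid (Product a) where
    mempty = Product 1
```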

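To use them, wrap the numbers in Sum or Product, combine the wrapped values with (<>) or mconcat, and unwrap the result with getSum or getProduct.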
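A short sketch using the wrappers from Data.Monoid, with the numbers 5, 6 and 10 from the beginning of the chapter. The names total, factorProduct and totalOfList are ours.

```haskell
import Data.Monoid (Product (..), Sum (..))

-- Combining under addition: 5 + 6 + 10.
total :: Integer
total = getSum (Sum 5 <> Sum 6 <> Sum 10)

-- Combining under multiplication: 5 * 6 * 10.
factorProduct :: Integer
factorProduct = getProduct (Product 5 <> Product 6 <> Product 10)

-- mconcat combines a whole list using the chosen monoid.
totalOfList :: [Integer] -> Integer
totalOfList = getSum . mconcat . map Sum
```

Here total is 21 and factorProduct is 300; the only difference between the two computations is the wrapper, which selects the monoid.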

Monoid laws
The laws which all instances of Monoid must follow simply state the properties we already know: (<>) is associative and mempty is its identity element.
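Written as equations, the laws become the three comments in the sketch below. The accompanying functions (whose names are hypothetical) test each law at specific values; they can be used for spot checks or as properties for a testing library, but they do not prove that an instance is lawful.

```haskell
-- Associativity: (x <> y) <> z = x <> (y <> z)
associativityHolds :: (Eq a, Monoid a) => a -> a -> a -> Bool
associativityHolds x y z = (x <> y) <> z == x <> (y <> z)

-- Left identity: mempty <> x = x
leftIdentityHolds :: (Eq a, Monoid a) => a -> Bool
leftIdentityHolds x = mempty <> x == x

-- Right identity: x <> mempty = x
rightIdentityHolds :: (Eq a, Monoid a) => a -> Bool
rightIdentityHolds x = x <> mempty == x
```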

Uses
What advantages are there in having a class with a pompous name for such a simple concept? As usual in such cases, the key gains are in two associated dimensions: recognisability and generality. Whenever, for instance, you see (<>) being used you know that, however the specific instance was defined, the operation being done is associative and has an identity element. Moreover, you also know that if there is an instance of Monoid for a type you can take advantage of functions written to deal with monoids in general. As a toy example of such a function, we might take a function that concatenates three lists with (++) and replace every (++) with (<>), thus making it work with any Monoid. When used on other types the generalised function will behave in an analogous way to the original one, as specified by the monoid laws.
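The generalisation just described can be sketched as follows. The names threeConcat, mthreeConcat and sumOfThree are chosen for this illustration.

```haskell
import Data.Monoid (Sum (..))

-- Specialised to lists.
threeConcat :: [a] -> [a] -> [a] -> [a]
threeConcat a b c = a ++ b ++ c

-- Generalised to any monoid by replacing (++) with (<>).
mthreeConcat :: Monoid m => m -> m -> m -> m
mthreeConcat a b c = a <> b <> c

-- The generalised version also works on numbers wrapped in Sum.
sumOfThree :: Int
sumOfThree = getSum (mthreeConcat (Sum 5) (Sum 6) (Sum 10))
```

On lists, mthreeConcat behaves exactly like threeConcat; on Sum values it adds, so sumOfThree is 21.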

Monoids are extremely common, and have many interesting practical applications.


 * The Writer monad : A computation of type Writer w a computes a value of type a while producing extra output of type w, which must be an instance of Monoid, and the bind operator of the monad uses mappend to accumulate the extra output. A typical use case is logging, in which each computation produces a log entry for later inspection; all entries generated during a series of computations are then automatically combined into a single log output.


 * The Foldable class : Monoids play an important role in generalising list-like folding to other data structures. We will study that in detail in the upcoming chapter about the Foldable class.


 * Finger trees : Moving on from operations on data structures to data structure implementations, monoids can be used to implement finger trees, an efficient and versatile data structure. Their implementation makes use of monoidal values as tags for the tree nodes, and different data structures (such as sequences, priority queues, and search trees) can be obtained simply by changing the Monoid instance involved.


 * Options and settings: In a wholly different context, monoids can be a handy way of treating application options and settings. Two examples are Cabal, the Haskell packaging system ("Package databases are monoids. Configuration files are monoids. Command line flags and sets of command line flags are monoids. Package build information is a monoid.") and XMonad, a tiling window manager implemented in Haskell ("xmonad configuration hooks are monoidal."). Since an XMonad configuration file is just a Haskell program, such hooks can be combined with the usual monoid operations.


 * diagrams : The diagrams package provides a powerful library for generating vector graphics programmatically. On a basic level, (<>) appears often in code using diagrams because squares, rectangles and other such graphic elements have Monoid instances which are used to put them on top of each other. On a deeper level, most operations with graphic elements are internally defined in terms of monoids, and the implementation takes full advantage of their mathematical properties.
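To make the first item of the list above more concrete, here is a minimal stand-in for the Writer monad, written out by hand so that the snippet depends only on base. It is a simplified sketch, not the definition used by the library version in Control.Monad.Writer, and addLogged is a hypothetical example computation.

```haskell
-- A minimal stand-in for the Writer monad: a result paired with extra output.
newtype Writer w a = Writer { runWriter :: (a, w) }

instance Functor (Writer w) where
    fmap f (Writer (a, w)) = Writer (f a, w)

instance Monoid w => Applicative (Writer w) where
    pure a = Writer (a, mempty)
    Writer (f, w1) <*> Writer (a, w2) = Writer (f a, w1 <> w2)

instance Monoid w => Monad (Writer w) where
    Writer (a, w1) >>= k =
        let Writer (b, w2) = k a
        in Writer (b, w1 <> w2)   -- bind accumulates the output with (<>)

-- Produce some output and an uninteresting result.
tell :: w -> Writer w ()
tell w = Writer ((), w)

-- Each step logs one entry; the entries are combined automatically.
addLogged :: Int -> Int -> Writer [String] Int
addLogged x y = do
    tell ["got " ++ show x]
    tell ["got " ++ show y]
    let s = x + y
    tell ["sum is " ++ show s]
    return s
```

Running runWriter (addLogged 2 3) gives the result 5 together with the three log entries in the order they were produced. Swapping [String] for another monoid changes how the output is accumulated without touching the Monad instance.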

Homomorphisms
Given any two monoids A and B, a function f :: A -> B is a monoid homomorphism if it preserves the monoid structure, so that f mempty = mempty and f (x <> y) = f x <> f y.

In words, f takes mempty :: A to mempty :: B, and the result of mappend for A to the result of mappend for B (after using f to turn the arguments to mappend into B values).
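The two conditions can be turned into checkable properties. The sketch below uses hypothetical names and, like any test at specific values, can refute the claim that a function is a homomorphism but cannot prove it.

```haskell
-- First condition: f mempty = mempty
preservesIdentity :: (Monoid a, Monoid b, Eq b) => (a -> b) -> Bool
preservesIdentity f = f mempty == mempty

-- Second condition: f (x <> y) = f x <> f y
preservesAppend :: (Monoid a, Monoid b, Eq b) => (a -> b) -> a -> a -> Bool
preservesAppend f x y = f (x <> y) == f x <> f y
```

For example, preservesAppend reverse "ab" "cd" is False, because reverse "abcd" is "dcba" while reverse "ab" <> reverse "cd" is "badc". So reverse is not a monoid homomorphism from lists to lists, even though it does map the empty list to the empty list.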

As an example, length is a monoid homomorphism between lists under (++) and Int under (+): length [] = 0, and length (xs ++ ys) = length xs + length ys.
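A sketch with length, viewed as a function from lists under (++) to Int under (+), checks both conditions. Since Int has no Monoid instance of its own, the result is wrapped in Sum from Data.Monoid so that (<>) means addition; lengthSum and the two check functions are names made up for this illustration.

```haskell
import Data.Monoid (Sum (..))

-- length, with its result wrapped so that (<>) means addition.
lengthSum :: [a] -> Sum Int
lengthSum = Sum . length

-- lengthSum mempty = mempty, that is, length [] = 0.
lengthPreservesIdentity :: Bool
lengthPreservesIdentity = lengthSum ([] :: [()]) == mempty

-- lengthSum (xs <> ys) = lengthSum xs <> lengthSum ys,
-- that is, length (xs ++ ys) = length xs + length ys.
lengthPreservesAppend :: [a] -> [a] -> Bool
lengthPreservesAppend xs ys =
    lengthSum (xs <> ys) == lengthSum xs <> lengthSum ys
```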

When attempting to determine whether a given function is a homomorphism, do not concern yourself with its actual implementation. Although the definition of the function clearly influences whether or not it is a homomorphism, the property itself is stated purely in terms of how the function relates the operations of the two underlying structures, and is not directly tied to implementation details.

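An interesting example "in the wild" of monoids and homomorphisms was identified by Chris Kuklewicz amidst the Google Protocol Buffers API documentation. Based on the quotes provided in the referenced comment, the documented property is that, in C++, parsing the concatenation of two serialized strings into a message is equivalent to parsing each string into its own message and then merging the second message into the first.

That property means that parsing is a monoid homomorphism from strings under concatenation to messages under merging. In a hypothetical Haskell implementation, two equations would hold: parsing the empty string would yield the empty message, and parsing xs ++ ys would yield the merge of the results of parsing xs and ys.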

(They wouldn't hold perfectly, as parsing might fail, but roughly so.)
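To see the idea in running code, here is a toy model. Everything in it is hypothetical and unrelated to the real Protocol Buffers wire format or API: a "message" is a list of key/value fields, merging appends the field lists, and the serialized form is a string of key=value fields, each terminated by a semicolon.

```haskell
-- A toy message: a list of (key, value) fields.
newtype Message = Message [(String, String)]
    deriving (Show, Eq)

-- Merging two messages appends their fields.
instance Semigroup Message where
    Message xs <> Message ys = Message (xs ++ ys)

-- The empty message is the identity for merging.
instance Monoid Message where
    mempty = Message []

-- Parse a string such as "a=1;b=2;" into a message.
parse :: String -> Message
parse = Message . go
  where
    go "" = []
    go s  =
        let (field, rest) = break (== ';') s
            (key, value)  = break (== '=') field
        in (key, drop 1 value) : go (drop 1 rest)
```

In this model, parse "" is mempty and parse ("a=1;" ++ "b=2;") equals parse "a=1;" <> parse "b=2;". The equation fails when the first string is cut in the middle of a field, as in parse ("a=" ++ "1;"), which is one way in which such equations hold only roughly.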

Recognising a homomorphism can lead to useful refactorings. For instance, if merging messages turned out to be an expensive operation, it might be advantageous in terms of performance to concatenate the strings before parsing them. Parsing being a monoid homomorphism would then guarantee that the same results would be obtained.