Haskell/Pattern matching

In the previous modules, we introduced and made occasional reference to pattern matching. Now that we have developed some familiarity with the language, it is time to take a proper, deeper look. We will kick-start the discussion with a condensed description, which we will expand upon throughout the chapter:

In pattern matching, we attempt to match values against patterns and, if so desired, bind variables to successful matches.

Analysing pattern matching
Pattern matching is virtually everywhere. For example, consider this definition of map:
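
The definition meant here is presumably the usual recursive one from the Prelude. A minimal sketch, hiding the Prelude's own map so that the file loads on its own:

```haskell
import Prelude hiding (map)

-- Apply a function to every element of a list.
map :: (a -> b) -> [a] -> [b]
map _ []     = []                 -- empty list: nothing to transform
map f (x:xs) = f x : map f xs     -- transform the head, recurse on the tail
```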

At surface level, there are four different patterns involved, two per equation.


 * f is a pattern which matches anything at all, and binds the f variable to whatever is matched.
 * (x:xs) is a pattern that matches a non-empty list, formed by something (which gets bound to the x variable) that was cons'd (by the (:) function) onto something else (which gets bound to xs).
 * [] is a pattern that matches the empty list. It doesn't bind any variables.
 * _ is the pattern which matches anything without binding (wildcard, "don't care" pattern).

In the (x:xs) pattern, x and xs can be seen as sub-patterns used to match the parts of the list. Just like f, they match anything, though it is evident that if there is a successful match and x has type a, xs will have type [a]. Finally, these considerations imply that xs will also match an empty list, and so a one-element list matches (x:xs).

From the above dissection, we can say pattern matching gives us a way to:


 * recognize values. For instance, when map is called and the second argument matches [], the first equation for map is used instead of the second one.
 * bind variables to the recognized values. In this case, the variables f, x, and xs are bound to the values passed as arguments to map when the second equation is used, and so we can use these values through the variables on the right-hand side of =. As _ and [] show, binding is not an essential part of pattern matching, but just a side effect of using variable names as patterns.
 * break down values into parts, as the (x:xs) pattern does by binding two variables to parts (head and tail) of a matched argument (the non-empty list).

The connection with constructors
Despite the detailed analysis above, it may seem a little too magical how we break down a list as if we were undoing the effects of the (:) operator. Be careful: this process will not work with any arbitrary operator. For example, one might think of defining a function which uses (++) to chop off the first three elements of a list:
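
Such an attempt would presumably look like the commented-out equation at the end of the sketch below. The names dropThree and dropThreeViaDrop are assumptions made for illustration; the latter is only a stand-in, built on the Prelude's drop, so that the file still loads.

```haskell
-- For comparison, a version that does compile (no pattern matching involved):
dropThreeViaDrop :: [a] -> [a]
dropThreeViaDrop = drop 3

-- The attempted pattern-based version, commented out because the compiler
-- rejects a pattern built with (++):
--
-- dropThree ([x,y,z] ++ xs) = xs
```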

But that will not work. The function (++) is not allowed in patterns. In fact, most other functions that act on lists are similarly prohibited from pattern matching. Which functions, then, are allowed?

In one word, constructors – the functions used to build values of algebraic data types. Let us consider a random example:

Here Bar and Baz are constructors for the type Foo. You can use them for pattern matching Foo values and bind variables to the Int value contained in a Foo constructed with Baz:
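
As a sketch of what such a type and a function matching on it might look like (the names Foo, Bar, Baz and f, as well as the arithmetic on the right-hand sides, are placeholders chosen for illustration):

```haskell
-- A type with two constructors: Bar carries nothing, Baz carries an Int.
data Foo = Bar | Baz Int

f :: Foo -> Int
f Bar     = 1        -- matches the Bar constructor; nothing to bind
f (Baz x) = x - 1    -- matches Baz and binds x to the Int inside
```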

This is exactly like showAnniversary and showDate in the Type declarations module. For instance:
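
A sketch along those lines, assuming a Date type whose single constructor holds a year, a month and a day as Ints (the exact definitions in that module may differ):

```haskell
-- Year, month, day.
data Date = Date Int Int Int

showDate :: Date -> String
showDate (Date y m d) = show y ++ "-" ++ show m ++ "-" ++ show d
```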

The (Date y m d) pattern on the left-hand side of the showDate definition matches a Date (built with the Date constructor) and binds the variables y, m and d to the contents of the Date value.

Why does it work with lists?
As for lists, they are no different from data-defined algebraic data types as far as pattern matching is concerned. It works as if lists were defined with this data declaration (note that the following isn't actually valid syntax: lists are too deeply ingrained into Haskell to be defined like this):
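
The pseudo-declaration alluded to is usually written as data [a] = [] | a : [a]. A user-defined type with the same shape is valid Haskell and can be matched on in the same way. In the sketch below, List, Nil, Cons and len are hypothetical names, not part of any library:

```haskell
-- Same structure as the built-in list: an empty case and a "cons" case.
data List a = Nil | Cons a (List a)

-- Count the elements by matching on the two constructors.
len :: List a -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs
```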

So the empty list, [], and the (:) function are constructors of the list datatype, and so you can pattern match with them. [] takes no arguments, and therefore no variables can be bound when it is used for pattern matching. (:) takes two arguments, the list head and tail, which may then have variables bound to them when the pattern is recognized.

Prelude> :t []
[] :: [a]
Prelude> :t (:)
(:) :: a -> [a] -> [a]

Furthermore, since [x, y, z] is just syntactic sugar for x:y:z:[], we can achieve something like dropThree using pattern matching alone:
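
A sketch of such a definition (the name dropThree and the choice of [] as the fallback result are assumptions):

```haskell
dropThree :: [a] -> [a]
dropThree (_:_:_:xs) = xs    -- at least three elements: keep the rest
dropThree _          = []    -- shorter lists: nothing is left
```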

The first pattern will match any list with at least three elements. The catch-all second definition provides a reasonable default when lists fail to match the main pattern, and thus prevents runtime crashes due to pattern match failure.

Tuple constructors
Analogous considerations are valid for tuples. Our access to their components via pattern matching...

... is granted by the existence of tuple constructors. For pairs, the constructor is the comma operator, (,); for larger tuples there are (,,), (,,,) and so on. These operators are slightly unusual in that we can't use them infix in the regular way; so 5 , 3 is not a valid way to write (5, 3). All of them, however, can be used prefix, which is occasionally useful.

Prelude> (,) 5 3
(5,3)
Prelude> (,,,) "George" "John" "Paul" "Ringo"
("George","John","Paul","Ringo")
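
Going back to component access, here is a sketch of two functions that take tuples apart by pattern matching. The names fstPlusSnd and norm3D are illustrative only:

```haskell
-- Add the two components of a pair.
fstPlusSnd :: Num a => (a, a) -> a
fstPlusSnd (x, y) = x + y

-- Euclidean length of a three-dimensional vector given as a triple.
norm3D :: Floating a => (a, a, a) -> a
norm3D (x, y, z) = sqrt (x^2 + y^2 + z^2)
```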

Matching literal values
As discussed earlier in the book, a simple piece-wise function definition like this one

is performing pattern matching as well, matching the argument of f with the Int literals 0, 1 and 2, and finally with _. In general, numeric and character literals can be used in pattern matching on their own as well as together with constructor patterns. For instance, this function

will evaluate to False for the [0] list, to True if the list has 0 as first element and a non-empty tail, and to False in all other cases. Also, lists with literal elements like [1,2,3], or even "abc" (which is equivalent to ['a','b','c']), can be used for pattern matching as well, since these forms are only syntactic sugar for the (:) constructor.
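
The two functions discussed in this section can be sketched as follows. The names f and g and the results returned by f are assumptions; g is written to behave exactly as described above.

```haskell
-- Piece-wise definition: each equation matches one Int literal.
f :: Int -> Int
f 0 = 1
f 1 = 5
f 2 = 2
f _ = -1             -- any other argument

-- Literals combined with constructor patterns.
g :: [Int] -> Bool
g (0:[]) = False     -- exactly the list [0]
g (0:_)  = True      -- 0 followed by a non-empty tail
g _      = False     -- everything else, including []
```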

The above considerations are only valid for literal values, so the following will not work:
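
The kind of definition meant here can be sketched as follows (k and h are hypothetical names). It does compile, typically with a warning about a redundant equation, but it does not do what one might hope:

```haskell
k :: Int
k = 1

-- This does NOT compare the argument with the top-level k.
-- The k in the pattern is a fresh variable that matches anything
-- and shadows the outer k.
h :: Int -> Bool
h k = True
h _ = False   -- never reached: the first equation always matches
```

As a result, h returns True for every argument. To compare against k, one option is a guard, as in h x | x == k = True.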

As-patterns
Sometimes, when matching a sub-pattern within a value, it may be useful to bind a name to the whole value being matched. As-patterns allow exactly this: they are of the form var@pattern and have the additional effect of binding the name var to the whole value being matched by pattern. For instance, here is a toy variation on the map theme:
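
A sketch of such a variation, using the hypothetical name contrivedMap:

```haskell
-- Like map, but the function also receives the whole remaining list.
contrivedMap :: ([a] -> a -> b) -> [a] -> [b]
contrivedMap _ []          = []
contrivedMap f list@(x:xs) = f list x : contrivedMap f xs
--             ^ list names the entire value matched by (x:xs)
```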

contrivedMap passes to the parameter function f not only x but also the undivided list used as argument of each recursive call. Writing it without as-patterns would have been a bit clunky because we would have to either use head or needlessly reconstruct the original value of list, i.e. actually evaluate x:xs on the right side:
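
A sketch of the reconstructing variant (same hypothetical name, contrivedMap, and same behaviour, but without an as-pattern):

```haskell
contrivedMap :: ([a] -> a -> b) -> [a] -> [b]
contrivedMap _ []     = []
contrivedMap f (x:xs) = f (x:xs) x : contrivedMap f xs
--                        ^ the list is rebuilt from its head and tail
```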

Introduction to records
For constructors with many elements, records provide a way of naming values in a datatype using the following syntax:

Using records allows doing matching and binding only for the variables relevant to the function we're writing, making code much clearer:
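
To make both points concrete, here is a sketch of a record declaration and of a function that matches on a single field. All names (Foo2, Bar2, Baz2, bazNumber, bazName, h, x, y) are placeholders chosen for illustration:

```haskell
-- Baz2 has two named fields; Bar2 has none.
data Foo2 = Bar2 | Baz2 { bazNumber :: Int, bazName :: String }

h :: Foo2 -> Int
h Baz2 { bazName = name } = length name   -- bind only the field we need
h Bar2 {}                 = 0

x, y :: Foo2
x = Baz2 1 "Haskell"                            -- fields given in declaration order
y = Baz2 { bazName = "Curry", bazNumber = 2 }   -- fields given by name
```

With these definitions, h x is 7 and h y is 5.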

Also, the {} pattern can be used for matching a constructor regardless of the datatype elements even if you don't use records in the data declaration:
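
A sketch, assuming a plain (non-record) type with the hypothetical constructors Bar and Baz and a hypothetical function g:

```haskell
data Foo = Bar | Baz Int

-- {} matches the constructor and ignores however many fields it has.
g :: Foo -> Bool
g Bar {} = True
g Baz {} = False
```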

The function g does not have to be changed if we modify the number or the type of elements of the constructors Bar or Baz.

There are further advantages to using record syntax, which we will cover in more detail in the Named fields section of the More on datatypes chapter.

Where we can use pattern matching
The short answer is that wherever you can bind variables, you can pattern match. Let us have a glance at such places we have seen before; a few more will be introduced in the following chapters.

Equations
The most obvious use case is the left-hand side of function definition equations, which were the subject of our examples so far.

In the map definition we're doing pattern matching on the left-hand side of both equations, and also binding variables in the second one.

let expressions and where clauses
Both let and where are ways of doing local variable bindings. As such, you can also use pattern matching in them. A simple example:
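
A sketch using let (the names x and y and the doubled list are assumptions made for illustration):

```haskell
y :: Int
y =
  let (x:_) = map (*2) [1,2,3]   -- x is bound to the head of the doubled list
  in  x + 5
```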

Or, equivalently,
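
The same binding, sketched with a where clause instead (names and values are again assumptions):

```haskell
y :: Int
y = x + 5
  where
    (x:_) = map (*2) [1,2,3]   -- same pattern, now in a where clause
```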

Here, x will be bound to the first element of map (*2) [1,2,3]. y, therefore, will evaluate to $$2 + 5 = 7$$.

Lambda abstractions
Pattern matching can be used directly in lambda abstractions:
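
For example, a lambda can take a pair apart in its argument position. A minimal sketch (swap is defined locally here and is not imported from any library):

```haskell
-- Exchange the components of a pair.
swap :: (a, b) -> (b, a)
swap = \(x, y) -> (y, x)
```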

It is clear, however, that this syntax permits only one pattern (or one for each argument in the case of a multi-argument lambda abstraction).

List comprehensions
After the | in list comprehensions you can pattern match. This is actually extremely useful, and adds a lot to the expressiveness of comprehensions. Let's see how that works with a slightly more sophisticated example. Prelude provides a Maybe type which has the following constructors:

It is typically used to hold values resulting from an operation which may or may not succeed; if the operation succeeds, the Just constructor is used and the value is passed to it; otherwise Nothing is used. The utility function catMaybes (which is available from the Data.Maybe library module) takes a list of Maybes (which may contain both "Just" and "Nothing" Maybes), and retrieves the contained values by filtering out the Nothing values and stripping the Just wrappers from the remaining ones. Writing it with list comprehensions is very straightforward:
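
A sketch of such a definition, written locally rather than imported from Data.Maybe (the argument name ms is arbitrary):

```haskell
-- The Prelude already provides:
--   data Maybe a = Nothing | Just a

catMaybes :: [Maybe a] -> [a]
catMaybes ms = [ x | Just x <- ms ]   -- elements that are Nothing fail the match and are skipped
```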

Another nice thing about using a list comprehension for this task is that if the pattern match fails (that is, it meets a Nothing) it just moves on to the next element in the list, thus avoiding the need to explicitly handle constructors we are not interested in with alternate function definitions.

do blocks
Within a do block like the ones we used in the Simple input and output chapter, we can pattern match with the left-hand side of the left arrow variable bindings:
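
A minimal sketch, using the hypothetical action name putFirstChar:

```haskell
-- Read a line and print its first character.
-- If the line is empty, the (c:_) pattern does not match and the
-- action fails with an exception at run time.
putFirstChar :: IO ()
putFirstChar = do
    (c:_) <- getLine
    putStrLn [c]
```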

Furthermore, the let bindings in do blocks are, as far as pattern matching is concerned, just the same as the "real" let expressions.
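
As an illustration, here is a sketch of a do block whose let binding has a tuple pattern on its left-hand side (divModDemo is a hypothetical name):

```haskell
divModDemo :: IO (Int, Int)
divModDemo = do
    let (q, r) = 17 `divMod` 5   -- q and r are bound to the quotient and remainder
    pure (q, r)
```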