Introduction to Programming Languages/Functional Data Structures

Most data structures used to implement Abstract Data Types such as queues, stacks and sequences were designed with an imperative mindset. They usually assume data is mutable and random access to memory is fast.

In functional languages such as ML and Haskell, random access is not the rule, but the exception. These languages discourage mutability and random access and some forbid it altogether. Nevertheless, we can still implement most of the more popular data-structures in these languages, keeping the same complexity bounds that we are likely to find in imperative implementations. We shall see some examples of such implementations in the rest of this section.

Stacks
A stack is an abstract data type which holds a collection of items. It is capable of executing two operations: push and pop. Push is responsible for adding a new item to the stack and pop retrieves the last item inserted, if there is one. Stacks are a type of LIFO data structure, an acronym which stands for Last-In, First-Out.

Stacks can be implemented in many different ways. In python, for instance, we can use the builtin lists to do it as follows:

Since both append and pop are $$ O(1) $$ in python, our stack implementation is capable of conducting both operations in constant time .

The ML implementation is as follows:

There are a few differences between this implementation and its python counterpart that are worth noticing. First of all, observe how pattern matching is used on both push and pop to obtain the contents of the stack.

Both implementations chose to store the contents of the stack within a list. Lists are built into both python and ML. However, they are used in different ways in these languages. In ML, lists are Algebraic Data Types backed by a little syntax sugar. They can be pattern matched on and have an special operator denoted by :: which we call cons. The cons operator allows us to extract the first element of a list (also called its head) and to append a new element to an existing list. The empty list is denoted by nil.

Another important difference appears on the implementation of pop. SML discourages the usage of mutable data. To work around that, we have to return both the popped item and the stack that is obtained after the operation is carried out. This is a pattern common to most functional implementations of data structures. When using our stack, we need to keep track of the most recent version. For instance, in an ML prompt we could write:

Since both cons and pattern matching are performed in constant time, push and pop in our ML implementation are also $$ O(1) $$. As we will see later on this chapter, this is not always the case. It is not uncommon for functional implementations of abstract data types to be slower by a factor of $$ O(\log n) $$ or more than their imperative counterparts.

Queues
Another common data structure is the queue. Just like stacks, queues are containers that support two operations: push and pop. However, while stacks provide Last-In First-Out ( LIFO ) access, queues are First-In First-Out ( FIFO ) structures. This acronym means that the first item inserted on the queue with push is also the first item to be returned by pop. Once more, we will begin our discussion with a simple python implementation.

Just like our stack example, this first python implementation uses a list to store the enqueued items. However, the time complexities are much worse than in our first example. While push remains $$ O(1) $$, pop is now $$ O(n) $$, because .pop(0) is also $$ O(n) $$. 

We can improve upon our first attempt to implement queues in at least two ways. The first way is to roll out our own list implementation instead of using python's built in lists. In doing that, we can include links to both the previous and the next element, effectively implementing a data structure called Doubly linked list. We present the code for this new implementation bellow:

This code has much better asymptotic behavior than the previous implementation. Thanks to the doubly linked list, both push and pop are now $$ O(1) $$. However, the program became a lot more complex due to the many pointer manipulations that now are required. To push a new item, for instance, we have to manipulate 4 references that connect nodes. That can be quite hard to get right.

A better approach is to use two python built-in lists instead of one. We call these lists push_list and pop_list. The first, push_list, stores elements when they arrive. The list pop_list is where elements are popped from. When pop_list is empty, we simply populate it moving the elements from push_list.

This code has the desired asymptotic properties: both push and pop are $$ O(1) $$. To see why that is the case, consider a sequence of $$n$$ push and $$n$$ pop operations. Each push is $$ O(1) $$, as it simply appends the elements to a python list. As we have seen before, the append operation is $$ O(1) $$ in python lists. The operation pop, on the other hand, might be more expensive because items sometimes have to be moved. However, each pushed item is moved at most once. So the while loop inside pop cannot iterate for more than $$n$$ times in total when $$n$$ items are popped. This means that, while some pop operations might be a little more expensive, each pop will run in what is called amortized constant time Amortized analysis.

The trick of using two python lists to get $$ O(1) $$ complexity is useful beyond python. To see why that is the case, notice that both push_list and pop_list are being used as stacks. All we do is to append items to their front and remove items from their back. As we have seen before, stacks are easy to implement in ML.

But before diving into the ML implementation, we need to make a small digression. It so happens that in order to implement an efficient queue in ML, we need to be able to reverse a list in $$ O(n) $$ time. On a first attempt, one might try to use the following code to do that:

That code, however, takes quadratic time. The problem is the @ operator used to merge the two lists when reversing. It takes time which is linear on the size of the first list. Since @ is used once per element and there are n elements, the complexity becomes $$ O(n^2) $$

We can work around this shortcoming introducing a second parameter, which stores the reversed list while this list is constructed.

We might wrap the function above in a call that initializes the extra parameter, to give it a nicer interface:

Now that we can reverse lists in $$ O(n) $$, push and pop can be implemented within the same complexity bounds as we have seen in python. The code is as follows:

Binary Trees
Binary search trees are among the most important data structures in computer science. They can store sets of elements in an ordered way and, if we are lucky, conduct the following operations on these sets in O(log n):


 * Insert a new element
 * Search an element
 * Find the minimum element

By being lucky, we mean that there are sequences of operations that will get these complexities down to O(n). We will show how to deal with them in the next session. It is also worth noticing that this is merely a subset of what trees can do. However, some operations such as deletion can become rather complex and we will not be covering them here.

Trees are made of nodes. Each node contains at least tree things: its data and two subtrees which we will call left and right. Left contains elements strictly smaller than the data stored on the node, whereas right contains elements which are greater. Both left and right are trees themselves, which leads to the following recursive definition in ML:

As usual, we begin with a python implementation of the three operations: insert, search and find minimum. Just like the datatype above, the python implementation starts with a description of a tree.

This code is actually much more functional than the examples we have seen before. Notice how the tree operations are implemented recursively. In fact, they can be translated to ML rather easily. The equivalent ML code is shown below.

Notice how the two implementations are quite similar. This happens because algorithms that manipulate trees have a simple recursive definition, and recursive algorithms are a natural fit for functional programming languages.

There is, however, one important difference. Like what happened on the stack implementation, a new tree has to be returned after an insertion. It is the responsibility of the caller to keep track of that. Below, we present an example of how the tree above could be used from the ML prompt.

Tries
The tree implementation discussed on the previous section has a serious drawback. For certains sequences of insertions, it can become too deep, which in turn increases the lookup and insertion times to $$ O(n) $$. To see that, consider this sequence of insertions:

Although the ML interpreter stops printing the complete tree after a few insertions, the pattern is clear. By inserting nodes in order, the tree effectively becomes a list. Each node has only one child subtree, which is the one at its left.

There are many approaches to solve this problem. One popular approach is to use a balanced binary tree such as a Red-Black tree. Balanced trees incorporate some changes to the insertion algorithm to make sure insertions do not make the tree too deep. However, these changes can be complex.

Another approach is to incorporate an extra key to each node. It can be shown that if these extra keys are chosen randomly and we are able to write the insertion procedure in such a way that the extra key of a node is always bigger than the extra keys of its children, then the resulting tree will likely be shallow. This is known as Treap. Treaps are much easier to implement than balanced trees. However, simpler alternatives still exist.

A third approach is to explore particular properties of the keys. This is the approach that we will follow in this section. Specifically, we will assume that the items stored on nodes are fixed size integers. Given this assumption, we will use each bit in the key's binary representation to position it on the tree. This yields a data structure called Trie. Tries are much simpler to maintain, because they do not require balancing.

This time, we will begin with the ML implementation. A Trie is simply a tree with two kinds of nodes: internal nodes and leafs. We can declare it as follows:

Each node corresponds to a bit in the binary representation of the keys. The position of the bit is given by the depth of the node. For instance, the root node corresponds to the leftmost bit and its children correspond to the second bit from left to right. Keys that begin with 0 are recursively stored in the left subtree. We call this left subtree low. Keys that have 1 on the leftmost bit are stored to the right and we call that subtree high. The special value Empty is used to indicate that there is nothing on a particular subtree. Leaf indicates that we have reached the last bit of a key and that it is indeed present.

Since ML has no builtin bitwise operations, we begin by defining two functions to recover a bit in the binary representation of a number and to set it to one. Notice that these operations are linear on the number of bits that a key might have. In a language with support to bitwise operations, such as C or Java, all these operations could be implemented to run in O(1).

The code of the trie follows. It implements the same operations of the binary tree with a fundamental advantage: as the bits are assigned to nodes based on how deep they are on the tree and keys are usually small, the tree can never become too deep. In fact, its depth never exceeds $$ O(\log n) $$ where $$ n $$ is the maximum value any key can have. Thanks to that property, the operations described are also $$ O(\log n) $$. In typical hardware, $$\log n$$ never exceeds 64.

The three functions work by transversing the trie while keeping track of the bit they are currently dealing with. To make things easier, we can use the following aliases assuming that keys are no longer than 10 bits.

Using the trie implementation from the command prompt is just as easy as using a binary tree.