Prolog/Sorting

Sorting is a fundamental task in many programs. We explain how to write an implementation of the fast and general merge sort algorithm in Prolog.

The basic algorithm
The merge sort algorithm looks as follows:

function merge_sort(m) if length(m) ≤ 1 return m    else left, right = split_in_half(m) left = merge_sort(left) right = merge_sort(right) result = merge(left, right) return result

The first thing we have to do to implement any algorithm presented in this style in Prolog is to translate it from the language of functions to the language of predicates. Recall that Prolog has no return construct, but we can use an unbound variable as an output mechanism. So, we will write a predicate mergesort(Xs, S) that, given a list Xs, "returns" a sorted variant by binding S to it.

How about the if-then-else construct in the algorithm? We can translate that directly, but we also have the option of using Prolog's pattern matching. Because the conditional checks for ≤1, we have three structural cases: the empty list, a list of one element and the catch-all case. So, let's write our first merge sort.

% the empty list must be "returned" as-is mergesort([], []). % same for single element lists mergesort([X], [X]). % and now for the catch-all: mergesort([X|Xs], S) :- length(Xs, Len), 0 < Len, split_in_half([X|Xs], Ys, Zs), mergesort(Ys, SY), mergesort(Zs, SZ), merge(SY, SZ, S).

There's the skeleton of the algorithm. We have left out the definition of the helper predicates split_in_half and merge so far, but for a functioning sorting predicate, we must of course define those as well. Let's start with merge, which takes two sorted lists and produces a new sorted list containing all the elements in them. We use Prolog's recursion and pattern matching to implement merge.

% First two cases: merging any list Xs with an empty list yields Xs merge([], Xs, Xs). merge(Xs, [], Xs). % Other cases: the @=< predicate compares terms by the "standard order" merge([X|Xs], [Y|Ys], [X|S]) :- X @=< Y,    merge(Xs, [Y|Ys], S). merge([X|Xs], [Y|Ys], [Y|S]) :- Y @=< X,    merge([X|Xs], Ys, S).

This predicate works, but it's repetitive and not very efficient. We'll revisit it in a minute, but first we must define <tt>split_in_half</tt>, which splits a list into two lists of roughly equal size (roughly because it may have an odd number of elements). To do so, we need the length of a list, which we unfortunately do not get as input (see <tt>merge_sort</tt>), so we need to compute that. We use a helper predicate <tt>split_at</tt> to actually split the list after computing the length.

split_in_half(Xs, Ys, Zs) :- length(Xs, Len), Half is Len // 2,   % // denotes integer division, rounding down split_at(Xs, Half, Ys, Zs).

Exercise. Before continuing to the definition of <tt>split_at</tt>, try to implement it yourself. Use recursion and case matching on both the input list and the second, length, argument.

Okay, now for <tt>split_at</tt>:

% split_at(Xs, N, Ys, Zs) divides Xs into a list Ys of length N % and a list Zs containing the part after the first N. split_at(Xs, N, Ys, Zs) :- length(Ys, N), append(Ys, Zs, Xs).

If the definition of <tt>split_at</tt> seems magical, then try to read it declaratively: "a list <tt>Xs</tt>, split after its first <tt>N</tt> elements into <tt>Ys</tt> and <tt>Zs</tt>, means that <tt>Ys</tt> has length <tt>N</tt> and <tt>Ys</tt> and <tt>Zs</tt> can be appended to retrieve <tt>Xs</tt>." The fact that this works is an example of Prolog's "bidirectional" power: many predicates can be run in "reverse order" to get the inverse of a computation.

Cleaning up
We now have a merge sort program that will sort any list, but some parts of it aren't particularly elegant. Let's first revisit <tt>merge_sort</tt>. We translated an if-then-else construct to multiple clauses since that's so common in Prolog, but we really had no reason to do so: the algorithm is deterministic, so there should no backtracking going on. Let's try a more direct rewrite of the pseudocode using Prolog's own if-then-else.

mergesort(Xs, S) :- length(Xs, Len), (Len =< 2 ->        S = Xs     ;         split_in_half(Xs, Ys, Zs),         mergesort(Ys, SY),         mergesort(Zs, SZ),         merge(SY, SZ, S)     ).

Now, let's tackle <tt>merge</tt>'s efficiency and readability, as promised. The last two cases of this predicate look like a copy-paste, which is always a code smell. Also, there may be unnecessary backtracking going on. Let's rewrite <tt>merge</tt> using an if-then-else as well:

% First two cases: merging any list Xs with an empty list yields Xs merge([], Xs, Xs). merge(Xs, [], Xs). % Other cases: the @=< predicate compares terms by the "standard order" merge(Xs, Ys, S) :- Xs = [X|Xs0], Ys = [Y|Ys0], (X @=< Y ->        S = [X|S0],         merge(Xs0, Ys, S0)     ;         S = [Y|S0],         merge(Xs, Ys0, S0)     ).