User:Kearsley

Functional Programming
A function takes a number of arguments and returns a value based on manipulating those arguments. That's a very abstract definition for something absolutely fundamental to the style of programming encouraged by Clojure.

To look at it more concretely, a function takes some (or no) values, operates on them, then returns a value.

The simplest function possible in Clojure is:

(fn [])

It takes no values and returns nil as a result. Now, this function is not the same as the value of nil. Instead, it represents a process that gives nil as an answer.

Just typing (fn []) into the prompt, though, returns the function, but then throws it away. Were we using it within something else, we would write:

user=> (let [always-nil (fn [])]           (always-nil)) nil

What that does is say, first off, that within the opening and closing parentheses around the word let, the words always-nil refer to (fn []). That's known as binding: always-nil is bound to (fn []).

Then, in the second part, always-nil is called, meaning that whatever operations it is to perform are done, and then the value that is to be returned is placed where the function was called.

So, (always-nil) means "call the always-nil function, do whatever it is to do, then replace the statement (always-nil) with the value that the function returns.

Or, to take that step-by-step:

Because always-nil is the first item after the opening parenthesis, it is called.

(always-nil)

Clojure then looks up what the words always-nil mean to it, and, because this is within the parentheses around let, the words are bound to (fn []). So we substitute in the value to which always-nil is bound.

((fn []))

Next, the function is evaluated. As said before, this function takes no values and returns nil. Thus, nil is substituted in for the function:

nil

While that's the simplest function possible, it's kind of boring and silly. It doesn't do anything but say "no" to us. And, while "no" is an awfully fundamental word, it's not exactly the friendliest thing.

The koala sorter
(defn classify-koala [extinct really-big] (let [valid? (fn [answer] (some (appl eql? answer) (list :yes :no))) no? (appl eql? :no)] (cond (not (and (valid? extinct)                   (valid? really-big)))         :unknown          (no? extinct)         :koala          (no? really-big)         :phacolarctos-maris          true          :giant-koala)))

I'll admit that that example is equally silly, but it's a lot friendlier and cuter, because it deals with koalas. What it does, in brief, is take two arguments answering the questions "is this species of koala extinct?" and "is it really big?", then classifies the koala according to those.

Now, the first thing to look at is what defn means. Where let said that we were binding a value available inside the parentheses around that let, def says that we are defining a value accessable everywhere. defn is shorthand to say (def name (fn [arguments] ... )), or, "define this function everywhere." That's going to be useful, because I intend on classifying a lot of koalas.

Inside the let, we've defined two functions: valid? and no?. valid? asks, "is my argument either yes or no?" and no? asks, "is my argument no?"

Finally, we apply a series of conditions, first separating out the things that aren't koalas:

user=> (classify-koala :deciduous :leafy)
 * unknown

Then the koalas that aren't extinct:

user=> (classify-koala :no :yes)
 * koala

Then the extinct koalas that weren't all that big:

user=> (classify-koala :yes :no)
 * phacolarctos-maris

And finally, the extinct koalas that were really really big:

user=> (classify-koala :yes :yes)
 * giant-koala

Of course, after all that, while the behaviour of this koala sorter may be apparent, how it actually gets from (classify-koala :yes :yes) to the value of :giant-koala involves thirteen functions working in harmony and making new functions in order to classify our koalas for us.

Luckily, as we saw above, the set of conditions to classify a koala is remarkably small. This is because we're working in a somewhat contrived example, rather than the real world. Later on, when this koala classifier moves from the calm shores of eucalyptowiki to the big, scary world of production, it's going to have to face up against a mad scientist with an army of robot wallabies, and the example will become really contrived instead of just somewhat.

Building our own koala sorter
Let's start, then, by actually making a koala sorter. However, let's not worry about being fancy with it yet. All we actually need are eql? and cond.

eql? takes a set of arguments and returns true if they're all equal, false if any of them are different.

cond takes a set of arguments, checks if the first one is true, and if so, returns the second one. If not, cond checks the third and, if it's true, returning the fourth one. It does that until it runs out of arguments.

So, from those two pieces, we can build the koala sorter, mark i:

(defn classify-koala [extinct really-big] (cond (eql? extinct :no) :koala (eql? really-big :no) :phacolarctos-maris true :giant-koala))

Which checks extinction, the koala being really big, then returns our koala. (true is used as the second-last argument because we want to always return the last argument And, testing it, it does indeed, classifying koalas until one day someone hooks it up to the tree sorter instead:

user=> (classify-koala :no :no)
 * koala

user=> (classify-koala :yes :no)
 * phacolarctos-maris

user=> (classify-koala :yes :yes)
 * giant-koala

user=> (classify-koala :evergreen :bush)
 * giant-koala

And we find out that the juniper bush that the tree sorter was pointed at is actually a Pleistocene marsupial. This causes a brief furor in the botanosophic community, and the scientist discovering the long-lost giant koala in the front garden of his monastery goes on to great acclaim as the discoverer of ... No, actually, he's defrocked and sent to the colonies for having published in the Byzantine Journal of Natural Philosophy based on a mistake.

So, we need to fix that bug before we're done. At the beginning of the cond statement, we'll check that the values of extinct and really-big are valid for koalas.

(defn classify-koala [extinct really-big] (cond (not (and (or (eql? extinct :yes)                     (eql? extinct :no)) (or (eql? really-big :yes)                     (eql? really-big :no)))) :unknown (eql? extinct :no) :koala (eql? really-big :no) :phacolarctos-maris true :giant-koala))

Which does the job, except that, well, the logic here has become hard to follow unless you already know how koalas are classified. Let's look at that first condition. Well, it seems that we're asking the same question over and over, so let's break it off into its own function:

(defn classify-koala [extinct really-big] (let [yesno? (fn [answer] (or (eql? answer :yes) (eql? answer :no)))] (cond (not (and (yesno? extinct)                   (yesno? really-big)))         :unknown          (eql? extinct :no)         :koala          (eql? really-big :no)         :phacolarctos-maris          true          :giant-koala)))

Better, but we're still asking (eql? something :no) all over the place. And, in fact, clojure has a function to take a function like eql? that takes several arguments and make it into a function that takes fewer arguments: appl.

So, if we write (appl eql? :no), that will return a function that asks if its arguments are equal to :no. It's almost the same as writing (fn [answer] (eql? :no answer)) except that, if it takes more than one argument, appl won't break, whereas that function that I wrote will.

So, let's use appl and add an explicit test for something being "no".

(defn classify-koala [extinct really-big] (let [yesno? (fn [answer] (or (eql? answer :yes) (eql? answer :no))) no? (appl eql? :no)] (cond (not (and (yesno? extinct)                   (yesno? really-big)))         :unknown          (no? extinct)         :koala          (no? really-big)         :phacolarctos-maris          true          :giant-koala)))

Better still.

Which brings up the question, is there a way to condense yesno? with appl? yesno? repeats the same question twice, but, unlike with no?, the part that remains the same is answer, not :no.

Well, appl will get us halfway there, but trying to build this with just appl will result in yesno? growing again with no real gain.

Well, let's look at it another way, is there anything that yesno? does that we don't need? Well, no. It checks if a value is :yes or :no. How about the other way around? What if we need to add :maybe later? Well, then we'd add a new line, check that the parentheses all match up...

Or, maybe we could just say "if the answer is in this list, it's a valid answer." After all, we're just using yesno? to check that we don't get natural philosophers defrocked for hooking up the wrong input to the koala classifier.

For that, we'll need some. It returns the first item in a list for which the function we give it returns true. And, with the functions we have, <tt>(appl eql? answer)</tt> should return true if the answer and the item are the same.

Also, may as well rename yesno? to valid? as it's now checking valid input, rather than strictly yes or no.

(defn classify-koala [extinct really-big] (let [valid? (fn [answer] (some (appl eql? answer) (:yes :no))) no? (appl eql? :no)] (cond (not (and (valid? extinct)                   (valid? really-big)))         :unknown          (no? extinct)         :koala          (no? really-big)         :phacolarctos-maris          true          :giant-koala)))

And that one loads fine, but fails to run. Why? Well, the first thing after a parenthesis is supposed to be the name of a function. And :yes is not the name of a function. So, the final step is to use the function list to say that <tt>(:yes :no)</tt> is actually a list of things, rather than a function called :yes being called on the argument :no.

(defn classify-koala [extinct really-big] (let [valid? (fn [answer] (some (appl eql? answer) (list :yes :no))) no? (appl eql? :no)] (cond (not (and (valid? extinct)                   (valid? really-big)))         :unknown          (no? extinct)         :koala          (no? really-big)         :phacolarctos-maris          true          :giant-koala)))

And that's the not-so-short story of how one builds a function that actually does something and uses things that Clojure is good at in order to keep the flow of the function as clean as possible.

All of the functions that we've been using (appl, and, cond, defn, eql?, fn, let, list, no?, not, some, valid? and, of course, classify-koala) meet the nebulous description at the top that a function is something that takes some values, manipulates them and returns a value.

What is functional programming?
Does this count as functional programming? Well, yes. Are we doing it deliberately? Not quite. Mostly we've applied functional methods because that's what Clojure encourages by language design.

Let's add in another constraint, though. A pure function should not change anything outside itself. That means that it takes arguments, returns a value, but nothing else changes.

Our functions have all been pure functions, but that's partially happenstance. They didn't exactly have anything outside of themselves to change, so they failed to alter it. On the other hand, nothing outside of classify-koala will change when it's called, so it counts as being a pure function.

Functional programming, though, is the process of using primarily pure functions to write programs. By not changing anything outside themselves, it means that these functions can be used as building blocks from which to build a program without having to worry that the situation outside the function has been properly set up.

Higher-order functions
One of the fundamental ways in which this is done is via higher-order functions. These are functions that take a function as an argument. In making the classify-koala function, we used several higher-order functions, specifically appl and some.

Let's look at valid? specifically. And how valid? differs from the some and appl inside.

(let [valid? (fn [answer] (some (appl eql? answer)                                 (list :yes :no)))]  (valid? :yes))

So, valid? takes answer as an argument and passes it into <tt>(appl eql? answer)</tt>. eql?, though, is a function. So appl is taking a function and an arbitrary argument and returning a function that performs the operations <tt>(fn [arguments] (eql? answer arguments))</tt>

So, appl is a function that takes a function as an argument and returns a modified version of that same function. Making it, by definition, higher-order.

But what about some? We passed it a call to appl, as well as a list. Well, since appl returns a function, it gives that function to some as an argument.

Finally, valid? itself takes a single argument, but that is not used later on as a function, so valid? is a regular function.

Well, this is all getting a little abstract, so let's look at what it is that <tt>(valid? extinct)</tt> does when it's called.

First, the value to which valid? is bound is looked up:

((fn [answer] (some (appl eql? answer)                   (list :yes :no)))  :yes)

Then, valid?'s arguments (in this case, answer) are bound to the value being passed to valid?:

(some (appl eql? :yes)     (list :yes :no))

Next, valid? is evaluated. some is the first thing inside valid?, but appl is a function and the first thing inside some, so that becomes the first thing evaluated.

appl evaluates to a function:

(some (fn [arguments] (eql? :yes arguments))     (list :yes :no))

Then some calls the function that was passed to it as an argument on each of the items in the list, passing the item as the argument.

(or (eql? :yes :yes)   (eql? :yes :no))

Then each of the eql?s are evaluated in order, with or returning true if any of them are true.

(or true   (eql? :yes :no))

Finally, because there is a true found, that true is the result of the evaluation.

true

So, that was rather a lot of math to say nothing that we hadn't already done. We used higher-order functions earlier to simplify logic and the flow in our classify-koala function. However, those same things could have been written without using higher-order functions to differentiate between koalas. Why are they useful, then?

Mapping and filtering
Well, the main purpose for higher-order functions in a functional language is when you want to do something repeatedly. You define the function once and call it on all the things for which you want to calculate.

I'd said that the mad scientist would make an appearance sooner or later. Well, Horace Smythe, a renegade Australian natural philosopher wants to send us a list of koalas he's seen, in order to get them classified. We've agreed that he'll send the list as a pair of yes/no values, indicating what he knows about these koalas.

((:yes :yes) (:yes :no) (:yes :yes) (:no :no) (:yes :yes) (:yes :yes) (:no :no) (:no :no) (:yes :yes) (:yes :no))

So the first tape comes in, and all we have to do is enter each of these pairs of koala values into classify-koala and send him back the list before the Byzantine authorities find out that we're helping a natural philosopher who plays by his own rules.

Except that Smythe apparently has a backlog of koala sightings and he keeps sending us more and more koalas as little lists. The time taken to classify these is pulling us away from working on the much more difficult task of classifying Finnish lichen. What we need is a way to take a list of values, run classify-koala on each value, and get a list of what these koalas were.

That's where the map function comes in. map takes a function and a list, then calls that function on each element of the list, returning a list of the return values of that function.

So, we use map and classify-koala over Smythe's list and get:

user=> (map classify-koala           '((:yes :yes) (:yes :no) (:yes :yes) (:no :no) (:yes :yes) (:yes :yes) (:no :no) (:no :no) (:yes :yes) (:yes :no))) java.lang.IllegalArgumentException: Wrong number of args passed

An error. Should have known that it wouldn't have been that easy. Let's look at why it happened. classify-koala takes two arguments: extinct and really-big. map, on the other hand, passes <tt>(:yes :yes)</tt> then <tt>(:yes :no)</tt>.

In other words, we need to extract the data before passing it on to classify-koala. While we could change classify-koala to reflect a pair of yes/no answers, a better solution would be to write an additional function that takes the list of koalas and runs the map on it itself.

(defn classify-koala-list [koala-list] (map classify-koala koala-list))

So, that's what we had before, only extracted into its own function. The next step, then, is to actually unpack the data.

(defn classify-koala-list [koala-list] (map (fn [koala] (classify-koala (first koala) (second koala))) koala-list))

Better. This takes our koala list, unpacks each item and feeds it to classify-koala. And trying it out, we get:

user=> (classify-koala-list '((:yes :yes) (:yes :no) (:yes :yes) (:no :no) (:yes :yes) (:yes :yes) (:no :no) (:no :no) (:yes :yes) (:yes :no))) (:giant-koala :phacolarctos_maris :giant-koala :koala :giant-koala :giant-koala :koala :koala :giant-koala :phacolarctos_maris)