Write Yourself a Scheme in 48 Hours/Evaluation, Part 2

Additional Primitives: Partial Application
Now that we can deal with type errors, bad arguments, and so on, we'll flesh out our primitive list so that it does something more than calculate. We'll add boolean operators, conditionals, and some basic string operations.

Start by adding the following into the list of primitives:

These depend on helper functions that we haven't written yet: numBoolBinop, boolBoolBinop, and strBoolBinop. Instead of taking a variable number of arguments and returning an integer, these all take exactly two arguments and return a boolean. They differ from each other only in the type of argument they expect, so let's factor the duplication into a generic boolBinop function that's parameterized by the unpacker function it applies to its arguments:
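A minimal sketch of what such a generic helper could look like. The LispVal, LispError, and ThrowsError definitions here are simplified stand-ins for the ones from earlier parts of this tutorial (their exact shape is an assumption), unpackNum is reduced to the Number case, and Left stands in for throwError so that the sketch needs nothing beyond the Prelude. It also matches on a two-element list directly rather than checking the length first:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

-- Assumption: errors are carried in Either, so Left plays the role of throwError.
type ThrowsError = Either LispError

-- Reduced numeric unpacker: only the Number case is handled in this sketch.
unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum notNum     = Left (TypeMismatch "number" notNum)

-- Generic binary boolean operator, parameterized by the unpacker.
boolBinop :: (LispVal -> ThrowsError a) -> (a -> a -> Bool)
          -> [LispVal] -> ThrowsError LispVal
boolBinop unpacker op [l, r] = do
  left  <- unpacker l            -- either unpack may fail with a TypeMismatch
  right <- unpacker r
  return $ Bool (left `op` right)
boolBinop _ _ args = Left (NumArgs 2 args)
```

With these definitions, boolBinop unpackNum (<) [Number 2, Number 3] evaluates to Right (Bool True), and passing a single argument yields a NumArgs error.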

Because each argument may throw a type mismatch, we have to unpack them sequentially, in a do-block (for the Error monad). We then apply the operation to the two arguments and wrap the result in the Bool constructor. Any function can be turned into an infix operator by wrapping it in backticks (`op`).

Also, take a look at the type signature. boolBinop takes two functions as its first two arguments: the first is used to unpack the arguments from LispVals to native Haskell types, and the second is the actual operation to perform. By parameterizing different parts of the behavior, you make the function more reusable.

Now we define three functions that specialize boolBinop with different unpackers:

We haven't told Haskell how to unpack strings from LispVals yet. This works similarly to unpackNum, pattern matching against the value and either returning it or throwing an error. Again, if passed a primitive value that could be interpreted as a string (such as a number or boolean), it will silently convert it to the string representation.

And we use similar code to unpack booleans:
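The pieces described in this section can be sketched together as follows. As before, the LispVal, LispError, and ThrowsError definitions are simplified stand-ins for the tutorial's earlier ones, unpackNum handles only the Number case, and Left replaces throwError so that only the Prelude is needed. The primitives table here contains just the new entries:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

boolBinop :: (LispVal -> ThrowsError a) -> (a -> a -> Bool)
          -> [LispVal] -> ThrowsError LispVal
boolBinop unpacker op [l, r] = do
  left  <- unpacker l
  right <- unpacker r
  return $ Bool (left `op` right)
boolBinop _ _ args = Left (NumArgs 2 args)

-- Reduced numeric unpacker (Number case only in this sketch).
unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum notNum     = Left (TypeMismatch "number" notNum)

-- Strings unpack directly; numbers and booleans are converted with show.
unpackStr :: LispVal -> ThrowsError String
unpackStr (String s) = return s
unpackStr (Number n) = return (show n)
unpackStr (Bool b)   = return (show b)
unpackStr notString  = Left (TypeMismatch "string" notString)

unpackBool :: LispVal -> ThrowsError Bool
unpackBool (Bool b) = return b
unpackBool notBool  = Left (TypeMismatch "boolean" notBool)

-- The three specializations differ only in the unpacker they supply.
numBoolBinop :: (Integer -> Integer -> Bool) -> [LispVal] -> ThrowsError LispVal
numBoolBinop = boolBinop unpackNum

strBoolBinop :: (String -> String -> Bool) -> [LispVal] -> ThrowsError LispVal
strBoolBinop = boolBinop unpackStr

boolBoolBinop :: (Bool -> Bool -> Bool) -> [LispVal] -> ThrowsError LispVal
boolBoolBinop = boolBinop unpackBool

-- Only the entries added in this section; the arithmetic ones are omitted.
primitives :: [(String, [LispVal] -> ThrowsError LispVal)]
primitives =
  [ ("=", numBoolBinop (==)),  ("<", numBoolBinop (<))
  , (">", numBoolBinop (>)),   ("/=", numBoolBinop (/=))
  , (">=", numBoolBinop (>=)), ("<=", numBoolBinop (<=))
  , ("&&", boolBoolBinop (&&)), ("||", boolBoolBinop (||))
  , ("string=?", strBoolBinop (==)),  ("string<?", strBoolBinop (<))
  , ("string>?", strBoolBinop (>)),   ("string<=?", strBoolBinop (<=))
  , ("string>=?", strBoolBinop (>=))
  ]
```

Looking up "string<?" in this table and applying it to [String "abc", String "bba"] gives Right (Bool True).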

Let's compile and test this to make sure it's working, before we proceed to the next feature:

$ ghc -package parsec -o simple_parser listing6.1.hs
$ ./simple_parser "(< 2 3)"
#t
$ ./simple_parser "(> 2 3)"
#f
$ ./simple_parser "(>= 3 3)"
#t
$ ./simple_parser "(string=? \"test\" \"test\")"
#t
$ ./simple_parser "(string<? \"abc\" \"bba\")"
#t

Conditionals: Pattern Matching 2
Now we'll add an if-clause to our evaluator. As with standard Scheme, our evaluator considers #f to be false and any other value to be true:
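A minimal sketch of such a clause inside a cut-down evaluator. Everything around the if clause is an assumption made for illustration: the types are simplified stand-ins for the tutorial's, the primitives table holds only two comparison operators, numCompare is a hypothetical helper, and Left replaces throwError:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               | BadSpecialForm String LispVal
               | NotFunction String String
               deriving (Show, Eq)

type ThrowsError = Either LispError

eval :: LispVal -> ThrowsError LispVal
eval val@(String _) = return val
eval val@(Number _) = return val
eval val@(Bool _)   = return val
eval (List [Atom "quote", val]) = return val
-- The if clause: only #f counts as false, every other value is true.
eval (List [Atom "if", cond, conseq, alt]) = do
  result <- eval cond
  case result of
    Bool False -> eval alt
    _          -> eval conseq
-- General function application; this must come after the if clause.
eval (List (Atom func : args)) = mapM eval args >>= apply func
eval badForm = Left (BadSpecialForm "Unrecognized special form" badForm)

apply :: String -> [LispVal] -> ThrowsError LispVal
apply func args =
  maybe (Left (NotFunction "Unrecognized primitive function args" func))
        ($ args)
        (lookup func primitives)

-- Tiny primitives table, just enough to drive the examples.
primitives :: [(String, [LispVal] -> ThrowsError LispVal)]
primitives = [(">", numCompare (>)), ("=", numCompare (==))]

-- Hypothetical helper for this sketch: compares two Numbers.
numCompare :: (Integer -> Integer -> Bool) -> [LispVal] -> ThrowsError LispVal
numCompare op [Number a, Number b] = return (Bool (a `op` b))
numCompare _  [Number _, bad]      = Left (TypeMismatch "number" bad)
numCompare _  [bad, _]             = Left (TypeMismatch "number" bad)
numCompare _  args                 = Left (NumArgs 2 args)
```

Evaluating the parsed form of (if (> 2 3) "no" "yes") with this sketch returns Right (String "yes"), and a non-boolean condition such as the number 0 selects the consequent.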

As the function definitions are evaluated in order, be sure to place this one above the general function-application clause of eval, or it will throw an unrecognized-primitive-function error for "if".

This is another example of nested pattern-matching. Here, we're looking for a 4-element list. The first element must be the atom if. The others can be any Scheme forms. We take the second element (the predicate) and evaluate it; if it's false, we evaluate the alternative. Otherwise, we evaluate the consequent.

Compile and run this, and you'll be able to play around with conditionals:

$ ghc -package parsec -o simple_parser listing6.2.hs
$ ./simple_parser "(if (> 2 3) \"no\" \"yes\")"
"yes"
$ ./simple_parser "(if (= 3 3) (+ 2 3 (- 5 1)) \"unequal\")"
9

List Primitives: car, cdr, and cons
For good measure, let's also add in the basic list-handling primitives. Because we've chosen to represent our lists as Haskell algebraic data types instead of pairs, these are somewhat more complicated than their definitions in many Lisps. It's easiest to think of them in terms of their effect on printed S-expressions:


 * (car '(a b c)) = a
 * (car '(a)) = a
 * (car '(a b . c)) = a
 * (car 'a) = error – not a list
 * (car 'a 'b) = error – car only takes one argument

We can translate these fairly straightforwardly into pattern clauses, recalling that (x : xs) divides a list into the first element and the rest:
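One way those cases could be written as pattern clauses is sketched below. The types are simplified stand-ins for the tutorial's earlier definitions (an assumption), and Left replaces throwError so the sketch compiles with the Prelude alone:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

car :: [LispVal] -> ThrowsError LispVal
car [List (x : _)]         = return x                           -- proper list
car [DottedList (x : _) _] = return x                           -- improper list
car [badArg]               = Left (TypeMismatch "pair" badArg)  -- not a list
car badArgList             = Left (NumArgs 1 badArgList)        -- wrong argument count
```

Note that under these clauses an empty list falls through to the TypeMismatch case, since List [] does not match (x : _).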

Let's do the same with cdr:


 * (cdr '(a b c)) = (b c)
 * (cdr '(a b)) = (b)
 * (cdr '(a)) = NIL
 * (cdr '(a . b)) = b
 * (cdr '(a b . c)) = (b . c)
 * (cdr 'a) = error – not a list
 * (cdr 'a 'b) = error – too many arguments

We can represent the first three cases with a single clause. Our parser represents '() as List [], and when you pattern-match (x : xs) against [x], xs is bound to []. The other ones translate to separate clauses:
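A sketch of how these clauses might look, under the same assumptions as elsewhere in this section: the types are simplified stand-ins for the tutorial's definitions, and Left takes the place of throwError:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

cdr :: [LispVal] -> ThrowsError LispVal
cdr [List (_ : xs)]         = return (List xs)          -- covers the first three cases
cdr [DottedList [_] x]      = return x                  -- (a . b) leaves just the tail
cdr [DottedList (_ : xs) x] = return (DottedList xs x)  -- (a b . c) stays improper
cdr [badArg]                = Left (TypeMismatch "pair" badArg)
cdr badArgList              = Left (NumArgs 1 badArgList)
```

The order of the two DottedList clauses matters: the single-element pattern has to come first, because (_ : xs) would also match a one-element list and produce a DottedList with an empty head.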

cons is a little tricky, enough that we should go through each clause case-by-case. If you cons together anything with Nil, you end up with a one-item list, the Nil serving as a terminator:

If you cons together anything and a list, it's like tacking that anything onto the front of the list:

However, if the list is a DottedList, then it should stay a DottedList, taking into account the improper tail:

If you cons together two non-lists, or put a list in front, you get a DottedList. This is because such a cons cell isn't terminated by the normal Nil that most lists are.

Finally, attempting to cons together more or fewer than two arguments is an error:
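Putting the five cases together, cons could be sketched like this. The types are simplified stand-ins for the tutorial's earlier definitions (an assumption), and Left replaces throwError:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

cons :: [LispVal] -> ThrowsError LispVal
cons [x1, List []]            = return (List [x1])                 -- anything onto Nil
cons [x, List xs]             = return (List (x : xs))             -- prepend to a list
cons [x, DottedList xs xlast] = return (DottedList (x : xs) xlast) -- keep improper tail
cons [x1, x2]                 = return (DottedList [x1] x2)        -- two non-lists
cons badArgList               = Left (NumArgs 2 badArgList)        -- wrong argument count
```

The first clause is strictly a special case of the second, since x : [] is [x]; it is kept here only to mirror the case-by-case walkthrough.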

Our last step is to implement eqv?. Scheme offers three levels of equivalence predicates: eq?, eqv?, and equal?. For our purposes, eq? and eqv? are basically the same: they recognize two items as the same if they print the same, and are fairly slow. So we can write one function for both of them and register it under eq? and eqv?.
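A sketch of what such a shared function could look like. The types are simplified stand-ins for the tutorial's definitions (an assumption), Left replaces throwError, and the extra Right _ branch in eqvPair is added here only to keep the case expression total:

```haskell
-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

eqv :: [LispVal] -> ThrowsError LispVal
eqv [Bool a, Bool b]     = return $ Bool (a == b)
eqv [Number a, Number b] = return $ Bool (a == b)
eqv [String a, String b] = return $ Bool (a == b)
eqv [Atom a, Atom b]     = return $ Bool (a == b)
-- Compare dotted lists by folding the tail into an ordinary list.
eqv [DottedList xs x, DottedList ys y] =
  eqv [List (xs ++ [x]), List (ys ++ [y])]
-- Lists are equal when they have the same length and agree element-wise.
eqv [List xs, List ys] =
  return $ Bool (length xs == length ys && all eqvPair (zip xs ys))
  where
    eqvPair (x1, x2) = case eqv [x1, x2] of
      Left _           -> False   -- cannot happen: eqv only fails on arity
      Right (Bool val) -> val
      Right _          -> False   -- added for totality in this sketch
eqv [_, _]     = return $ Bool False   -- differing type tags
eqv badArgList = Left (NumArgs 2 badArgList)
```

Here eqv [Number 2, String "2"] is Right (Bool False) because the two values carry different type tags.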

Most of these clauses are self-explanatory, the exception being the one for two Lists. This, after checking to make sure the lists are the same length, zips the two lists into a list of pairs, and then uses the function all to return False if eqvPair returns False on any of the pairs. eqvPair is an example of a local definition: it is defined using the where keyword, just like a normal function, but is available only within that particular clause of eqv. Since we know that eqv only throws an error if the number of arguments is not 2, the line Left err -> False will never be executed at the moment.

equal? and Weak Typing: Heterogeneous Lists
Since we introduced weak typing above, we'd also like to introduce an equal? function that ignores differences in the type tags and only tests if two values can be interpreted the same. For example, (eqv? 2 "2") = #f, yet we'd like (equal? 2 "2") = #t. Basically, we want to try all of our unpack functions, and if any of them result in Haskell values that are equal, return #t.

The obvious way to approach this is to store the unpacking functions in a list and use mapM to execute them in turn. Unfortunately, this doesn't work, because standard Haskell only lets you put objects in a list if they're the same type. The various unpacker functions return different types, so you can't store them in the same list.

We'll get around this by using a GHC extension – Existential Types – that lets us create a heterogeneous list, subject to typeclass constraints. Extensions are fairly common in the Haskell world: they're basically necessary to create any reasonably large program, and they're often compatible between implementations (existential types work in both Hugs and GHC and are a candidate for standardization). Note that you need to enable the extension explicitly: use the compiler flag -fglasgow-exts, as mentioned below; use the newer flag -XExistentialQuantification; or add the pragma {-# LANGUAGE ExistentialQuantification #-} to the beginning of your code. (In general, the compiler flag -Xfoo can be replaced by the pragma {-# LANGUAGE foo #-} inside the source file.)

The first thing we need to do is define a data type that can hold any function from a LispVal to something, provided that that something supports equality:
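A minimal sketch of such a type, together with the heterogeneous list it makes possible. The LispVal and error types are simplified stand-ins for the tutorial's (an assumption), unpackNum includes the weak-typing String case, and accepts is a hypothetical helper added only to show that a wrapped function can still be applied:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum (String s) = case reads s of        -- weak typing: "2" reads as 2
  ((n, _) : _) -> return n
  []           -> Left (TypeMismatch "number" (String s))
unpackNum notNum     = Left (TypeMismatch "number" notNum)

unpackStr :: LispVal -> ThrowsError String
unpackStr (String s) = return s
unpackStr (Number n) = return (show n)
unpackStr (Bool b)   = return (show b)
unpackStr notString  = Left (TypeMismatch "string" notString)

unpackBool :: LispVal -> ThrowsError Bool
unpackBool (Bool b) = return b
unpackBool notBool  = Left (TypeMismatch "boolean" notBool)

-- Hides the result type a, keeping only the promise that it supports Eq.
data Unpacker = forall a. Eq a => AnyUnpacker (LispVal -> ThrowsError a)

-- Functions with different result types can now share one list.
unpackers :: [Unpacker]
unpackers = [AnyUnpacker unpackNum, AnyUnpacker unpackStr, AnyUnpacker unpackBool]

-- Hypothetical helper: does the wrapped unpacker succeed on this value?
accepts :: Unpacker -> LispVal -> Bool
accepts (AnyUnpacker unpacker) val =
  either (const False) (const True) (unpacker val)
```

The hidden type can never leave the pattern match, which is why accepts returns a plain Bool rather than the unpacked value itself.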

This is like any normal algebraic datatype, except for the type constraint. It says, "For any type that is an instance of Eq, you can define an Unpacker that takes a function from LispVal to that type, and may throw an error". We'll have to wrap our functions with the AnyUnpacker constructor, but then we can create a list of Unpackers that does just what we want.

Rather than jump straight to the equal? function, let's first define a helper function that takes an Unpacker and then determines if two LispVals are equal when it unpacks them:
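A sketch of such a helper. The types and unpackers are simplified stand-ins for the tutorial's definitions (an assumption), and catchError is defined locally for Either so that the sketch needs only the Prelude; in the full program it would come from the error-handling library the tutorial already uses:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

-- Local stand-in for the library's catchError, specialized to Either.
catchError :: ThrowsError a -> (LispError -> ThrowsError a) -> ThrowsError a
catchError (Left err) handler = handler err
catchError ok         _       = ok

-- Reduced numeric unpacker (Number case only in this sketch).
unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum notNum     = Left (TypeMismatch "number" notNum)

unpackStr :: LispVal -> ThrowsError String
unpackStr (String s) = return s
unpackStr (Number n) = return (show n)
unpackStr (Bool b)   = return (show b)
unpackStr notString  = Left (TypeMismatch "string" notString)

data Unpacker = forall a. Eq a => AnyUnpacker (LispVal -> ThrowsError a)

-- Unpack both values with the same unpacker and compare the results;
-- any unpacking error is turned into False instead of being propagated.
unpackEquals :: LispVal -> LispVal -> Unpacker -> ThrowsError Bool
unpackEquals arg1 arg2 (AnyUnpacker unpacker) =
  comparison `catchError` const (return False)
  where
    comparison = do
      unpacked1 <- unpacker arg1
      unpacked2 <- unpacker arg2
      return (unpacked1 == unpacked2)
```

For example, unpackEquals (Number 2) (String "2") (AnyUnpacker unpackStr) is Right True, because both values unpack to the string "2".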

After pattern-matching to retrieve the actual function, we enter a do-block for the ThrowsError monad. This retrieves the Haskell values of the two LispVals, and then tests whether they're equal. If there is an error anywhere within the two unpackers, it returns False, using the const function because catchError expects a function to apply to the error value.

Finally, we can define equal? in terms of these helpers:
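The whole arrangement can be sketched in one self-contained module. As elsewhere, the types are simplified stand-ins for the tutorial's (an assumption), catchError is defined locally for Either, Left replaces throwError, and eqv is a condensed version without the dotted-list clause:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Simplified stand-ins for types defined earlier in the tutorial (assumed shapes).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

data LispError = NumArgs Integer [LispVal]
               | TypeMismatch String LispVal
               deriving (Show, Eq)

type ThrowsError = Either LispError

-- Local stand-in for the library's catchError, specialized to Either.
catchError :: ThrowsError a -> (LispError -> ThrowsError a) -> ThrowsError a
catchError (Left err) handler = handler err
catchError ok         _       = ok

unpackNum :: LispVal -> ThrowsError Integer
unpackNum (Number n) = return n
unpackNum (String s) = case reads s of        -- weak typing: "2" reads as 2
  ((n, _) : _) -> return n
  []           -> Left (TypeMismatch "number" (String s))
unpackNum notNum     = Left (TypeMismatch "number" notNum)

unpackStr :: LispVal -> ThrowsError String
unpackStr (String s) = return s
unpackStr (Number n) = return (show n)
unpackStr (Bool b)   = return (show b)
unpackStr notString  = Left (TypeMismatch "string" notString)

unpackBool :: LispVal -> ThrowsError Bool
unpackBool (Bool b) = return b
unpackBool notBool  = Left (TypeMismatch "boolean" notBool)

data Unpacker = forall a. Eq a => AnyUnpacker (LispVal -> ThrowsError a)

unpackEquals :: LispVal -> LispVal -> Unpacker -> ThrowsError Bool
unpackEquals arg1 arg2 (AnyUnpacker unpacker) =
  comparison `catchError` const (return False)
  where
    comparison = do
      unpacked1 <- unpacker arg1
      unpacked2 <- unpacker arg2
      return (unpacked1 == unpacked2)

-- Condensed eqv: strict comparison that respects type tags.
eqv :: [LispVal] -> ThrowsError LispVal
eqv [Bool a, Bool b]     = return $ Bool (a == b)
eqv [Number a, Number b] = return $ Bool (a == b)
eqv [String a, String b] = return $ Bool (a == b)
eqv [Atom a, Atom b]     = return $ Bool (a == b)
eqv [List xs, List ys]   =
  return $ Bool (length xs == length ys && all eqvPair (zip xs ys))
  where eqvPair (x1, x2) = eqv [x1, x2] == Right (Bool True)
eqv [_, _]     = return $ Bool False
eqv badArgList = Left (NumArgs 2 badArgList)

-- equal?: true if any unpacker makes the values agree, or if eqv? says so.
equal :: [LispVal] -> ThrowsError LispVal
equal [arg1, arg2] = do
  primitiveEquals <- or <$> mapM (unpackEquals arg1 arg2)
                       [AnyUnpacker unpackNum, AnyUnpacker unpackStr, AnyUnpacker unpackBool]
  eqvEquals <- eqv [arg1, arg2]
  return $ Bool (primitiveEquals || let (Bool x) = eqvEquals in x)
equal badArgList = Left (NumArgs 2 badArgList)
```

With this sketch, equal [Number 2, String "2"] is Right (Bool True) while eqv on the same arguments is Right (Bool False).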

The first action makes a heterogeneous list of [unpackNum, unpackStr, unpackBool], and then maps the partially applied (unpackEquals arg1 arg2) over it. This gives a list of booleans, so we use the Prelude function or to return true if any single one of them is true.

The second action tests the two arguments with eqv?. Since we want equal? to be looser than eqv?, it should return true whenever eqv? does so. This also lets us avoid handling cases like the list or dotted-list (though this introduces a bug; see exercise #2 in this section).

Finally, equal ors both of these values together and wraps the result in the Bool constructor, returning a LispVal. The let (Bool x) = eqvEquals in x is a quick way of extracting a value from an algebraic type: it pattern matches Bool x against the eqvEquals value, and then returns x. The result of a let-expression is the expression following the keyword in.

To use these functions, insert them into our primitives list:

To compile this code, you need to enable GHC extensions with -fglasgow-exts:

$ ghc -package parsec -fglasgow-exts -o parser listing6.4.hs
$ ./parser "(cdr '(a simple test))"
(simple test)
$ ./parser "(car (cdr '(a simple test)))"
simple
$ ./parser "(car '((this is) a test))"
(this is)
$ ./parser "(cons '(this is) 'test)"
((this is) . test)
$ ./parser "(cons '(this is) '())"
((this is))
$ ./parser "(eqv? 1 3)"
#f
$ ./parser "(eqv? 3 3)"
#t
$ ./parser "(eqv? 'atom 'atom)"
#t