How to Think Like a Computer Scientist: Learning with Python 2nd Edition/Recursion and exceptions

= Recursion and exceptions =

== Tuples and mutability ==
So far, you have seen two compound types: strings, which are made up of characters; and lists, which are made up of elements of any type. One of the differences we noted is that the elements of a list can be modified, but the characters in a string cannot. In other words, strings are immutable and lists are mutable.

A tuple, like a list, is a sequence of items of any type. Unlike lists, however, tuples are immutable. Syntactically, a tuple is a comma-separated sequence of values:

Although it is not necessary, it is conventional to enclose tuples in parentheses:

To create a tuple with a single element, we have to include the final comma:

Without the comma, Python treats (5) as an integer in parentheses:
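The syntax rules above can be sketched as follows. The variable names and values are made up for illustration.

```python
# A tuple is a comma-separated sequence of values; parentheses are optional.
plain = 2, 4, 6, 8
wrapped = (2, 4, 6, 8)
print(plain == wrapped)             # both spellings build equal tuples

# A one-element tuple needs the trailing comma.
single = (5,)
not_a_tuple = (5)                   # just the integer 5 in parentheses

print(type(single).__name__)        # tuple
print(type(not_a_tuple).__name__)   # int
```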

Syntax issues aside, tuples support the same sequence operations as strings and lists. The index operator selects an element from a tuple.

And the slice operator selects a range of elements.

But if we try to use item assignment to modify one of the elements of the tuple, we get an error:

Of course, even if we can't modify the elements of a tuple, we can replace it with a different tuple:

Alternatively, we could first convert it to a list, modify it, and convert it back into a tuple:
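As a minimal sketch of the operations just described, assuming a sample tuple named <tt>letters</tt> (the names and values are illustrative):

```python
letters = ('a', 'b', 'c', 'd', 'e')

first = letters[0]        # index operator: 'a'
middle = letters[1:3]     # slice operator: ('b', 'c')

# Item assignment is not supported on tuples, so this raises TypeError.
try:
    letters[0] = 'A'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Replace the whole tuple with a new one built from the old one.
replaced = ('A',) + letters[1:]

# Or go through a list: convert, modify, convert back.
as_list = list(letters)
as_list[0] = 'A'
converted = tuple(as_list)
```

In both cases <tt>letters</tt> itself is never changed; a new tuple is created each time.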

== Tuple assignment ==
Once in a while, it is useful to swap the values of two variables. With conventional assignment statements, we have to use a temporary variable. For example, to swap a and b:

If we have to do this often, this approach becomes cumbersome. Python provides a form of tuple assignment that solves this problem neatly:

The left side is a tuple of variables; the right side is a tuple of values. Each value is assigned to its respective variable. All the expressions on the right side are evaluated before any of the assignments. This feature makes tuple assignment quite versatile.

Naturally, the number of variables on the left and the number of values on the right have to be the same:
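The three cases in this section can be sketched together. The starting values and the names <tt>after_temp_swap</tt>, <tt>after_tuple_swap</tt>, and <tt>mismatch_error</tt> are assumptions used only to record what happens.

```python
a, b = 1, 2

# Conventional swap with a temporary variable.
temp = a
a = b
b = temp
after_temp_swap = (a, b)      # (2, 1)

# Tuple assignment: the right side is evaluated first, then unpacked.
a, b = b, a
after_tuple_swap = (a, b)     # (1, 2)

# The number of names on the left must match the number of values.
try:
    x, y = 1, 2, 3
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
```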

== Tuples as return values ==
Functions can return tuples as return values. For example, we could write a function that swaps two parameters:

Then we can assign the return value to a tuple with two variables:

In this case, there is no great advantage in making swap a function. In fact, there is a danger in trying to encapsulate swap, which is the following tempting mistake:

If we call this function like this:

then <tt>a</tt> and <tt>x</tt> are aliases for the same value. Changing <tt>x</tt> inside <tt>swap</tt> makes <tt>x</tt> refer to a different value, but it has no effect on <tt>a</tt> in <tt>__main__</tt>. Similarly, changing <tt>y</tt> has no effect on <tt>b</tt>.

This function runs without producing an error message, but it doesn't do what we intended. This is an example of a semantic error.
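Both versions can be sketched side by side. The name <tt>broken_swap</tt> is an assumption made here so that the two functions can coexist; the sample values are illustrative.

```python
def swap(x, y):
    """Return the two arguments in reversed order as a tuple."""
    return y, x

def broken_swap(x, y):
    """Rebind the local names only; the caller's variables are untouched."""
    x, y = y, x

a, b = 1, 2
a, b = swap(a, b)
after_swap = (a, b)            # (2, 1)

broken_swap(a, b)
after_broken_swap = (a, b)     # still (2, 1)
```

The second call runs without complaint but leaves <tt>a</tt> and <tt>b</tt> exactly as they were, which is what makes this a semantic error rather than a runtime error.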

== Pure functions and modifiers revisited ==
In :ref:`pure-func-mod` we discussed pure functions and modifiers as related to lists. Since tuples are immutable, we cannot write modifiers for them.

Here is a modifier that inserts a new value into the middle of a list:

We can run it to see that it works:

If we try to use it with a tuple, however, we get an error:
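A sketch of such a modifier and of the failure with a tuple, assuming the function relies on slice assignment; the sample values are illustrative:

```python
def insert_in_middle(val, lst):
    """Modifier: insert val at the midpoint of lst, changing lst in place."""
    middle = len(lst) // 2
    lst[middle:middle] = [val]    # slice assignment mutates the list

my_list = ['a', 'b', 'd', 'e']
insert_in_middle('c', my_list)
# my_list is now ['a', 'b', 'c', 'd', 'e']

my_tuple = ('a', 'b', 'd', 'e')
try:
    insert_in_middle('c', my_tuple)
    tuple_error = None
except TypeError as err:
    tuple_error = err             # tuples reject the slice assignment
```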

The problem is that tuples are immutable, and don't support slice assignment. A simple solution to this problem is to make <tt>insert_in_middle</tt> a pure function:

This version now works for tuples, but not for lists or strings. If we want a version that works for all sequence types, we need a way to encapsulate our value into the correct sequence type. A small helper function does the trick:

Now we can write <tt>insert_in_middle</tt> to work with each of the built-in sequence types:

The last two versions of <tt>insert_in_middle</tt> are pure functions. They don't have any side effects. Adding <tt>encapsulate</tt> and the last version of <tt>insert_in_middle</tt> to the <tt>seqtools.py</tt> module, we can test it:

The values of <tt>my_string</tt>, <tt>my_list</tt>, and <tt>my_tuple</tt> are not changed. If we want to use <tt>insert_in_middle</tt> to change them, we have to assign the value returned by the function call back to the variable:
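One way the helper and the general version might look is sketched below. It is written for Python 3, so it uses <tt>isinstance</tt> and integer division (<tt>//</tt>); the sample values are assumptions.

```python
def encapsulate(val, seq):
    """Wrap val in the same sequence type as seq."""
    if isinstance(seq, str):
        return str(val)
    if isinstance(seq, list):
        return [val]
    return (val,)

def insert_in_middle(val, seq):
    """Pure function: return a new sequence with val inserted at the midpoint."""
    middle = len(seq) // 2
    return seq[:middle] + encapsulate(val, seq) + seq[middle:]

my_string = 'abde'
my_list = ['a', 'b', 'd', 'e']
my_tuple = ('a', 'b', 'd', 'e')

new_string = insert_in_middle('c', my_string)   # 'abcde'
new_list = insert_in_middle('c', my_list)       # ['a', 'b', 'c', 'd', 'e']
new_tuple = insert_in_middle('c', my_tuple)     # ('a', 'b', 'c', 'd', 'e')

# The originals are untouched; rebind the name to keep the result.
my_string = insert_in_middle('c', my_string)
```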

== Recursive data structures ==
All of the Python data types we have seen can be grouped inside lists and tuples in a variety of ways. Lists and tuples can also be nested, providing myriad possibilities for organizing data. The organization of data for the purpose of making it easier to use is called a data structure.

It's election time and we are helping to compute the votes as they come in. Votes arriving from individual wards, precincts, municipalities, counties, and states are sometimes reported as a sum total of votes and sometimes as a list of subtotals of votes. After considering how best to store the tallies, we decide to use a nested number list, which we define as follows:

A nested number list is a list whose elements are either:

<ol style="list-style-type: lower-alpha;"> <li>numbers</li> <li>nested number lists</li></ol>

Notice that the term nested number list is used in its own definition. Recursive definitions like this are quite common in mathematics and computer science. They provide a concise and powerful way to describe recursive data structures that are partially composed of smaller and simpler instances of themselves. The definition is not circular, since at some point we will reach a list that does not have any lists as elements.

Now suppose our job is to write a function that will sum all of the values in a nested number list. Python has a built-in function which finds the sum of a sequence of numbers:

For our nested number list, however, <tt>sum</tt> will not work:

The problem is that the third element of this list, <tt>[11, 13]</tt>, is itself a list, which cannot be added to <tt>1</tt>, <tt>2</tt>, and <tt>8</tt>.
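A small sketch of both behaviours, assuming the nested list is <tt>[1, 2, [11, 13], 8]</tt> as the description above suggests (the variable names are illustrative):

```python
flat = [1, 2, 8]
flat_total = sum(flat)            # 11

nested = [1, 2, [11, 13], 8]
try:
    sum(nested)
    nested_error = None
except TypeError as err:
    nested_error = err            # an int and a list cannot be added
```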

== Recursion ==
To sum all the numbers in our recursive nested number list we need to traverse the list, visiting each of the elements within its nested structure, adding any numeric elements to our sum, and repeating this process with any elements which are lists.

Modern programming languages generally support recursion, which means that functions can call themselves within their definitions. Thanks to recursion, the Python code needed to sum the values of a nested number list is surprisingly short:

The body of <tt>recursive_sum</tt> consists mainly of a <tt>for</tt> loop that traverses <tt>nested_num_list</tt>. If <tt>element</tt> is a numerical value (the <tt>else</tt> branch), it is simply added to <tt>sum</tt>. If <tt>element</tt> is a list, then <tt>recursive_sum</tt> is called again, with the element as an argument. The statement inside the function definition in which the function calls itself is known as the recursive call.
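A function matching this description might be sketched as follows. One deliberate difference is an assumption of this sketch: the accumulator is named <tt>total</tt> rather than <tt>sum</tt>, so that it does not shadow the built-in function.

```python
def recursive_sum(nested_num_list):
    """Return the sum of all numbers in a nested number list."""
    total = 0
    for element in nested_num_list:
        if isinstance(element, list):
            total += recursive_sum(element)   # the recursive call
        else:
            total += element                  # base case: a plain number
    return total

result = recursive_sum([1, 2, [11, 13], 8])   # 35
```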

Recursion is truly one of the most beautiful and elegant tools in computer science.

A slightly more complicated problem is finding the largest value in our nested number list:

Doctests are included to provide examples of <tt>recursive_max</tt> at work.

The added twist to this problem is finding a numerical value for initializing <tt>largest</tt>. We can't just use <tt>nested_num_list[0]</tt>, since that may be either a number or a list. To solve this problem we use a while loop that assigns to <tt>largest</tt> the first numerical value, no matter how deeply it is nested.
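The approach just described can be sketched like this. The sketch assumes that neither the list nor any of its sublists is empty; the name <tt>candidate</tt> is illustrative.

```python
def recursive_max(nested_num_list):
    """
    Return the largest number in a non-empty nested number list.

      >>> recursive_max([2, 9, [1, 13], 8, 6])
      13
      >>> recursive_max([[[13, 7], 90], 2, [1, 100], 8, 6])
      100
    """
    # Descend through first elements until we reach a number.
    largest = nested_num_list[0]
    while isinstance(largest, list):
        largest = largest[0]

    for element in nested_num_list:
        if isinstance(element, list):
            candidate = recursive_max(element)   # recursive call
        else:
            candidate = element                  # base case: a number
        if candidate > largest:
            largest = candidate
    return largest
```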

The two examples above each have a base case which does not lead to a recursive call: the case where the element is a number and not a list. Without a base case, you have infinite recursion, and your program will not work. Python stops after reaching a maximum recursion depth and raises a runtime error.

Write the following in a file named <tt>infinite_recursion.py</tt>:

At the unix command prompt in the same directory in which you saved your program, type the following:

<tt>python infinite_recursion.py</tt>

After watching the messages flash by, you will be presented with the end of a long traceback that ends with the following:

We would certainly never want something like this to happen to a user of one of our programs, so before finishing the recursion discussion, let's see how errors like this are handled in Python.

== Exceptions ==
Whenever a runtime error occurs, it creates an exception. The program stops running at this point and Python prints out the traceback, which ends with the exception that occurred.

For example, dividing by zero creates an exception:

So does accessing a nonexistent list item:

Or trying to make an item assignment on a tuple:

In each case, the error message on the last line has two parts: the type of error before the colon, and specifics about the error after the colon.
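The two parts of each message can be seen by catching the exceptions and inspecting them. This is only a sketch: the helper <tt>describe</tt> and the three small functions are hypothetical names, and the exact wording of each message depends on the Python version.

```python
def describe(action):
    """Run action and return (exception type name, message) if it fails."""
    try:
        action()
    except Exception as err:
        return type(err).__name__, str(err)
    return None

def divide_by_zero():
    return 55 / 0

def bad_index():
    a = []
    return a[5]

def assign_to_tuple():
    tup = ('a', 'b', 'd', 'd')
    tup[2] = 'c'

# Print each failure in the same "type: details" shape as a traceback's last line.
for action in (divide_by_zero, bad_index, assign_to_tuple):
    name, message = describe(action)
    print(name + ": " + message)
```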

Sometimes we want to execute an operation that might cause an exception, but we don't want the program to stop. We can handle the exception using the <tt>try</tt> and <tt>except</tt> statements.

For example, we might prompt the user for the name of a file and then try to open it. If the file doesn't exist, we don't want the program to crash; we want to handle the exception:

The <tt>try</tt> statement executes the statements in the first block. If no exceptions occur, it ignores the <tt>except</tt> statement. If any exception occurs, it executes the statements in the <tt>except</tt> branch and then continues.

We can encapsulate this capability in a function: <tt>exists</tt> takes a filename and returns true if the file exists, false if it doesn't:
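A minimal sketch of such a function follows, written for Python 3. As a design choice of this sketch, it catches <tt>OSError</tt> (the family of errors raised when a file cannot be opened) rather than every possible exception. The temporary directory and file names exist only to exercise the function.

```python
import os
import tempfile

def exists(filename):
    """Return True if filename can be opened for reading, False otherwise."""
    try:
        f = open(filename)
    except OSError:
        return False
    f.close()
    return True

# Try it on one file that exists and one that does not.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")
    with open(path, "w") as sample:
        sample.write("hello")
    found = exists(path)
    missing = exists(os.path.join(folder, "no_such_file.txt"))
```

Because the error is handled inside <tt>exists</tt>, the caller sees only a boolean and the program keeps running.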

You can use multiple <tt>except</tt> blocks to handle different kinds of exceptions (see the Errors and Exceptions lesson from Python creator Guido van Rossum's Python Tutorial for a more complete discussion of exceptions).

If your program detects an error condition, you can make it raise an exception. Here is an example that gets input from the user and checks that the number is non-negative.

The <tt>raise</tt> statement takes two arguments: the exception type, and specific information about the error. <tt>ValueError</tt> is the built-in exception which most closely matches the kind of error we want to raise. The complete listing of built-in exceptions is found in section 2.3 of the Python Library Reference, again by Python's creator, Guido van Rossum.

If the function that called <tt>get_age</tt> handles the error, then the program can continue; otherwise, Python prints the traceback and exits:

The error message includes the exception type and the additional information you provided.
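Raising and handling can be sketched as follows. The sketch uses the Python 3 spelling, in which the message is passed to the exception type as <tt>ValueError(message)</tt>. The helper <tt>check_age</tt> and the message text are assumptions, introduced so that the check can be tried without typing input.

```python
def check_age(age):
    """Return age unchanged, or raise ValueError if it is negative."""
    if age < 0:
        raise ValueError("%s is not a valid age" % age)
    return age

def get_age():
    """Prompt for an age and validate it (not called here: it needs a user)."""
    return check_age(int(input("Please enter your age: ")))

# A caller that handles the error lets the program continue.
try:
    check_age(-2)
    outcome = "accepted"
except ValueError as err:
    outcome = "rejected: %s" % err
```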

Using exception handling, we can now modify <tt>infinite_recursion.py</tt> so that it stops when it reaches the maximum recursion depth allowed:

Run this version and observe the results.
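One possible shape for the modified program is sketched below, assuming Python 3, where the error raised is <tt>RecursionError</tt> (a subclass of <tt>RuntimeError</tt>). To keep the output short, this sketch records the depth instead of printing it at every call; the names <tt>deepest</tt> and <tt>stopped_by_limit</tt> are illustrative.

```python
import sys

deepest = [0]   # one-element list so the function can update it

def recursion_depth(number):
    """Recurse with no base case, recording how deep the calls have gone."""
    deepest[0] = number
    recursion_depth(number + 1)

try:
    recursion_depth(0)
    stopped_by_limit = False
except RecursionError:
    # Reached only after Python refuses to go any deeper.
    stopped_by_limit = True

print("Stopped near depth", deepest[0], "with limit", sys.getrecursionlimit())
```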

== Tail recursion ==
When a recursive call occurs as the last line of a function definition, it is referred to as tail recursion.

Here is a version of the <tt>countdown</tt> function from chapter 6 written using tail recursion:

Any computation that can be made using iteration can also be made using recursion.

Several well-known mathematical functions are defined recursively. Factorial, for example, is given the special operator, <tt>!</tt>, and is defined by:

0! = 1

n! = n(n-1)!

We can easily code this into Python:
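A direct transcription of the definition might look like this (a sketch that assumes <tt>n</tt> is a non-negative integer):

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:
        return 1                      # base case: 0! = 1
    return n * factorial(n - 1)       # recursive case: n! = n * (n-1)!

print(factorial(5))   # 120
```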

Another well-known recursive relation in mathematics is the Fibonacci sequence, which is defined by:

fibonacci(0) = 1

fibonacci(1) = 1

fibonacci(n) = fibonacci(n-1) + fibonacci(n-2)

This can also be written easily in Python:
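Following the definition as given here, with both starting values equal to 1, a sketch could be:

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, with fibonacci(0) == fibonacci(1) == 1."""
    if n == 0 or n == 1:
        return 1                                  # the two base cases
    return fibonacci(n - 1) + fibonacci(n - 2)    # two recursive calls

print(fibonacci(6))   # 13
```

Each call above the base cases makes two further calls, which is why the running time grows so quickly as <tt>n</tt> increases.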

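Both <tt>factorial</tt> and <tt>fibonacci</tt> place their recursive calls on the last line, in the sense used above. Strictly speaking, a call is a tail call only when nothing remains to be done after it returns; here the result of the call is still multiplied or added, so neither function is tail recursive in that stricter sense.

Recursion of this kind is considered bad practice in languages like Python, however, since Python does not optimize tail calls: every recursive call adds a new stack frame, so the recursive version uses more system resources than the equivalent iterative solution.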

Calling <tt>factorial(1000)</tt> will exceed the maximum recursion depth. And try running <tt>fibonacci(35)</tt> and see how long it takes to complete (be patient, it will complete).

You will be asked to write an iterative version of <tt>factorial</tt> as an exercise, and we will see a better way to handle <tt>fibonacci</tt> in the next chapter.

== List comprehensions ==
A list comprehension is a syntactic construct that enables lists to be created from other lists using a compact, mathematical syntax:

The general syntax for a list comprehension expression is:

This list expression has the same effect as:

As you can see, the list comprehension is much more compact.
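As a sketch of the comparison, the common form <tt>[expression for item in sequence if condition]</tt> is shown next to the loop it replaces. The sample data and variable names are assumptions.

```python
numbers = [1, 2, 3, 4]

squares = [x ** 2 for x in numbers]                    # [1, 4, 9, 16]
big_squares = [x ** 2 for x in numbers if x ** 2 > 8]  # [9, 16]

# The same result as big_squares, written as an explicit loop.
loop_result = []
for x in numbers:
    if x ** 2 > 8:
        loop_result.append(x ** 2)

# Several for clauses nest from left to right, like nested loops.
pairs = [(n, letter) for n in [1, 2] for letter in 'ab']
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
```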

== Mini case study: tree ==
The following program implements a subset of the behavior of the Unix <tt>tree</tt> program.

You will be asked to explore this program in several of the exercises below.
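As a rough sketch of the idea (a simplified stand-in, not the full program), a recursive directory traversal might look like this. The function name <tt>tree_lines</tt>, the indentation style, and the sample directory contents are all assumptions.

```python
import os
import tempfile

def tree_lines(path, prefix=""):
    """Return one line per entry under path, indented by nesting depth."""
    lines = []
    # Skip hidden entries and sort for a stable listing.
    names = sorted(name for name in os.listdir(path) if not name.startswith("."))
    for name in names:
        lines.append(prefix + "|-- " + name)
        full_path = os.path.join(path, name)
        if os.path.isdir(full_path):
            # Recursive call: a subdirectory is itself a tree.
            lines.extend(tree_lines(full_path, prefix + "|   "))
    return lines

# Build a tiny directory tree to try it on.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "docs"))
    open(os.path.join(root, "docs", "notes.txt"), "w").close()
    open(os.path.join(root, "readme.txt"), "w").close()
    listing = tree_lines(root)

print("\n".join(listing))
```

A directory tree fits the recursive pattern from earlier in the chapter: each entry is either a plain file (the base case) or a directory that contains more entries.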

== Exercises ==
<dl> <dt>#.</dt> <dd> Run this program and describe the results. Use the results to explain why this version of <tt>swap</tt> does not work as intended. What will be the values of <tt>a</tt> and <tt>b</tt> after the call to <tt>swap</tt>? </dd> <dt>#.</dt> <dd> Create a module named <tt>seqtools.py</tt>. Add the functions <tt>encapsulate</tt> and <tt>insert_in_middle</tt> from the chapter. Add doctests which test that these two functions work as intended with all three sequence types. </dd></dl>

<ol> <li> Add each of the following functions to <tt>seqtools.py</tt>:

As usual, work on each of these one at a time until they pass all of the doctests.</li> <li> Write a function, <tt>recursive_min</tt>, that returns the smallest value in a nested number list:

Your function should pass the doctests.</li> <li> Write a function <tt>recursive_count</tt> that returns the number of occurrences of <tt>target</tt> in <tt>nested_number_list</tt>:

As usual, your function should pass the doctests.</li> <li> Write a function <tt>flatten</tt> that returns a simple list of numbers containing all the values in a <tt>nested_number_list</tt>:

Run your function to confirm that the doctests pass.</li> <li> Write a function named <tt>readposint</tt> that prompts the user for a positive integer and then checks the input to confirm that it meets the requirements. A sample session might look like this:

Use Python's exception handling mechanisms in confirming that the user's input is valid.</li> <li> Give the Python interpreter's response to each of the following: <dl> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd></dl>

You should anticipate the results before you try them in the interpreter.</li> <li>Use either <tt>pydoc</tt> or the on-line documentation at [http://pydoc.org http://pydoc.org] to find out what <tt>sys.getrecursionlimit()</tt> and <tt>sys.setrecursionlimit(n)</tt> do. Create several experiments similar to the one in <tt>infinite_recursion.py</tt> to test your understanding of how these module functions work.</li> <li>Rewrite the <tt>factorial</tt> function using iteration instead of recursion. Call your new function with 1000 as an argument and make note of how fast it returns a value.</li> <li> Write a program named <tt>litter.py</tt> that creates an empty file named <tt>trash.txt</tt> in each subdirectory of a directory tree given the root of the tree as an argument (or the current directory as a default). Now write a program named <tt>cleanup.py</tt> that removes all these files. Hint: Use the <tt>tree</tt> program from the mini case study as a basis for these two recursive programs. </li></ol>