How to Think Like a Computer Scientist: Learning with Python 2nd Edition/Dictionaries

= Dictionaries =

All of the compound data types we have studied in detail so far (strings, lists, and tuples) are sequence types, which use integers as indices to access the values they contain.

Dictionaries are a different kind of compound type. They are Python's built-in mapping type. They map keys, which can be any immutable type, to values, which can be any type, just like the values of a list or tuple.

As an example, we will create a dictionary to translate English words into Spanish. For this dictionary, the keys are strings.

One way to create a dictionary is to start with the empty dictionary and add key-value pairs. The empty dictionary is denoted {}:
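
A minimal sketch of this approach follows. The entry for <tt>'one'</tt> is an illustrative assumption; the pair <tt>'two'</tt>: <tt>'dos'</tt> matches the lookup shown later in this section.

```python
# Start with the empty dictionary, then add key-value pairs one at a time.
eng2sp = {}
eng2sp['one'] = 'uno'   # assumed entry, for illustration
eng2sp['two'] = 'dos'
```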

The first assignment creates a dictionary named eng2sp; the other assignments add new key-value pairs to the dictionary. We can print the current value of the dictionary in the usual way:

The key-value pairs of the dictionary are separated by commas. Each pair contains a key and a value separated by a colon.

The order of the pairs may not be what you expected. Python uses complex algorithms to determine where the key-value pairs are stored in a dictionary. For our purposes we can think of this ordering as unpredictable.

Another way to create a dictionary is to provide a list of key-value pairs using the same syntax as the previous output:

It doesn't matter what order we write the pairs. The values in a dictionary are accessed with keys, not with indices, so there is no need to care about ordering.
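
To illustrate, the sketch below writes the same pairs in two different orders and compares the results. The name <tt>same_pairs</tt> and the entry for <tt>'three'</tt> are assumptions made for this example.

```python
# Two dictionary literals with the same pairs written in different orders.
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
same_pairs = {'three': 'tres', 'one': 'uno', 'two': 'dos'}

# Equality compares the key-value pairs, not the order they were written in.
print(eng2sp == same_pairs)   # True
```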

Here is how we use a key to look up the corresponding value:

The key 'two' yields the value 'dos'.

Dictionary operations
The del statement removes a key-value pair from a dictionary. For example, the following dictionary contains the names of various fruits and the number of each fruit in stock:

If someone buys all of the pears, we can remove the entry from the dictionary:

Or if we're expecting more pears soon, we might just change the value associated with pears:

The len function also works on dictionaries; it returns the number of key-value pairs:
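
The operations in this section can be sketched together as follows. The fruit names other than pears and all of the stock counts are assumed values chosen for illustration.

```python
# Stock counts are illustrative values.
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

# Someone buys all of the pears: remove the entry entirely.
del inventory['pears']
print('pears' in inventory)   # False
print(len(inventory))         # 3

# Alternatively, keep the key and record that none are in stock.
inventory['pears'] = 0
print(len(inventory))         # 4
```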

Dictionary methods
Dictionaries have a number of useful built-in methods.

The keys method takes a dictionary and returns a list of its keys.

As we saw earlier with strings and lists, dictionary methods use dot notation, which specifies the name of the method to the right of the dot and the name of the object on which to apply the method immediately to the left of the dot. The parentheses indicate that this method takes no parameters.

A method call is called an invocation; in this case, we would say that we are invoking the keys method on the object <tt>eng2sp</tt>. As we will see in a few chapters when we talk about object-oriented programming, the object on which a method is invoked is actually the first argument to the method.

The <tt>values</tt> method is similar; it returns a list of the values in the dictionary:

The <tt>items</tt> method returns both, in the form of a list of tuples --- one for each key-value pair:

The <tt>has_key</tt> method takes a key as an argument and returns <tt>True</tt> if the key appears in the dictionary and <tt>False</tt> otherwise:

This method can be very useful, since looking up a nonexistent key in a dictionary causes a runtime error:
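
The sketch below exercises these methods on a small <tt>eng2sp</tt> dictionary whose contents are assumed. It is written so that it also runs on Python 3, where <tt>keys</tt>, <tt>values</tt>, and <tt>items</tt> return view objects rather than lists and <tt>has_key</tt> no longer exists; the <tt>in</tt> operator performs the same membership test in both versions.

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

# sorted() returns a list in a predictable order on any Python version.
keys = sorted(eng2sp.keys())
values = sorted(eng2sp.values())
items = sorted(eng2sp.items())
print(keys)     # ['one', 'three', 'two']
print(values)   # ['dos', 'tres', 'uno']
print(items)    # [('one', 'uno'), ('three', 'tres'), ('two', 'dos')]

# Membership test; equivalent to eng2sp.has_key('one') in Python 2.
print('one' in eng2sp)    # True
print('deux' in eng2sp)   # False

# Looking up a key that is not present raises KeyError at runtime.
try:
    eng2sp['dog']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)      # True
```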

Aliasing and copying
Because dictionaries are mutable, you need to be aware of aliasing. Whenever two variables refer to the same object, changes to one affect the other.

If you want to modify a dictionary and keep a copy of the original, use the <tt>copy</tt> method. For example, <tt>opposites</tt> is a dictionary that contains pairs of opposites:

<tt>alias</tt> and <tt>opposites</tt> refer to the same object; <tt>copy</tt> refers to a fresh copy of the same dictionary. If we modify <tt>alias</tt>, <tt>opposites</tt> is also changed:

If we modify <tt>copy</tt>, <tt>opposites</tt> is unchanged:
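
A minimal sketch of the difference between an alias and a copy, assuming three illustrative pairs in <tt>opposites</tt>:

```python
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}

alias = opposites          # a second name for the same dictionary object
copy = opposites.copy()    # a new dictionary holding the same pairs

# A change made through the alias is visible through opposites.
alias['right'] = 'left'
print(opposites['right'])  # left

# A change made to the copy leaves opposites untouched.
copy['right'] = 'privilege'
print(opposites['right'])  # left

print(alias is opposites)  # True
print(copy is opposites)   # False
```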

Sparse matrices
We previously used a list of lists to represent a matrix. That is a good choice for a matrix with mostly nonzero values, but consider a sparse matrix like this one:

The list representation contains a lot of zeros:

An alternative is to use a dictionary. For the keys, we can use tuples that contain the row and column numbers. Here is the dictionary representation of the same matrix:

We only need three key-value pairs, one for each nonzero element of the matrix. Each key is a tuple, and each value is an integer.

To access an element of the matrix, we could use the <tt>[]</tt> operator:

Notice that the syntax for the dictionary representation is not the same as the syntax for the nested list representation. Instead of two integer indices, we use one index, which is a tuple of integers.

There is one problem. If we specify an element that is zero, we get an error, because there is no entry in the dictionary with that key:

The <tt>get</tt> method solves this problem:

The first argument is the key; the second argument is the value <tt>get</tt> should return if the key is not in the dictionary:
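
As a concrete sketch, assume the sparse matrix is 5 by 5 with the nonzero entries 1, 2, and 3 at positions (0, 3), (2, 1), and (4, 3). These values, and the name <tt>matrix_as_lists</tt>, are chosen for illustration.

```python
# Nested-list representation: mostly zeros.
matrix_as_lists = [[0, 0, 0, 1, 0],
                   [0, 0, 0, 0, 0],
                   [0, 2, 0, 0, 0],
                   [0, 0, 0, 0, 0],
                   [0, 0, 0, 3, 0]]

# Dictionary representation: one (row, column) key per nonzero element.
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}

print(matrix_as_lists[0][3])    # 1  (two integer indices)
print(matrix[0, 3])             # 1  (one tuple index)

# get returns the stored value if the key exists, else the default.
print(matrix.get((0, 3), 0))    # 1
print(matrix.get((1, 3), 0))    # 0
```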

<tt>get</tt> definitely improves the semantics of accessing a sparse matrix. Shame about the syntax.

Hints
If you played around with the <tt>fibonacci</tt> function from the last chapter, you might have noticed that the bigger the argument you provide, the longer the function takes to run. Furthermore, the run time increases very quickly. On one of our machines, <tt>fibonacci(20)</tt> finishes instantly, <tt>fibonacci(30)</tt> takes about a second, and <tt>fibonacci(40)</tt> takes roughly forever.

To understand why, consider this call graph for <tt>fibonacci</tt> with <tt>n = 4</tt>:

A call graph shows a set of function frames, with lines connecting each frame to the frames of the functions it calls. At the top of the graph, <tt>fibonacci</tt> with <tt>n = 4</tt> calls <tt>fibonacci</tt> with <tt>n = 3</tt> and <tt>n = 2</tt>. In turn, <tt>fibonacci</tt> with <tt>n = 3</tt> calls <tt>fibonacci</tt> with <tt>n = 2</tt> and <tt>n = 1</tt>. And so on.

Count how many times <tt>fibonacci(0)</tt> and <tt>fibonacci(1)</tt> are called. This is an inefficient solution to the problem, and it gets far worse as the argument gets bigger.
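
One way to do the counting is to let a dictionary tally the calls. The sketch below assumes the version of <tt>fibonacci</tt> from the last chapter returns 1 for both 0 and 1; the dictionary <tt>calls</tt> is a hypothetical addition used only for this measurement.

```python
calls = {}   # maps each argument n to the number of times fibonacci(n) ran

def fibonacci(n):
    # Record this call before doing any work.
    calls[n] = calls.get(n, 0) + 1
    if n == 0 or n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)

fibonacci(4)
print(sorted(calls.items()))   # [(0, 2), (1, 3), (2, 2), (3, 1), (4, 1)]
```

Nine calls are made to compute a single value, and five of them are for the arguments 0 and 1.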

A good solution is to keep track of values that have already been computed by storing them in a dictionary. A previously computed value that is stored for later use is called a hint. Here is an implementation of <tt>fibonacci</tt> using hints:

The dictionary named <tt>previous</tt> keeps track of the Fibonacci numbers we already know. We start with only two pairs: 0 maps to 1; and 1 maps to 1.

Whenever <tt>fibonacci</tt> is called, it checks the dictionary to determine if it contains the result. If it's there, the function can return immediately without making any more recursive calls. If not, it has to compute the new value. The new value is added to the dictionary before the function returns.
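
A sketch of such an implementation, following the description above. The name <tt>new_value</tt> is an assumption, and the membership test uses the <tt>in</tt> operator so that the code runs on both Python 2 and Python 3.

```python
# Hints: Fibonacci numbers we already know.
previous = {0: 1, 1: 1}

def fibonacci(n):
    # Return immediately if the result has been computed before.
    if n in previous:
        return previous[n]
    # Otherwise compute it, store it as a hint, and return it.
    new_value = fibonacci(n - 1) + fibonacci(n - 2)
    previous[n] = new_value
    return new_value

print(fibonacci(100))
```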

Using this version of <tt>fibonacci</tt>, our machines can compute <tt>fibonacci(100)</tt> in an eyeblink.

The <tt>L</tt> at the end of the number indicates that it is a <tt>long</tt> integer.

Long integers
Python provides a type called <tt>long</tt> that can handle any size integer (limited only by the amount of memory you have on your computer).

There are three ways to create a <tt>long</tt> value. The first one is to compute an arithmetic expression too large to fit inside an <tt>int</tt>. We already saw this in the <tt>fibonacci(100)</tt> example above. Another way is to write an integer with a capital <tt>L</tt> at the end of your number:

The third is to call <tt>long</tt> with the value to be converted as an argument. <tt>long</tt>, just like <tt>int</tt> and <tt>float</tt>, can convert <tt>int</tt>s, <tt>float</tt>s, and even strings of digits to long integers:

Counting letters
In Chapter 7, we wrote a function that counted the number of occurrences of a letter in a string. A more general version of this problem is to form a histogram of the letters in the string, that is, how many times each letter appears.

Such a histogram might be useful for compressing a text file. Because different letters appear with different frequencies, we can compress a file by using shorter codes for common letters and longer codes for letters that appear less frequently.

Dictionaries provide an elegant way to generate a histogram:

We start with an empty dictionary. For each letter in the string, we find the current count (possibly zero) and increment it. At the end, the dictionary contains pairs of letters and their frequencies.
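
A minimal sketch of this loop, using the word "Mississippi" as an assumed sample string:

```python
letter_counts = {}
for letter in "Mississippi":
    # get supplies 0 the first time a letter is seen.
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(letter_counts['s'])   # 4
```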

It might be more appealing to display the histogram in alphabetical order. We can do that with the <tt>items</tt> and <tt>sort</tt> methods:
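
The sketch below starts from an assumed histogram of "Mississippi". It uses the built-in <tt>sorted</tt> function on the result of <tt>items</tt> instead of calling the list's <tt>sort</tt> method, because in Python 3 <tt>items</tt> no longer returns a list; the result is the same sorted list of pairs.

```python
letter_counts = {'M': 1, 's': 4, 'p': 2, 'i': 4}

# Tuples sort by their first element, so the pairs come out ordered by letter.
letter_items = sorted(letter_counts.items())
print(letter_items)   # [('M', 1), ('i', 4), ('p', 2), ('s', 4)]
```

Uppercase letters sort before lowercase ones, which is why <tt>'M'</tt> comes first.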

The game
In this case study we will write a version of the classic console-based game, robots.

Robots is a turn-based game in which the protagonist, you, are trying to stay alive while being chased by stupid but relentless robots. Each robot moves one square toward you each time you move. If they catch you, you are dead, but if they collide they die, leaving a pile of dead robot junk in their wake. If other robots collide with the piles of junk, they die.

The basic strategy is to position yourself so that the robots collide with each other and with piles of junk as they move toward you. To make the game playable, you also are given the ability to teleport to another location on the screen -- 3 times safely and randomly thereafter, so that you don't just get forced into a corner and lose every time.

Setting up the world, the player, and the main loop
Let's start with a program that places the player on the screen and has a function to move her around in response to keys pressed:

Programs like this one that involve interacting with the user through events such as key presses and mouse clicks are called event-driven programs.

The main event loop at this stage is simply:

The event handling is done inside the <tt>move_player</tt> function. <tt>update_when('key_pressed')</tt> waits until a key has been pressed before moving to the next statement. The multi-way branching statement then handles all the keys relevant to game play.

Pressing the escape key causes <tt>move_player</tt> to return <tt>True</tt>, making <tt>not finished</tt> false, thus exiting the main loop and ending the game. The 4, 7, 8, 9, 6, 3, 2, and 1 keys all cause the player to move in the appropriate direction, if she isn't blocked by the edge of a window.

Adding a robot
Now let's add a single robot that heads toward the player each time the player moves.

Add the following <tt>place_robot</tt> function between <tt>place_player</tt> and <tt>move_player</tt>:

Add <tt>move_robot</tt> immediately after <tt>move_player</tt>:

We need to pass both the robot and the player to this function so that it can compare their locations and move the robot toward the player.
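
The movement logic can be sketched on its own, leaving out the graphics calls. The sketch assumes that the robot and the player are dictionaries with <tt>'x'</tt> and <tt>'y'</tt> keys holding grid coordinates.

```python
def move_robot(robot, player):
    # Step one square toward the player along each axis.
    if robot['x'] < player['x']:
        robot['x'] += 1
    elif robot['x'] > player['x']:
        robot['x'] -= 1

    if robot['y'] < player['y']:
        robot['y'] += 1
    elif robot['y'] > player['y']:
        robot['y'] -= 1
    # A full version would also redraw the robot's shape here.

robot = {'x': 10, 'y': 3}
player = {'x': 7, 'y': 3}
move_robot(robot, player)
print(robot['x'], robot['y'])
```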

Now add the line <tt>robot = place_robot()</tt> in the main body of the program immediately after the line <tt>player = place_player()</tt>, and add the <tt>move_robot(robot, player)</tt> call inside the main loop immediately after <tt>finished = move_player(player)</tt>.

Checking for Collisions
We now have a robot that moves relentlessly toward our player, but once it catches her it just follows her around wherever she goes. What we want to happen is for the game to end as soon as the player is caught. The following function will determine if that has happened:
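
A sketch of such a function, assuming again that both the robot and the player are dictionaries with <tt>'x'</tt> and <tt>'y'</tt> keys:

```python
def collided(robot, player):
    # Two things collide when they occupy the same grid square.
    return player['x'] == robot['x'] and player['y'] == robot['y']

print(collided({'x': 4, 'y': 2}, {'x': 4, 'y': 2}))   # True
print(collided({'x': 4, 'y': 2}, {'x': 4, 'y': 3}))   # False
```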

Place this new function immediately below the <tt>move_player</tt> function. Now let's modify <tt>play_game</tt> to check for collisions:

We rename the variable <tt>finished</tt> to <tt>defeated</tt>, which is now set to the result of <tt>collided</tt>. The main loop runs as long as <tt>defeated</tt> is false. Pressing the escape key still ends the program, since we check for <tt>quit</tt> and break out of the main loop if it is true. Finally, we check for <tt>defeated</tt> immediately after the main loop and display an appropriate message if it is true.

Adding more robots
There are several things we could do next:


 * give the player the ability to teleport to another location to escape pursuit.
 * provide safe placement of the player so that she never starts on top of a robot.
 * add more robots.

Adding the ability to teleport to a random location is the easiest task, and it has been left to you to complete as an exercise.

How we provide safe placement of the player will depend on how we represent multiple robots, so it makes sense to tackle adding more robots first.

To add a second robot, we could just create another variable named something like <tt>robot2</tt> with another call to <tt>place_robot</tt>. The problem with this approach is that we will soon want lots of robots, and giving them all their own names will be cumbersome. A more elegant solution is to place all the robots in a list:

Now instead of calling <tt>place_robot</tt> in <tt>play_game</tt>, call <tt>place_robots</tt>, which returns a single list containing all the robots:

With more than one robot placed, we have to handle moving each one of them. We have already solved the problem of moving a single robot, however, so traversing the list and moving each one in turn does the trick:
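
The list-based approach can be sketched without graphics as follows. The grid size constants are assumptions, and each robot is reduced to a dictionary of coordinates.

```python
import random

GRID_WIDTH = 64    # assumed number of columns
GRID_HEIGHT = 48   # assumed number of rows

def place_robot():
    # Graphics omitted: a robot is just a pair of random grid coordinates.
    return {'x': random.randint(0, GRID_WIDTH - 1),
            'y': random.randint(0, GRID_HEIGHT - 1)}

def place_robots(numbots):
    # Build and return a single list containing all the robots.
    robots = []
    for i in range(numbots):
        robots.append(place_robot())
    return robots

def move_robot(robot, player):
    # Step one square toward the player along each axis.
    for axis in ('x', 'y'):
        if robot[axis] < player[axis]:
            robot[axis] += 1
        elif robot[axis] > player[axis]:
            robot[axis] -= 1

def move_robots(robots, player):
    # Traverse the list and move each robot in turn.
    for robot in robots:
        move_robot(robot, player)

robots = place_robots(2)
move_robots(robots, {'x': 32, 'y': 24})
print(len(robots))   # 2
```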

Add <tt>move_robots</tt> immediately after <tt>move_robot</tt>, and change <tt>play_game</tt> to call <tt>move_robots</tt> instead of <tt>move_robot</tt>.

We now need to check each robot to see if it has collided with the player:
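
A sketch of that check, with <tt>collided</tt> repeated so that the example is self-contained:

```python
def collided(robot, player):
    return player['x'] == robot['x'] and player['y'] == robot['y']

def check_collisions(robots, player):
    # The player is caught as soon as any one robot reaches her square.
    for robot in robots:
        if collided(robot, player):
            return True
    return False

player = {'x': 3, 'y': 3}
print(check_collisions([{'x': 0, 'y': 0}, {'x': 3, 'y': 3}], player))   # True
print(check_collisions([{'x': 0, 'y': 0}, {'x': 9, 'y': 9}], player))   # False
```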

Add <tt>check_collisions</tt> immediately after <tt>collided</tt> and change the line in <tt>play_game</tt> that sets <tt>defeated</tt> to call <tt>check_collisions</tt> instead of <tt>collided</tt>.

Finally, we need to loop over <tt>robots</tt> to remove each one in turn if <tt>defeated</tt> becomes true. Adding this has been left as an exercise.

Winning the game
The biggest problem left in our game is that there is no way to win. The robots are both relentless and indestructible. With careful maneuvering and a bit of luck teleporting, we can reach the point where it appears there is only one robot chasing the player (all the robots will actually just be on top of each other). This moving pile of robots will continue chasing our hapless player until it catches her, either by a bad move on our part or a teleport that lands the player directly on the robots.

When two robots collide they are supposed to die, leaving behind a pile of junk. A robot (or the player) is also supposed to die when it collides with a pile of junk. The logic for doing this is quite tricky. After the player and each of the robots have moved, we need to:


 * 1) Check whether the player has collided with a robot or a pile of junk. If so, set <tt>defeated</tt> to true and break out of the game loop.
 * 2) Check each robot in the <tt>robots</tt> list to see if it has collided with a pile of junk. If it has, discard the robot (remove it from the <tt>robots</tt> list).
 * 3) Check each of the remaining robots to see if they have collided with another robot. If they have, discard all the robots that have collided and place a pile of junk at the locations they occupied.
 * 4) Check if any robots remain. If not, end the game and mark the player the winner.

Let's take on each of these tasks in turn.

Adding <tt>junk</tt>
Most of this work will take place inside our <tt>check_collisions</tt> function. Let's start by modifying <tt>collided</tt>, changing the names of the parameters to reflect its more general use:

We now introduce a new empty list named <tt>junk</tt> immediately after the call to <tt>place_robots</tt>:

and modify <tt>check_collisions</tt> to incorporate the new list:

Be sure to modify the call to <tt>check_collisions</tt> (currently <tt>defeated = check_collisions(robots, player)</tt>) to include <tt>junk</tt> as a new argument.

Again, we need to fix the logic after <tt>if defeated:</tt> to remove the new <tt>junk</tt> from the screen before displaying the They got you! message:

Since at this point <tt>junk</tt> is always an empty list, we haven't changed the behavior of our program. To test whether our new logic is actually working, we could introduce a single junk pile and run our player into it, at which point the game should remove all items from the screen and display the ending message.

It will be helpful to modify our program temporarily to change the random placement of robots and player to predetermined locations for testing. We plan to use solid boxes to represent junk piles. We observe that placing a robot is very similar to placing a junk pile, and modify <tt>place_robot</tt> to do both:

Notice that <tt>x</tt> and <tt>y</tt> are now parameters, along with a new parameter that we will use to set <tt>filled</tt> to true for piles of junk.

Our program is now broken, since the call in <tt>place_robots</tt> to <tt>place_robot</tt> does not pass arguments for <tt>x</tt> and <tt>y</tt>. Fixing this and setting up the program for testing is left to you as an exercise.

Removing robots that hit junk
To remove robots that collide with piles of junk, we add a nested loop to <tt>check_collisions</tt> between each robot and each pile of junk. Our first attempt at this does not work:

Running this new code with the program as set up in exercise 11, we find a bug. It appears that the robots continue to pass through the pile of junk as before.

Actually, the bug is more subtle. Since we have two robots on top of each other, when the collision of the first one is detected and that robot is removed, we move the second robot into the first position in the list and it is missed by the next iteration. It is generally dangerous to modify a list while you are iterating over it. Doing so can introduce a host of difficult-to-find errors into your program.

The solution in this case is to loop over the <tt>robots</tt> list backwards, so that when we remove a robot from the list all the robots whose list indexes change as a result are robots we have already evaluated.

As usual, Python provides an elegant way to do this. The built-in function <tt>reversed</tt> provides for backward iteration over a sequence. Replacing:

with:

will make our program work the way we intended.
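
The difference between the two loops can be reproduced without graphics. In this sketch the <tt>'name'</tt> key and the helper <tt>make_robots</tt> are hypothetical additions that make it easy to see which robots survive.

```python
def collided(thing1, thing2):
    return thing1['x'] == thing2['x'] and thing1['y'] == thing2['y']

def make_robots():
    # Robots 'a' and 'b' both sit on the junk pile; 'c' is elsewhere.
    return [{'name': 'a', 'x': 5, 'y': 5},
            {'name': 'b', 'x': 5, 'y': 5},
            {'name': 'c', 'x': 1, 'y': 1}]

junk = [{'x': 5, 'y': 5}]

# Forward traversal: removing 'a' shifts 'b' into its slot, so 'b' is skipped.
robots = make_robots()
for robot in robots:
    for pile in junk:
        if collided(robot, pile):
            robots.remove(robot)
forward_survivors = [robot['name'] for robot in robots]
print(forward_survivors)    # ['b', 'c']

# Backward traversal: only robots already examined change position.
robots = make_robots()
for robot in reversed(robots):
    for pile in junk:
        if collided(robot, pile):
            robots.remove(robot)
backward_survivors = [robot['name'] for robot in robots]
print(backward_survivors)   # ['c']
```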

Turning robots into junk and enabling the player to win
We now want to check each robot to see if it has collided with any other robots. We will remove all robots that have collided, leaving a single pile of junk in their wake. If we reach a state where there are no more robots, the player wins.

Once again we have to be careful not to introduce bugs related to removing things from a list over which we are iterating.

Here is the plan:


 * 1) Check each robot in <tt>robots</tt> (an outer loop, traversing forward).
 * 2) Compare it with every robot that follows it (an inner loop, traversing backward).
 * 3) If the two robots have collided, add a piece of junk at their location, mark the first robot as junk, and remove the second one.
 * 4) Once all robots have been checked for collisions, traverse the robots list once again in reverse, removing all robots marked as junk.
 * 5) Check to see if any robots remain. If not, declare the player the winner.

Adding the following to <tt>check_collisions</tt> will accomplish most of what we need to do:

We make use of the <tt>enumerate</tt> function we saw in Chapter 9 to get both the index and value of each robot as we traverse forward. Then a reverse traversal of the slice of the remaining robots, <tt>reversed(robots[index+1:])</tt>, sets up the collision check.

Whenever two robots collide, our plan calls for adding a piece of junk at that location, marking the first robot for later removal (we still need it to compare with the other robots), and immediately removing the second one. The body of the <tt>if collided(robot1, robot2):</tt> conditional is designed to do just that, but if you look carefully at the line:

you should notice a problem. Looking up <tt>robot1['junk']</tt> will result in a runtime error (a <tt>KeyError</tt>), since our robot dictionary does not yet contain a <tt>'junk'</tt> key. To fix this we modify <tt>place_robot</tt> to accommodate the new key:

It is not at all unusual for data structures to change as program development proceeds. Stepwise refinement of both program data and logic is a normal part of the structured programming process.

After <tt>robot1</tt> is marked as junk, we add a pile of junk to the junk list at the same location with <tt>junk.append(place_robot(robot1['x'], robot1['y'], True))</tt>, and then remove <tt>robot2</tt> from the game by first removing its shape from the graphics window and then removing it from the robots list.

The next loop traverses backward over the robots list removing all the robots previously marked as junk. Since the player wins when all the robots die, and the robot list will be empty when it no longer contains live robots, we can simply check whether <tt>robots</tt> is empty to determine whether or not the player has won.
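
The plan above can be sketched without graphics as follows. The function name <tt>crash_robots</tt> and the <tt>'name'</tt> key are hypothetical, every robot is assumed to start with <tt>'junk'</tt> set to <tt>False</tt>, and a plain dictionary of coordinates stands in for the junk pile that <tt>place_robot</tt> would create.

```python
def collided(thing1, thing2):
    return thing1['x'] == thing2['x'] and thing1['y'] == thing2['y']

def crash_robots(robots, junk):
    # Outer loop forward, inner loop backward over the robots that follow.
    for index, robot1 in enumerate(robots):
        for robot2 in reversed(robots[index + 1:]):
            if collided(robot1, robot2):
                robot1['junk'] = True              # mark for later removal
                junk.append({'x': robot1['x'], 'y': robot1['y']})
                robots.remove(robot2)              # remove the second robot now

    # Second pass, backward: drop every robot marked as junk.
    for robot in reversed(robots):
        if robot['junk']:
            robots.remove(robot)

robots = [{'name': 'a', 'x': 2, 'y': 2, 'junk': False},
          {'name': 'b', 'x': 2, 'y': 2, 'junk': False},
          {'name': 'c', 'x': 7, 'y': 1, 'junk': False}]
junk = []
crash_robots(robots, junk)

print([robot['name'] for robot in robots])   # ['c']
print(junk)                                  # [{'x': 2, 'y': 2}]
print(len(robots) == 0)                      # False: the player has not won yet
```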

This can be done in <tt>check_collisions</tt> immediately after we finish checking robot collisions and removing dead robots by adding:

Hmmm... What should we return? In its current state, <tt>check_collisions</tt> is a Boolean function that returns true when the player has collided with something and lost the game, and false when the player has not lost and the game should continue. That is why the variable in the <tt>play_game</tt> function that catches the return value is called <tt>defeated</tt>.

Now we have three possible states:


 * 1) <tt>robots</tt> is not empty and the player has not collided with anything -- the game is still in play
 * 2) the player has collided with something -- the robots win
 * 3) the player has not collided with anything and <tt>robots</tt> is empty -- the player wins

In order to handle this with as few changes as possible to our present program, we will take advantage of the way that Python permits sequence types to live double lives as Boolean values. We will return an empty string -- which is false -- when game play should continue, and either <tt>&quot;robots_win&quot;</tt> or <tt>&quot;player_wins&quot;</tt> to handle the other two cases. <tt>check_collisions</tt> should now look like this:
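
The return-value convention on its own can be sketched as below. The helper <tt>game_state</tt> and its parameter <tt>player_hit</tt> are hypothetical and stand in for the end of <tt>check_collisions</tt>; the point is only that the empty string behaves as false in a condition and the other two strings behave as true.

```python
def game_state(player_hit, robots):
    if player_hit:
        return "robots_win"     # non-empty string: true
    if not robots:              # an empty list is false
        return "player_wins"    # non-empty string: true
    return ""                   # empty string: false, keep playing

# Try the three possible states in turn.
for player_hit, robots in [(False, ['robot']), (True, ['robot']), (False, [])]:
    winner = game_state(player_hit, robots)
    if winner:
        print('game over: ' + winner)
    else:
        print('keep playing')
```

A loop written as <tt>while not winner:</tt> therefore keeps running exactly as long as the game is still in play.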

A few corresponding changes need to be made to <tt>play_game</tt> to use the new return values. These are left as an exercise.

Exercises
<ol> <li> Write a program that reads in a string on the command line and returns a table of the letters of the alphabet in alphabetical order which occur in the string together with the number of times each letter occurs. Case should be ignored. A sample run of the program would look like this:
$ python letter_counts.py &quot;ThiS is String with Upper and lower case Letters.&quot;
a 2
c 1
d 1
e 5
g 1
h 2
i 4
l 2
n 2
o 1
p 2
r 4
s 5
t 5
u 1
w 2
$
</li> <li> Give the Python interpreter's response to each of the following from a continuous interpreter session: <dl> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd> <dt>#.</dt> <dd></dd></dl>

Be sure you understand why you get each result. Then apply what you have learned to fill in the body of the function below:

Your solution should pass the doctests.</li> <li> Write a program called <tt>alice_words.py</tt> that creates a text file named <tt>alice_words.txt</tt> containing an alphabetical listing of all the words found in alice_in_wonderland.txt together with the number of times each word occurs. The first 10 lines of your output file should look something like this:
Word             Count
======================
a                631
a-piece          1
abide            1
able             1
about            94
above            3
absence          1
absurd           2
How many times does the word, <tt>alice</tt>, occur in the book?</li>
<li>What is the longest word in Alice in Wonderland? How many characters does it have?</li>
<li>Copy the code from the Setting up the world, the player, and the main loop section into a file named <tt>robots.py</tt> and run it. You should be able to move the player around the screen using the numeric keypad and to quit the program by pressing the escape key.</li>
<li>Laptops usually have smaller keyboards than desktop computers, without a separate numeric keypad. Modify the robots program so that it uses 'a', 'q', 'w', 'e', 'd', 'c', 'x', and 'z' instead of '4', '7', '8', '9', '6', '3', '2', and '1' so that it will work on a typical laptop keyboard.</li>
<li>Add all the code from the Adding a robot section in the places indicated. Make sure the program works and that you now have a robot following around your player.</li>
<li>Add all the code from the Checking for Collisions section in the places indicated. Verify that the program ends when the robot catches the player after displaying a They got you! message for 3 seconds.</li>
<li>Modify the <tt>move_player</tt> function to add the ability for the player to jump to a random location whenever the <tt>0</tt> key is pressed. (hint: <tt>place_player</tt> already has the logic needed to place the player in a random location. Just add another conditional branch to <tt>move_player</tt> that uses this logic when <tt>key_pressed('0')</tt> is true.) Test the program to verify that your player can now teleport to a random place on the screen to get out of trouble.</li>
<li>Make all the changes to your program indicated in Adding more robots. Be sure to loop over the <tt>robots</tt> list, removing each robot in turn, after <tt>defeated</tt> becomes true. 
Test your program to verify that there are now two robots chasing your player. Let a robot catch you to test whether you have correctly handled removing all the robots. Change the argument from 2 to 4 in <tt>robots = place_robots(2)</tt> and confirm that you have 4 robots.</li> <li> Make the changes to your program indicated in Adding <tt>junk</tt>. Fix <tt>place_robots</tt> by moving the random generation of values for <tt>x</tt> and <tt>y</tt> to the appropriate location and passing these values as arguments in the call to <tt>place_robot</tt>. Now we are ready to make temporary modifications to our program to remove the randomness so we can control it for testing. We can start by placing a pile of junk in the center of our game board. Change:

to:

Run the program and confirm that there is a black box in the center of the board. Now change <tt>place_player</tt> so that it looks like this:

Finally, temporarily comment out the random generation of <tt>x</tt> and <tt>y</tt> values in <tt>place_robots</tt> and the creation of <tt>numbots</tt> robots. Replace this logic with code to create two robots in fixed locations:

When you start your program now, it should look like this: When you run this program and either stay still (by pressing the <tt>5</tt> repeatedly) or move away from the pile of junk, you can confirm that the robots move through it unharmed. When you move into the junk pile, on the other hand, you die.</li> <li> Make the following modifications to <tt>play_game</tt> to integrate with the changes made in Turning robots into junk and enabling the player to win: <ol> <li>Rename <tt>defeated</tt> to <tt>winner</tt> and initialize it to the empty string instead of <tt>False</tt>.</li></ol> </li></ol>