How to Think Like a Computer Scientist: Learning with Python 2nd Edition/Classes and methods

= Classes and methods =

== Object-oriented features ==
Python is an object-oriented programming language, which means that it provides features that support object-oriented programming.

It is not easy to define object-oriented programming, but we have already seen some of its characteristics:


 * 1) Programs are made up of object definitions and function definitions, and most of the computation is expressed in terms of operations on objects.
 * 2) Each object definition corresponds to some object or concept in the real world, and the functions that operate on that object correspond to the ways real-world objects interact.

For example, the Time class defined in the last chapter corresponds to the way people record the time of day, and the functions we defined correspond to the kinds of things people do with times. Similarly, the Point and Rectangle classes correspond to the mathematical concepts of a point and a rectangle.

So far, we have not taken advantage of the features Python provides to support object-oriented programming. Strictly speaking, these features are not necessary. For the most part, they provide an alternative syntax for things we have already done, but in many cases, the alternative is more concise and more accurately conveys the structure of the program.

For example, in the Time program, there is no obvious connection between the class definition and the function definitions that follow. With some examination, it is apparent that every function takes at least one Time object as a parameter.

This observation is the motivation for methods. We have already seen some methods, such as keys and values, which were invoked on dictionaries. Each method is associated with a class and is intended to be invoked on instances of that class.

Methods are just like functions, with two differences:


 * 1) Methods are defined inside a class definition in order to make the relationship between the class and the method explicit.
 * 2) The syntax for invoking a method is different from the syntax for calling a function.

In the next few sections, we will take the functions from the previous two chapters and transform them into methods. This transformation is purely mechanical; you can do it simply by following a sequence of steps. If you are comfortable converting from one form to another, you will be able to choose the best form for whatever you are doing.

== print_time ==
In the last chapter, we defined a class named <tt>Time</tt> and you wrote a function named <tt>print_time</tt>, which should have looked something like this:

To call this function, we passed a <tt>Time</tt> object as a parameter:
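A minimal sketch of such a function and its call, assuming Python 3 syntax and a <tt>Time</tt> class whose instances carry <tt>hours</tt>, <tt>minutes</tt>, and <tt>seconds</tt> attributes (the attribute values are only illustrative):

```python
class Time:
    """A time of day; attributes are attached after the object is created."""
    pass


def print_time(time):
    # Join the three attributes with colons.
    print(str(time.hours) + ":" + str(time.minutes) + ":" + str(time.seconds))


current_time = Time()
current_time.hours = 9
current_time.minutes = 14
current_time.seconds = 30

print_time(current_time)  # prints 9:14:30
```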

To make <tt>print_time</tt> a method, all we have to do is move the function definition inside the class definition. Notice the change in indentation.

Now we can invoke <tt>print_time</tt> using dot notation.
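As a sketch under the same assumptions (Python 3 syntax, illustrative attribute values), the method version and its invocation might look like this. The only changes are the indentation of the definition and the form of the call:

```python
class Time:
    def print_time(time):
        # The first parameter receives the object the method is invoked on.
        print(str(time.hours) + ":" + str(time.minutes) + ":" + str(time.seconds))


current_time = Time()
current_time.hours = 9
current_time.minutes = 14
current_time.seconds = 30

current_time.print_time()  # prints 9:14:30
```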

As usual, the object on which the method is invoked appears before the dot and the name of the method appears after the dot.

The object on which the method is invoked is assigned to the first parameter, so in this case <tt>current_time</tt> is assigned to the parameter <tt>time</tt>.

By convention, the first parameter of a method is called <tt>self</tt>. The reason for this is a little convoluted, but it is based on a useful metaphor.

The syntax for a function call, <tt>print_time(current_time)</tt>, suggests that the function is the active agent. It says something like, "Hey <tt>print_time</tt>! Here's an object for you to print."

In object-oriented programming, the objects are the active agents. An invocation like <tt>current_time.print_time()</tt> says, "Hey <tt>current_time</tt>! Please print yourself!"

This change in perspective might be more polite, but it is not obvious that it is useful. In the examples we have seen so far, it may not be. But sometimes shifting responsibility from the functions onto the objects makes it possible to write more versatile functions, and makes it easier to maintain and reuse code.

== Another example ==
Let's convert <tt>increment</tt> to a method. To save space, we will leave out previously defined methods, but you should keep them in your version:

The transformation is purely mechanical: we move the function definition into the class definition and rename the first parameter to <tt>self</tt>.

Now we can invoke <tt>increment</tt> as a method.
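A minimal sketch of the method and its invocation, assuming a loop-based <tt>increment</tt> that carries extra seconds into minutes and extra minutes into hours (the exact body in your version may differ):

```python
class Time:
    # Previously defined methods are omitted here to save space.

    def increment(self, seconds):
        self.seconds = self.seconds + seconds

        # Carry whole minutes out of the seconds attribute.
        while self.seconds >= 60:
            self.seconds = self.seconds - 60
            self.minutes = self.minutes + 1

        # Carry whole hours out of the minutes attribute.
        while self.minutes >= 60:
            self.minutes = self.minutes - 60
            self.hours = self.hours + 1


current_time = Time()
current_time.hours, current_time.minutes, current_time.seconds = 9, 14, 30

current_time.increment(500)
print(current_time.hours, current_time.minutes, current_time.seconds)  # 9 22 50
```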

Again, the object on which the method is invoked gets assigned to the first parameter, <tt>self</tt>. The second parameter, <tt>seconds</tt>, gets the value <tt>500</tt>.

== A more complicated example ==
The <tt>after</tt> function is slightly more complicated because it operates on two <tt>Time</tt> objects, not just one. We can only convert one of the parameters to <tt>self</tt>; the other stays the same:

We invoke this method on one object and pass the other as an argument:
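As a sketch, assuming <tt>after</tt> compares the two times attribute by attribute, the method and its invocation might look like this. The helper <tt>make_time</tt> and the printed message are hypothetical and exist only to keep the example short:

```python
class Time:
    def after(self, time2):
        # Compare attribute by attribute, most significant first.
        if self.hours != time2.hours:
            return self.hours > time2.hours
        if self.minutes != time2.minutes:
            return self.minutes > time2.minutes
        return self.seconds > time2.seconds


def make_time(hours, minutes, seconds):
    # Hypothetical helper for this sketch: build a Time by assigning attributes.
    t = Time()
    t.hours, t.minutes, t.seconds = hours, minutes, seconds
    return t


current_time = make_time(9, 14, 30)
done_time = make_time(11, 0, 0)

if done_time.after(current_time):
    print("The done time is later than the current time.")
```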

You can almost read the invocation like English: "If the done-time is after the current-time, then..."

== Optional arguments ==
We have seen built-in functions that take a variable number of arguments. For example, <tt>string.find</tt> can take two, three, or four arguments.

It is possible to write user-defined functions with optional argument lists. For example, we can upgrade our own version of <tt>find</tt> to do the same thing as <tt>string.find</tt>.

This is the original version:

This is the new and improved version:

The third parameter, <tt>start</tt>, is optional because a default value, <tt>0</tt>, is provided. If we invoke <tt>find</tt> with only two arguments, we use the default value and start from the beginning of the string:

If we provide a third parameter, it overrides the default:
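A minimal sketch of the improved <tt>find</tt> and both kinds of call, assuming the usual index-scanning loop. Compared with a two-parameter version, the only changes are the extra parameter and starting the scan at <tt>start</tt> instead of at 0:

```python
def find(str, ch, start=0):
    # The parameter name str follows the chapter; inside this function it
    # shadows the built-in str type.
    index = start
    while index < len(str):
        if str[index] == ch:
            return index
        index = index + 1
    return -1  # ch does not occur at or after start


print(find("apple", "p"))     # 1
print(find("apple", "p", 2))  # 2
print(find("apple", "p", 3))  # -1
```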

== The initialization method ==
The initialization method is a special method that is invoked when an object is created. The name of this method is <tt>__init__</tt> (two underscore characters, followed by <tt>init</tt>, and then two more underscores). An initialization method for the <tt>Time</tt> class looks like this:
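A minimal sketch, assuming each of the three parameters defaults to 0 so that all of them are optional:

```python
class Time:
    def __init__(self, hours=0, minutes=0, seconds=0):
        # Store each argument in the attribute of the same name.
        self.hours = hours
        self.minutes = minutes
        self.seconds = seconds
```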

There is no conflict between the attribute <tt>self.hours</tt> and the parameter <tt>hours</tt>. Dot notation specifies which variable we are referring to.

When we invoke the <tt>Time</tt> constructor, the arguments we provide are passed along to <tt>__init__</tt>:

Because the parameters are optional, we can omit them:

Or provide only the first parameter:

Or the first two parameters:

Finally, we can provide a subset of the parameters by naming them explicitly:
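The different ways of invoking the constructor can be sketched together. This example assumes an <tt>__init__</tt> whose parameters all default to 0 and a <tt>print_time</tt> method like the one earlier in the chapter:

```python
class Time:
    def __init__(self, hours=0, minutes=0, seconds=0):
        self.hours = hours
        self.minutes = minutes
        self.seconds = seconds

    def print_time(self):
        print(str(self.hours) + ":" + str(self.minutes) + ":" + str(self.seconds))


Time(9, 14, 30).print_time()             # all three arguments: prints 9:14:30
Time().print_time()                      # no arguments: prints 0:0:0
Time(9).print_time()                     # first only: prints 9:0:0
Time(9, 14).print_time()                 # first two: prints 9:14:0
Time(seconds=30, hours=9).print_time()   # named subset: prints 9:0:30
```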

== Points revisited ==
Let's rewrite the <tt>Point</tt> class from chapter 12 in a more object-oriented style:

The initialization method takes <tt>x</tt> and <tt>y</tt> values as optional parameters; the default for either parameter is 0.

The next method, <tt>__str__</tt>, returns a string representation of a <tt>Point</tt> object. If a class provides a method named <tt>__str__</tt>, it overrides the default behavior of the Python built-in <tt>str</tt> function.

Printing a <tt>Point</tt> object implicitly invokes <tt>__str__</tt> on the object, so defining <tt>__str__</tt> also changes the behavior of <tt>print</tt>:
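A minimal sketch of the class and of both ways to obtain the string form, assuming a <tt>(x, y)</tt> output format:

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        # Build a string such as "(3, 4)".
        return "(" + str(self.x) + ", " + str(self.y) + ")"


p = Point(3, 4)
print(str(p))   # (3, 4)
print(p)        # (3, 4)
print(Point())  # (0, 0)
```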

When we write a new class, we almost always start by writing <tt>__init__</tt>, which makes it easier to instantiate objects, and <tt>__str__</tt>, which is almost always useful for debugging.

== Operator overloading ==
Some languages make it possible to change the definition of the built-in operators when they are applied to user-defined types. This feature is called operator overloading. It is especially useful when defining new mathematical types.

For example, to override the addition operator <tt>+</tt>, we provide a method named <tt>__add__</tt>:

As usual, the first parameter is the object on which the method is invoked. The second parameter is conveniently named <tt>other</tt> to distinguish it from <tt>self</tt>. To add two <tt>Point</tt>s, we create and return a new <tt>Point</tt> that contains the sum of the <tt>x</tt> coordinates and the sum of the <tt>y</tt> coordinates.

Now, when we apply the <tt>+</tt> operator to <tt>Point</tt> objects, Python invokes <tt>__add__</tt>:
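As a sketch, a <tt>Point</tt> class with <tt>__add__</tt> and an addition using the operator might look like this (the coordinate values are only illustrative):

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "(" + str(self.x) + ", " + str(self.y) + ")"

    def __add__(self, other):
        # Return a new Point; neither operand is modified.
        return Point(self.x + other.x, self.y + other.y)


p1 = Point(3, 4)
p2 = Point(5, 7)
p3 = p1 + p2
print(p3)  # (8, 11)
```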

The expression <tt>p1 + p2</tt> is equivalent to <tt>p1.__add__(p2)</tt>, but obviously more elegant.

As an exercise, add a method <tt>__sub__(self, other)</tt> that overloads the subtraction operator, and try it out.

There are several ways to override the behavior of the multiplication operator: by defining a method named <tt>__mul__</tt>, or <tt>__rmul__</tt>, or both.

If the left operand of <tt>*</tt> is a <tt>Point</tt>, Python invokes <tt>__mul__</tt>, which assumes that the other operand is also a <tt>Point</tt>. It computes the dot product of the two points, defined according to the rules of linear algebra:

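If the left operand of <tt>*</tt> is a built-in type such as a number, which does not know how to multiply itself by a <tt>Point</tt>, and the right operand is a <tt>Point</tt>, Python invokes <tt>__rmul__</tt> on the <tt>Point</tt>, which performs scalar multiplication: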

The result is a new <tt>Point</tt> whose coordinates are multiples of the original coordinates. If <tt>other</tt> is a type that cannot be multiplied by a number, then <tt>__rmul__</tt> raises an error.

This example demonstrates both kinds of multiplication:
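A minimal sketch with both methods defined on <tt>Point</tt>, assuming <tt>__mul__</tt> computes the dot product and <tt>__rmul__</tt> scales both coordinates:

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "(" + str(self.x) + ", " + str(self.y) + ")"

    def __mul__(self, other):
        # Point * Point: dot product, a plain number.
        return self.x * other.x + self.y * other.y

    def __rmul__(self, other):
        # number * Point: scalar multiplication, a new Point.
        return Point(other * self.x, other * self.y)


p1 = Point(3, 4)
p2 = Point(5, 7)
print(p1 * p2)  # 43
print(2 * p2)   # (10, 14)
```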

What happens if we try to evaluate <tt>p2 * 2</tt>? Since the left operand is a <tt>Point</tt>, Python invokes <tt>__mul__</tt> with <tt>2</tt> as the second argument. Inside <tt>__mul__</tt>, the program tries to access the <tt>x</tt> coordinate of <tt>other</tt>, which fails because an integer has no attribute named <tt>x</tt>:
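The failure can be reproduced with a short sketch. It assumes the dot-product <tt>__mul__</tt> described above and catches the exception only so that the message can be displayed (the variable <tt>message</tt> is introduced for illustration):

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __mul__(self, other):
        # Assumes other is a Point; an int has no x or y attribute.
        return self.x * other.x + self.y * other.y


p2 = Point(5, 7)

try:
    p2 * 2
except AttributeError as err:
    message = str(err)
    print("AttributeError:", message)  # AttributeError: 'int' object has no attribute 'x'
```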

Unfortunately, the error message is a bit opaque. This example demonstrates some of the difficulties of object-oriented programming. Sometimes it is hard enough just to figure out what code is running.

For a more complete example of operator overloading, see the appendix on operator overloading.

== Polymorphism ==
Most of the methods we have written only work for a specific type. When you create a new type, you write methods that operate on that type.

But there are certain operations that you will want to apply to many types, such as the arithmetic operations in the previous sections. If many types support the same set of operations, you can write functions that work on any of those types.

For example, the <tt>multadd</tt> operation (which is common in linear algebra) takes three parameters; it multiplies the first two and then adds the third. We can write it in Python like this:

This function will work for any values of <tt>x</tt> and <tt>y</tt> that can be multiplied and for any value of <tt>z</tt> that can be added to the product.

We can invoke it with numeric values:
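A minimal sketch of <tt>multadd</tt> and a numeric call (the argument values are only illustrative):

```python
def multadd(x, y, z):
    # Multiply the first two arguments, then add the third.
    return x * y + z


print(multadd(3, 2, 1))  # 7
```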

Or with <tt>Point</tt>s:
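As a sketch, assuming a <tt>Point</tt> class that defines <tt>__add__</tt>, <tt>__mul__</tt>, and <tt>__rmul__</tt> as in the previous section, the same function accepts <tt>Point</tt>s in two different ways:

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "(" + str(self.x) + ", " + str(self.y) + ")"

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        return self.x * other.x + self.y * other.y    # dot product

    def __rmul__(self, other):
        return Point(other * self.x, other * self.y)  # scalar multiplication


def multadd(x, y, z):
    return x * y + z


p1 = Point(3, 4)
p2 = Point(5, 7)
print(multadd(2, p1, p2))  # (11, 15)
print(multadd(p1, p2, 1))  # 44
```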

In the first case, the <tt>Point</tt> is multiplied by a scalar and then added to another <tt>Point</tt>. In the second case, the dot product yields a numeric value, so the third parameter also has to be a numeric value.

A function like this that can take parameters with different types is called polymorphic.

As another example, consider the function <tt>front_and_back</tt>, which prints a list twice, forward and backward:

Because the <tt>reverse</tt> method is a modifier, we make a copy of the list before reversing it. That way, the function doesn't modify the list it gets as a parameter.

Here's an example that applies <tt>front_and_back</tt> to a list:
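A minimal sketch of <tt>front_and_back</tt> and a call with a list, assuming the copy is made with <tt>copy.copy</tt> from the standard library:

```python
import copy


def front_and_back(front):
    back = copy.copy(front)  # shallow copy, so the caller's object is untouched
    back.reverse()
    print(str(front) + str(back))


my_list = [1, 2, 3, 4]
front_and_back(my_list)  # prints [1, 2, 3, 4][4, 3, 2, 1]
```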

Of course, we intended to apply this function to lists, so it is not surprising that it works. What would be surprising is if we could apply it to a <tt>Point</tt>.

To determine whether a function can be applied to a new type, we apply the fundamental rule of polymorphism: If all of the operations inside the function can be applied to the type, the function can be applied to the type. The operations in <tt>front_and_back</tt> include <tt>copy</tt>, <tt>reverse</tt>, and <tt>print</tt>.

<tt>copy</tt> works on any object, and we have already written a <tt>__str__</tt> method for <tt>Point</tt>s, so all we need is a <tt>reverse</tt> method in the <tt>Point</tt> class:

Then we can pass <tt>Point</tt>s to <tt>front_and_back</tt>:
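As a sketch, assuming <tt>reverse</tt> simply swaps the two coordinates, the extended <tt>Point</tt> class works with an unchanged <tt>front_and_back</tt>:

```python
import copy


class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return "(" + str(self.x) + ", " + str(self.y) + ")"

    def reverse(self):
        # Modifier: swap the coordinates in place.
        self.x, self.y = self.y, self.x


def front_and_back(front):
    back = copy.copy(front)  # copy first, because reverse is a modifier
    back.reverse()
    print(str(front) + str(back))


p = Point(3, 4)
front_and_back(p)  # prints (3, 4)(4, 3)
```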

The best kind of polymorphism is the unintentional kind, where you discover that a function you have already written can be applied to a type for which you never planned.

== Exercises ==
<ol> <li> Convert the function <tt>convertToSeconds</tt>:

to a method in the <tt>Time</tt> class.</li> <li> Add a fourth parameter, <tt>end</tt>, to the <tt>find</tt> function that specifies where to stop looking. Warning: This exercise is a bit tricky. The default value of <tt>end</tt> should be <tt>len(str)</tt>, but that doesn't work. The default values are evaluated when the function is defined, not when it is called. When <tt>find</tt> is defined, <tt>str</tt> doesn't exist yet, so you can't find its length. </li></ol>