Perl Programming/Functions

A Perl function is a grouping of code that allows it to be easily used repeatedly. Functions are a key organizing component in all but the smallest of programs.

Introduction
So far we have written just a few lines of Perl at a time. Our example programs have started at the top of a file and proceeded to the bottom of the file, with a little jumping around using control-flow keywords like <tt>if</tt>, <tt>else</tt> and <tt>while</tt>. In many instances, though, it is useful to add another layer of organization to our programs.

For example, when something goes wrong in a typical program, it prints an "error message". The code to print an error message might be a single statement such as <tt>print "something bad happened\n";</tt>.

Other messages might look slightly different, with only the text between the quotes changing.

We could pepper our code with hundreds of lines just like this, and things would work just fine, for a while. But sooner or later, we'd want to separate out error messages from "status messages" that report harmless information about our program. To do this, we could prefix all of our error messages with the word "ERROR" and all of the status messages with "STATUS". The code for a typical error message would change to something like <tt>print "ERROR: something bad happened\n";</tt>.

The problem is, with hundreds of error messages, it would be a hassle to go change them all. This is where subroutines can help out.

Wikipedia defines a subroutine as "a sequence of instructions that perform a specific task, as part of a larger program. Subroutines can be called from different locations in a program, thus allowing programs to access the subroutine repeatedly without the subroutine's code having been written more than once."

All that gobbledygook means we can wrap up our error message code in one place, as follows:
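A minimal sketch of such a subroutine, assuming the hypothetical name <tt>print_error</tt> and a single message argument:

```perl
use strict;
use warnings;

# print_error is a hypothetical name chosen for this sketch.
# It prints its first argument with an "ERROR: " prefix.
sub print_error {
    my ($message) = @_;
    print "ERROR: $message\n";
}
```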

Whenever something goes wrong in our program, we can activate, or call, this subroutine, with whatever message we prefer:
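For instance, assuming a subroutine with the hypothetical name <tt>print_error</tt> that takes one message argument, a call might look like this:

```perl
use strict;
use warnings;

# Hypothetical subroutine, repeated here so the sketch runs on its own.
sub print_error {
    my ($message) = @_;
    print "ERROR: $message\n";
}

# Call the subroutine with whatever message we prefer.
print_error("something bad happened");
```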

And see messages such as

ERROR: something bad happened

If we need to change the formatting of our error message, say, to include a few exclamation points, it's simple enough to change the subroutine:
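As a sketch, again assuming the hypothetical <tt>print_error</tt> subroutine, only the one <tt>print</tt> line has to change and every call stays as it was:

```perl
use strict;
use warnings;

# Same hypothetical subroutine, now with exclamation points added.
sub print_error {
    my ($message) = @_;
    print "ERROR: $message!!!\n";
}

# The calling code is unchanged.
print_error("something bad happened");
```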

This has been an admittedly simple example, and subroutines have a few other advantages, but that's it in a nutshell. Type it in one place, fix it in one place, change it in one place.

Now for a little more detail about subroutines. Much of what follows will use the following subroutine, which, if it isn't obvious already, adds two numbers together, and returns their sum.
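One plausible way to write it, offered as a sketch that matches the parts discussed in the following sections (name, optional prototype, body):

```perl
use strict;
use warnings;

# The ($$) prototype is optional; see the "Prototype" section.
sub add_two_numbers($$) {
    my ($x, $y) = @_;     # reading arguments
    my $sum = $x + $y;    # the interesting part
    return $sum;          # the return statement
}
```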

Name
The first line of a function begins with the keyword <tt>sub</tt> followed by the function's name. Any string of letters, digits and underscores that does not start with a digit and is not a reserved Perl word (such as <tt>for</tt>, <tt>while</tt>, <tt>if</tt> and <tt>else</tt>) is a valid function name. Subroutines with names that describe what the subroutine does make for easier-to-read programs.

Prototype (rarely used)
The optional <tt>($$)</tt> after the name is a prototype, which tells Perl what arguments this subroutine expects. <tt>($$)</tt> says "this function requires two scalar values". Perl prototypes are NOT what most people with experience in other languages expect them to be: rather than checking argument types, they alter the context in which the arguments to the subroutine are evaluated.

It is possible to disable a prototype by calling the function with a leading <tt>&</tt>, but this is not recommended.

Prototypes are seldom used, as it's much easier to use the normal method of passing parameters.
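The effect on context is easiest to see with an array argument. The following sketch uses a made-up array <tt>@list</tt>; with the <tt>($$)</tt> prototype the array is evaluated in scalar context (giving its length), while the <tt>&</tt> form ignores the prototype and flattens the array into the argument list.

```perl
use strict;
use warnings;

sub add_two_numbers($$) {
    my ($x, $y) = @_;
    return $x + $y;
}

my @list = (10, 20, 30);    # hypothetical data for illustration

# Prototype in force: @list is evaluated in scalar context, giving 3.
my $with_prototype = add_two_numbers(@list, 1);        # 3 + 1 = 4

# Leading & disables the prototype: arguments are 10, 20, 30, 1,
# so $x is 10 and $y is 20.
my $without_prototype = &add_two_numbers(@list, 1);    # 10 + 20 = 30

print "$with_prototype $without_prototype\n";          # prints "4 30"
```

Surprises like this are one reason prototypes are usually left out.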

Body
The body of a subroutine does the "work", and consists of three primary sections.

Reading arguments
The pieces of information handed to a subroutine are called arguments or actual parameters. For instance, in
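A sketch of such a call, with a compact version of <tt>add_two_numbers</tt> included so the snippet runs on its own:

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;
    return $x + $y;
}

# 3 and 4 are handed to the subroutine.
my $sum = add_two_numbers(3, 4);
```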

<tt>3</tt> and <tt>4</tt> are the arguments to the subroutine <tt>add_two_numbers</tt>.

Perl passes arguments to a subroutine in an array represented by <tt>@_</tt>. Usually it is more convenient to give meaningful names to these arguments, so the first line of a function often looks like
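As a sketch, here is that first line in context, together with an equivalent style using <tt>shift</tt> (the name <tt>add_with_shift</tt> is made up for this example):

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;    # copy the first two elements of @_
    return $x + $y;
}

# Same effect, removing one argument at a time from the front of @_.
sub add_with_shift {
    my $x = shift;
    my $y = shift;
    return $x + $y;
}
```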

This line copies the contents of <tt>@_</tt> into two variables named <tt>$x</tt> and <tt>$y</tt>. <tt>$x</tt> and <tt>$y</tt> are called formal parameters. The distinction between formal parameters and actual parameters (arguments) is subtle and for the most part unimportant. It is described somewhat in the Wikipedia article Parameter (computer science). Be careful not to confuse the special variable <tt>$_</tt> with <tt>@_</tt>, the array of arguments passed to a function.

Some subroutines don't require any arguments, for example
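A minimal sketch, using the hypothetical name <tt>say_hello</tt>:

```perl
use strict;
use warnings;

# No arguments are read, so @_ is never touched.
sub say_hello {
    print "Hello World\n";
}

say_hello();
```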

This subroutine prints "<tt>Hello World</tt>" to STDOUT. It doesn't need any extra information about how to do its job and accordingly doesn't need any arguments.

Most modern programming languages save programmers the trouble of explicitly breaking the argument array into variables. Traditionally, Perl does not (subroutine signatures, added as an experimental feature in Perl 5.20 and stable since 5.36, are a newer alternative). On the other hand, this makes writing subroutines with a variable number of arguments very easy.
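For example, a subroutine that adds up however many numbers it is given can simply loop over <tt>@_</tt>. This is a sketch; the name <tt>add_numbers</tt> is made up.

```perl
use strict;
use warnings;

# Adds every argument it receives, however many there are.
sub add_numbers {
    my $sum = 0;
    $sum += $_ for @_;    # $_ is each element in turn; @_ is the argument array
    return $sum;
}

print add_numbers(1, 2, 3), "\n";    # prints 6
print add_numbers(), "\n";           # prints 0
```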

In a programming context, parameter means almost the same thing as argument (see parameter for details). The two are often used interchangeably with no loss of understanding.

Important note: global and local variables
Unlike programming languages such as C or Java, all variables created or used within Perl subroutines are, by default, global variables. This means that any piece of your program outside of your subroutine may modify these variables, and that your subroutine may be, unknowingly, modifying variables that it has no business modifying. In small programs this is often convenient, but as programs get longer this often leads to complexity and is considered poor practice.

The best way to avoid this trap is to place the keyword <tt>my</tt> in front of all your variables the first time they appear. This tells Perl that you only want these variables to be available inside the nearest enclosing group of curly braces. In effect, these local variables act as a "scratch space" for use within your subroutines that disappears when the subroutine returns. The line <tt>use strict;</tt> at the top of your program makes Perl refuse to compile code that uses undeclared variables, which in practice forces you to put <tt>my</tt> in front of your variables and prevents you from accidentally creating global variables.
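The difference can be sketched as follows. The subroutine names are hypothetical, and strict checking of variables is deliberately switched off here so that the undeclared global is allowed at all.

```perl
use warnings;
no strict 'vars';    # deliberately off, to show the default behaviour

$total = 1;          # undeclared, so this is a global variable

# Modifies the global $total, perhaps without meaning to.
sub careless {
    $total = 10;
}

# "my" creates a separate $total that exists only inside this block.
sub careful {
    my $total = 99;
    return $total;
}

careless();
my $private = careful();

print "$total $private\n";    # prints "10 99"
```

After both calls the global <tt>$total</tt> holds the value set by <tt>careless</tt>, while the <tt>my</tt> variable in <tt>careful</tt> never touched it.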

An alternative to <tt>my</tt> that you may see in some older Perl programs is the <tt>local</tt> keyword. <tt>local</tt> gives a global variable a temporary value for the duration of a block, which makes it somewhat similar to <tt>my</tt>, but more complicated to deal with. It's better to stick with <tt>my</tt> in your own programs.

"Scope" describes whether a variable is local or global, and a couple of other complexities. See scope for a technical discussion.

The interesting part of a subroutine
In the very middle of the subroutine you're likely to find the more interesting "guts" of it all. In our <tt>add_two_numbers</tt> subroutine, this is the part that actually does the adding, computing <tt>$x + $y</tt>.

In this middle section, you can do just about anything your heart desires: arithmetic, printing to files, or even calling other subroutines.

The <tt>return</tt> statement
Finally, some subroutines "return" some piece of information, called the "return value", using the <tt>return</tt> keyword.

For example
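A sketch, with a compact <tt>add_two_numbers</tt> included so the snippet is complete:

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;
    return $x + $y;    # hand the result back to the caller
}

# The return value becomes the value of the call expression.
my $sum = add_two_numbers(4, 5);
```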

Calling <tt>add_two_numbers</tt> with the arguments 4 and 5 and assigning the result to <tt>$sum</tt> sets <tt>$sum</tt> to 9 (the sum of 4 and 5).

<tt>return</tt> can also be used without any return value as a shortcut to leave a subroutine before getting to the closing <tt>}</tt>.
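As a sketch, a hypothetical <tt>print_error</tt> subroutine could use a bare <tt>return</tt> to do nothing when it is given no message:

```perl
use strict;
use warnings;

# Hypothetical subroutine for illustration.
sub print_error {
    my ($message) = @_;
    return unless defined $message;    # nothing to report: leave early
    print "ERROR: $message\n";
}

print_error();                # prints nothing
print_error("disk full");     # prints "ERROR: disk full"
```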

Invoking subroutines
Subroutines may be declared anywhere within a Perl program. They may be invoked like so:
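The common forms are sketched below, using a version of <tt>add_two_numbers</tt> without a prototype. Because the subroutine is defined before the calls, it counts as predeclared, so the last form works too.

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;
    return $x + $y;
}

my $first  = add_two_numbers(3, 4);     # name with parentheses: the usual form
my $second = &add_two_numbers(3, 4);    # older style with a leading &
my $third  = add_two_numbers 3, 4;      # no parentheses: needs predeclaration

print "$first $second $third\n";        # prints "7 7 7"
```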

If no "<tt>&</tt>" prefix is used, parentheses are required unless the subroutine has been predeclared.

Functions calling functions
On their own, functions provide a major stepping stone towards good code, but combining functions together really unleashes their power.

As you might expect, calling a function from within another function doesn't look any different from calling a function from the part of your program that is sitting outside any curly-braces.
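A sketch of such a function, under the made-up name <tt>add_then_triple</tt>, with <tt>add_two_numbers</tt> repeated so the snippet is complete:

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;
    return $x + $y;
}

# Hypothetical name: calls add_two_numbers, then multiplies by 3.
sub add_then_triple {
    my ($x, $y) = @_;
    my $sum    = add_two_numbers($x, $y);
    my $result = $sum * 3;
    return $result;
}
```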

This function adds two numbers, then multiplies their sum by 3. Bear with us on the uselessness of these functions. As you build your own programs to solve your unique problems, you'll see their usefulness immediately.

The line that assigns to <tt>$sum</tt> calls our function <tt>add_two_numbers</tt> and puts the result into our <tt>$sum</tt> variable. Easy stuff, huh?

In this function, we've actually written much more code than we need. It could be pared down to something much smaller, but equally readable:
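One way to shorten it, sketched with the hypothetical name <tt>add_then_triple</tt>, is to use the call directly inside the arithmetic instead of storing intermediate results:

```perl
use strict;
use warnings;

sub add_two_numbers {
    my ($x, $y) = @_;
    return $x + $y;
}

# The temporary variables are gone; the call is used as a value.
sub add_then_triple {
    my ($x, $y) = @_;
    return 3 * add_two_numbers($x, $y);
}
```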

Functions calling themselves — recursion
We've seen functions calling other functions, but one neat concept in programming is when functions call themselves. This is called recursion. At first it seems like this might cause a so-called infinite loop, but it's really quite standard programming.

In math, the factorial function multiplies a positive integer by each positive integer less than itself. For example, "5 factorial" (usually written <tt>5!</tt>) is calculated by multiplying 5 times 4 times 3 times 2 times 1. Of course, the 1 doesn't change the result. The factorial function is useful in calculating things like the number of different possible ways to seat your relatives at the dinner table.

Factorial makes a natural example for recursion, though it can be written just as easily with a <tt>while</tt> loop.
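Both styles are sketched below. The variable name <tt>$n</tt> and the function name <tt>factorial_loop</tt> are choices made for this example.

```perl
use strict;
use warnings;

# Recursive version.
sub factorial {
    my ($n) = @_;
    return 1 if $n <= 1;              # the stop sign: no further calls
    return $n * factorial($n - 1);    # the self-referential line
}

# The same calculation with a while loop.
sub factorial_loop {
    my ($n) = @_;
    my $result = 1;
    while ($n > 1) {
        $result *= $n;
        $n--;
    }
    return $result;
}

print factorial(5), "\n";         # prints 120
print factorial_loop(5), "\n";    # prints 120
```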

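The self-referential line here is the one that calls the factorial function from within the factorial function. This would go on forever, but we have a sort of stop sign that prevents it: a check that returns right away, without making another call, once the argument has counted down to 1.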

This stops the sequence of calls to factorial and prevents an infinite loop.

Written in a sort of longhand

factorial(5) = 5*factorial(4) = 5*4*factorial(3) = 5*4*3*factorial(2) = 5*4*3*2*factorial(1) = 5*4*3*2*1 = 120

We've just barely touched on recursion. For some programming problems, it is a very natural solution. For others, it's a little… unnatural. Suffice it to say, it's a tool every programmer should carry in their tool belt.

Functions vs procedures vs subroutines
While reading programming literature and talking to programmers, you might run across the three terms "functions", "procedures" and "subroutines". Most of the time these are used interchangeably, but for the purists out there:

A function always returns a value, and, given the same arguments, always returns the same value. Functions are a lot like the functions you may have used in math class.

A procedure, unlike a function, may return no value at all. Unlike a function, a procedure often interacts with its external environment beyond its argument array. For instance, a procedure may read and write to files.

A subroutine refers to a sequence of instructions that may be either a function or a procedure.
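To make the distinction concrete, here is a small sketch; both names are hypothetical. In Perl both are written with <tt>sub</tt>, so the difference lies only in how they behave.

```perl
use strict;
use warnings;

# A function in the purist sense: the result depends only on the argument.
sub square {
    my ($n) = @_;
    return $n * $n;
}

# A procedure: it acts on the outside world (here, STDOUT) and
# returns nothing useful.
sub log_line {
    my ($text) = @_;
    print "LOG: $text\n";
    return;
}

log_line("4 squared is " . square(4));    # prints "LOG: 4 squared is 16"
```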

The official grammar
The syntax for defining a subroutine is:

sub NAME PROTOTYPE ATTRIBUTES BLOCK

If this makes any sense to you, you probably don't need to be reading this book ;)
