Polymorphic Data Structures in C/Pointers

Pointers are one of the most essential constructs in C. A pointer is a variable that stores the memory address of another variable. Pointers allow a higher level of data manipulation: C passes arguments by value, so a function cannot modify a local variable that belongs to another function unless it is given that variable's address.

Simple Pointer Operations
Developers declare a pointer explicitly by placing the pointer operator (*) between the type and the name, as in <tt>int *p_i</tt>. The standard convention used in this book is to prefix the name of a pointer variable with "p_" to designate its status as a pointer. This is helpful when there are multiple variables of the same apparent type in a function, where some are pointers and some are not. To access the information a pointer points to, a programmer must use the dereference operator (also the asterisk), as in <tt>*p_i</tt>. If a variable is not a pointer, but needs to be modified inside a called function, then the address-of operator (&) can be used to create an "on-the-fly" pointer that is passed to the function. Many programmers understand that the <tt>scanf</tt> function needs the address-of operator in front of the variables read into, but not all of them know why. It is because <tt>scanf</tt> must store what it reads in the caller's variable, and it can only reach that variable through its address.

Pointers are usually used for keeping track of dynamically allocated memory. The demonstration described here uses a pointer that is redundant, but it illustrates the principles of pointers. First, an integer i is created as a local variable of main. Then, the pointer p_i is created and its value is set to the address of i. Next, i is assigned the value of 5 and printed. Finally, the integer that p_i points to is assigned the value of 7 and printed.

Notice the use of the dereference and address-of operators. Using the pointer, dereference, and address-of operators properly is key to understanding polymorphism.

Dynamic Memory Allocation
Perhaps the most important use of pointers is to keep track of memory that has been allocated dynamically (at run time). Memory is allocated dynamically using the <tt>malloc</tt> function. <tt>malloc</tt> takes one argument (the amount of space to be allocated, in bytes) and returns a pointer to <tt>void</tt>, which is assigned to a pointer of the required type. C performs this conversion implicitly, although many programmers write the cast out. If the request cannot be satisfied, <tt>malloc</tt> returns <tt>NULL</tt>.

The challenge with dynamic memory allocation is that type sizes can be machine-dependent, meaning an integer on one computer may not be the same size in memory as an integer on another. To overcome this, C provides the <tt>sizeof</tt> operator, which takes a type name (or an expression) and yields the size of that type, in bytes, as a value of the unsigned integer type <tt>size_t</tt>. For example, to dynamically allocate space in memory for one character, a developer would write <tt>malloc(sizeof(char))</tt>.

More commonly, dynamic memory allocation is used when the amount of space to be reserved is not known at the time of compilation. A typical program asks the user how many integers they would like to store, allocates space for exactly that many, and then stores them. <tt>malloc</tt> is declared in <tt>stdlib.h</tt>, so that header has to be included along with <tt>stdio.h</tt> in programs that use it.

The space reserved by <tt>malloc</tt> can be referred to using the standard syntax for arrays. This is because <tt>malloc</tt> reserves a single contiguous block (all in a row), so pointer arithmetic (discussed below) can be applied to it. In other words, when <tt>malloc</tt> reserves space for more than one object of a given type, the block can be treated as an array of that type.

Pointer Operations on Structures
In addition to the standard member operator (.), C provides a second member operator, <tt>-></tt>. It performs two operations at once: it dereferences the pointer to the left of the operator, then selects the member named on the right. This is helpful when a structure is referred to through a pointer. Given a pointer to the employee data structure from the previous chapter, say <tt>p_emp</tt>, the expressions <tt>(*p_emp).member</tt> and <tt>p_emp->member</tt> perform exactly the same operation. Because we will be working with pointers to structures for the majority of this book, the dereference-member operator will be used more often. Note that members of a nested structure (sub-members) must still be referred to using the normal member operator, unless the nested member is itself a pointer.

Using Pointers to Return Values
A function in C can return only one value directly. However, sometimes a single call needs to change several variables. To do that, the caller passes the function a pointer to each variable that must be changed. For more on this topic, please refer to Appendix 1, Function Invocation. The information there is extremely valuable, so make sure to go through it before moving on. The remainder of this book assumes that the reader has read and understands the contents of Appendix 1.

Pointer Arithmetic
Pointer arithmetic is simple in concept but somewhat difficult in practice. Pointer addition on arrays is done implicitly by the subscript operator []. For an array of integers A, the comparison <tt>&A[2] == A + 2</tt> evaluates to true (1). Adding to a pointer moves the address it refers to forward by the specified number of elements, that is, by that number multiplied by the size of the pointed-to type in bytes. So, <tt>&A[2]</tt> is the location of A[0], plus the size of two integers in memory (thus giving the third item in the array, since array indexing starts at 0). Pointer arithmetic can be applied to pointers to unions and structures as well, for example when stepping through an array of structures, but individual members are normally reached by member referencing rather than by offset.

Pointers to Functions
A very powerful feature of the C language is the ability to take the address of a function and store it as a pointer. In this manner, one can write a function that calls a different function known only at run time (decreasing the amount of conditional programming needed).

Every C programmer is familiar with the standard function declaration syntax, <tt>return_type name(parameter_list)</tt>. Similar to variables, functions are also typed. Assuming they are not of type <tt>void</tt>, all functions return data, and that data must have a type. As a side effect of this structure, a programmer can declare pointers to functions, using the form <tt>return_type (*p_name)(parameter_list)</tt>. The parentheses around <tt>*p_name</tt> are required; without them the line declares a function that returns a pointer.

A single line of code that calls through such a pointer can execute any function of a matching type. Consider a text manipulation program with two functions, <tt>print_reverse</tt> and <tt>print_normal</tt>. One simply prints the string, and the other prints the string's contents in reverse order. The actual function call inside <tt>main</tt> is the same in both cases. This is made possible by C's ability to call a function through a pointer to it, and to pass parameters to that function properly.