Programming Language Concepts Using C and C++/Object-Based Programming in C++

Initially called C with Classes, C++ has the class notion incorporated into the language. This is good news: some of the conventions and programming practices we had to adopt in C are now enforced by the compiler. Instead of having a header file containing the [intended] interface of an abstract data type, we now have a class definition in the header file, which is still intended to contain the interface; implementation details, such as utility functions and the structure of the object, are now listed as private parts of the class definition, rather than having their definitions deferred to the implementation file; constructor-like functions are replaced with true constructors that are implicitly called by the compiler-generated code.

The above list can be extended with other examples. But the point remains the same: object-based programming is much easier in C++. The only pitfall is giving in to the simplicity of the procedural paradigm and writing C++ code as plain old C. Apart from this, the following presentation should be a no-brainer for the initiate.

Interface
Complex

Access restriction to class members is specified by the labeled public, private, and protected sections within the class body. The keywords public, private, and protected are called access specifiers.
 * 1) A public member is accessible from anywhere in the program, regardless of the point of reference. Proper enforcement of information hiding limits the public members of a class to functions that can be used by the general program to manipulate objects of the class type.
 * 2) A private member can be accessed only by the member functions and friends of its class. A class that enforces information hiding declares its data members as private.
 * 3) A protected member behaves as a public member to a derived class and friends of its class, and behaves as a private member to the rest of the program.

A class may contain multiple public, private, and protected sections. Each section remains in effect until either another section label or the closing right brace of the class body is seen. If no access specifier is given, the section immediately following the opening left brace of the class body is private by default.
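The access rules above can be seen in a minimal sketch. The Account and SavingsAccount classes and their members are hypothetical names chosen for illustration only; they are not part of the Complex class discussed in this chapter.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate access sections.
class Account {
  double balance_;                 // before any label: private by default
public:
  Account() : balance_(0.0) {}
  void deposit(double amount) { if (amount > 0) balance_ += amount; }
  double balance() const { return balance_; }
protected:
  void reset() { balance_ = 0.0; } // usable by derived classes, not by clients
private:
  void audit() const {}            // usable only by members and friends
};

class SavingsAccount : public Account {
public:
  void close() { reset(); }        // fine: reset() is protected in the base
  // void peek() { audit(); }      // would not compile: audit() is private
};

void access_demo() {
  SavingsAccount s;
  s.deposit(10.0);                 // public members are reachable from anywhere
  assert(s.balance() == 10.0);
  s.close();
  assert(s.balance() == 0.0);
  // s.reset();                    // would not compile outside the hierarchy
}
```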

Definition: A default constructor is a constructor that can be invoked without user-specified arguments.

In C++ this does not mean it cannot accept any arguments. It means only that a default value is associated with each parameter of the constructor. For example, each of the following represents a default constructor:
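As a minimal sketch of what such declarations could look like, assume two hypothetical classes with double real and imaginary parts (the names NoParams and AllDefaults are invented for this illustration):

```cpp
#include <cassert>

// Hypothetical classes: every constructor below can be called with no
// arguments, so each one counts as a default constructor.
struct NoParams {
  double re, im;
  NoParams() : re(0.0), im(0.0) {}                 // no parameters at all
};

struct AllDefaults {
  double re, im;
  AllDefaults(double r = 0.0, double i = 0.0)      // every parameter has a default
    : re(r), im(i) {}
};

void default_ctor_demo() {
  NoParams a;               // uses NoParams()
  AllDefaults b;            // same constructor as the next two lines
  AllDefaults c(1.5);
  AllDefaults d(1.5, -2.0);
  assert(a.re == 0.0 && a.im == 0.0);
  assert(b.re == 0.0 && b.im == 0.0);
  assert(c.re == 1.5 && c.im == 0.0);
  assert(d.re == 1.5 && d.im == -2.0);
}
```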

In C++, a constructor that can be invoked with a single parameter serves as a conversion operator. It helps us provide an implicit conversion from values of the constructor's parameter type to objects of the class, which is later used by the C++ compiler in the process called function overload resolution.

In the current class, passing a double to the constructor will convert it to a Complex object with a zero imaginary part. A very convenient tool! After all, isn’t a real number a complex number with no imaginary part? However, this convenience may at times turn into a difficult-to-find bug. For instance, when passed a single double argument, the following constructor will act like a conversion operator from double to Complex. It will likely produce unexpected results.

When invoked with a single argument, this constructor will create a Complex object with the real part set to zero. That is, a single argument of 3 will correspond to 3i, not 3! In order to avoid such unwanted conversions while keeping the parameter list the same, you must modify the signature with the explicit keyword as in the following.

Such a use disables implicit conversion through that constructor. Note that in C++98 the explicit keyword can apply only to constructors; since C++11 it can also be applied to conversion functions.
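The pitfall and its remedy can be sketched as follows. The classes Loose and Strict and the two helper functions are hypothetical; they assume, as in the discussion above, a constructor whose first parameter is the imaginary part.

```cpp
#include <cassert>

// Hypothetical class: the first parameter is the imaginary part, so a lone
// double is silently turned into a purely imaginary number.
struct Loose {
  double re, im;
  Loose(double i, double r = 0.0) : re(r), im(i) {}          // converting constructor
};

// Same parameter list, but implicit conversion is switched off.
struct Strict {
  double re, im;
  explicit Strict(double i, double r = 0.0) : re(r), im(i) {}
};

double imag_of(Loose c) { return c.im; }
double imag_of_strict(Strict c) { return c.im; }

void explicit_demo() {
  assert(imag_of(3.0) == 3.0);                 // 3.0 silently became 3i
  // imag_of_strict(3.0);                      // would not compile: no implicit conversion
  assert(imag_of_strict(Strict(3.0)) == 3.0);  // the conversion must be spelled out
}
```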

Definition: An implicit type conversion done by the compiler is called coercion.

Thanks to this implicit conversion done through the following constructor, whenever a function or an operator accepts a Complex object we will be able to pass an argument of type double. Take the function signature on line 12, for instance. In addition to assigning a Complex object to another, this operator now enables assignment of a double to a Complex object, because the double value is first [implicitly] converted to a Complex object and then the assignment of this resulting object is performed.

Definition: A copy constructor initializes an object with the copy of a second. Usually, it takes a formal parameter of a reference to a const object of the class.

Note that this constructor, like the previous one, does not have a return type. This is not a typo! The constructor(s) (and the destructor, if there is any) of a class must not specify a return type, not even void.

In addition to function name overloading, C++ provides the programmer with an operator overloading facility. Overloaded operators allow objects of class type to be used with the built-in operators defined in C++, allowing their manipulation to be as intuitive as that of built-in types.

For overloading an operator, a function with a special name, which is formed by prefixing the word operator to the operator symbol, must be defined. It should be kept in mind that the arity, precedence, and associativity of the operator cannot be changed. For unary operators, the object receiving the message corresponds to the sole operand of the operator; for the rest, operand correspondence is established in a left-to-right fashion. According to the following declaration, for instance, the receiver object corresponds to the left-hand side of the assignment while the [explicit] formal parameter corresponds to the right-hand side.

That’s right! Unlike C, C++ has type support for Boolean values. Unfortunately, it [C++] keeps supporting C-style semantics for Boolean expressions. In other words, you can still use integral values instead of Boolean values. But then again you can be a good boy and start using variables of type bool.

If the assignment and addition operators are overloaded, users of the current class will naturally look for the corresponding overloaded compound assignment operator, +=.

Modifying the explicit arguments to be constant is not a big deal. Simply insert the const keyword before the type of the parameter and that will be all. But what if we want to make the receiver object (that is, the object the message is being sent to) constant? This argument, passed as the implicit argument to the function, cannot be modified in a similar fashion. The following signatures are examples of how this can be done: put the const keyword after the closing parenthesis of the parameter list and it will be taken to apply to the receiver object.
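A minimal sketch of both uses of const follows. The class layout and the member names real, imag, equals, and scale are assumptions made for this illustration and need not match the header discussed here.

```cpp
#include <cassert>

// Hypothetical stand-in for the class under discussion.
class Complex {
  double re_, im_;
public:
  Complex(double re = 0.0, double im = 0.0) : re_(re), im_(im) {}
  double real() const { return re_; }      // trailing const: receiver is read-only
  double imag() const { return im_; }
  bool equals(const Complex& rhs) const {  // both the argument and the receiver are const
    return re_ == rhs.re_ && im_ == rhs.im_;
  }
  void scale(double k) { re_ *= k; im_ *= k; }  // non-const: may modify the receiver
};

void const_demo() {
  const Complex c(1.0, 2.0);
  assert(c.real() == 1.0);   // allowed: real() promises not to modify c
  // c.scale(2.0);           // would not compile: scale() is not a const member
  Complex d(1.0, 2.0);
  d.scale(2.0);
  assert(d.imag() == 4.0);
  assert(!c.equals(d));
}
```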

Programmers can define&mdash;that is, provide the function body&mdash;the member functions of a class either within the class (in the header file) or outside it (in the implementation file). Providing the function body in the header file may not be a good idea especially if it reveals implementation details [since this would mean violating the information hiding principle]. Nevertheless, for functions as simple as the next two, it is not such a bad idea.

The next three functions are not members of the Complex class. They are provided to complement the class definition. Making the case for the equality test operator should convince us about the other two as well. With two variables of two possible types, we have four combinations of equality tests. The equality test of two Complex objects and [thanks to the conversion provided through the constructor serving as a user-defined conversion function] that of a Complex and a double are provided as class member functions. The equality test of two doubles is provided by the compiler. What is left is the test we need to make between a double and a Complex number. This is certainly not provided by the compiler. It cannot be provided as a class member function, either, because the left-hand side operand is a double and we know that in the Complex class definition this (the implicit parameter) is a constant pointer to a Complex object. So, we need to follow a different path to provide this functionality: a plain old global function taking a double and a Complex number as parameters.
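The four combinations can be sketched as below. This is an illustration under stated assumptions: the class body and the accessor names real and imag are hypothetical, and only the equality operator is shown.

```cpp
#include <cassert>

// Hypothetical stand-in for the class under discussion.
class Complex {
  double re_, im_;
public:
  Complex(double re = 0.0, double im = 0.0) : re_(re), im_(im) {}
  double real() const { return re_; }
  double imag() const { return im_; }
  // Member version: the receiver is always a Complex.
  bool operator==(const Complex& rhs) const {
    return re_ == rhs.re_ && im_ == rhs.im_;
  }
};

// Global version: lets a double appear on the left-hand side.
inline bool operator==(double lhs, const Complex& rhs) {
  return rhs.imag() == 0.0 && rhs.real() == lhs;
}

void equality_demo() {
  Complex c(3.0);
  assert(c == Complex(3.0));            // Complex on both sides
  assert(c == 3.0);                     // Complex on the left
  assert(3.0 == c);                     // double on the left: handled by the non-member overload
  assert(!(3.0 == Complex(3.0, 1.0)));  // a nonzero imaginary part never equals a real
}
```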

An inline function has its source expanded into the program at each invocation point, thereby eliminating the overhead associated with a function call. It can therefore provide a significant performance gain provided that the function is invoked sufficiently many times.

A member function defined within the class definition, such as the two simple functions above, is by default inline and such a function need not be further specified as inline. A member function defined outside the class body or any global function, on the other hand, must be specified to be so at its point of definition by prefixing the definition with the inline keyword, and the definition should be included within the header file containing the class definition.

Note that the inline specification is only a recommendation to the compiler. The compiler may choose to ignore this recommendation, because the function declared inline is not a good candidate for expansion at the point of call. A recursive function cannot be completely expanded at the point of call. Likewise, a large function is likely to be ignored. In general, the inline mechanism is meant to optimize small, straight-line, frequently called functions.

Notice the use of accessors, which actually contradicts our intention of producing faster code by means of inlining. Accessing the fields directly, instead of through accessor functions, would have been faster and more in line with the inline keyword on the next line. However, that's not possible. The object fields are declared to be, as expected, private, which means no one outside the class, including code of other classes within the same file, can manipulate them. Well, as a matter of fact, there is an exception. By declaring certain functions and/or classes to have special rights through the friend mechanism, one can gain direct access to the internals of a class. More on this is provided in the Exception Handling chapter.

Implementation
Complex.cxx

On certain occasions the C++ compiler implicitly calls the default constructor. These are:


 * 1) All components of a heap-based array will be initialized using the default constructor of the component class.
 * 2) If not provided with an explicit constructor call in the member initialization list, sub-objects inherited will be initialized using the default constructor(s) of the base class(es).
 * 3) If not provided with an explicit constructor call in the member initialization list, non-primitive fields making up the object will be initialized using their default constructors.

For this reason, as part of the design process, one should always give serious consideration to whether a default constructor is required or not.
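The three occasions listed above can be observed with a small counting sketch. Part, Base, and Whole are hypothetical types; the static counter exists only to make the implicit calls visible.

```cpp
#include <cassert>

// Hypothetical component type that counts its default constructions.
struct Part {
  static int default_built;
  int value;
  Part() : value(0) { ++default_built; }
  explicit Part(int v) : value(v) {}
};
int Part::default_built = 0;

struct Base {
  Part base_part;        // default-constructed as part of the base sub-object
};

struct Whole : Base {
  Part a;
  Part b;
  Whole() : b(7) {}      // Base and a are not mentioned: default constructors are used
};

void implicit_default_demo() {
  Part::default_built = 0;
  Part* parts = new Part[3];            // case 1: every array component
  assert(Part::default_built == 3);
  delete[] parts;

  Whole w;                              // cases 2 and 3: base sub-object and field a
  assert(Part::default_built == 5);
  assert(w.b.value == 7);               // b used the explicitly named constructor
}
```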

Thanks to the flexibility provided by default arguments, the next three lines all use the same constructor.

As expected, C++ will coerce actual parameters of other arithmetic types, such as int, to doubles and pass them to the appropriate constructor.

If not provided with a copy constructor, the compiler will make a copy of an object by calling the copy constructor of each instance field. For primitive and pointer types, this means bitwise-copying the field.

Definition: A class where all fields are bitwise-copied is said to support shallow copy.

For our example, shallow-copying serves the purpose. However, in cases where instance fields of the object point to other objects or need to be skipped or treated specially, such as a password field, we may need more than a shallow copy. Consider the linked implementation of a list. Will it suffice to copy the head and tail indicators or shall we have to copy the items too? The answer is: it depends. If you want the same list to be shared then shallow copy is the way to go. Otherwise, you need to override the default behavior of the compiler. Once overridden, the compiler will always use this function whenever a copy has to be made.

This constructor will be implicitly invoked whenever you pass an argument by value. Remember, an argument when passed by value will be copied on the run-time stack. Same can be said about returning an object from a function, which requires copying in the reverse direction. This very act of copying will be accomplished through the copy constructor. So, you should take some time deciding whether you need such a constructor or not.
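The difference between the compiler-generated copy and a programmer-provided one can be sketched as follows. Shallow and Deep are hypothetical classes holding a C string; they are not part of the Complex example, for which the default copy is sufficient.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class relying on the compiler-generated copy: the pointer
// is copied, so both objects end up sharing the same characters.
struct Shallow {
  char* text;
  explicit Shallow(char* t) : text(t) {}
};

// Hypothetical class that overrides the default and copies the characters.
class Deep {
  char* text_;
  static char* clone(const char* s) {
    char* copy = new char[std::strlen(s) + 1];
    std::strcpy(copy, s);
    return copy;
  }
public:
  explicit Deep(const char* t) : text_(clone(t)) {}
  Deep(const Deep& other) : text_(clone(other.text_)) {}   // copy constructor
  Deep& operator=(const Deep& other) {
    if (this != &other) {
      char* copy = clone(other.text_);
      delete[] text_;
      text_ = copy;
    }
    return *this;
  }
  ~Deep() { delete[] text_; }
  char* text() { return text_; }
};

void copy_demo() {
  char buffer[] = "abc";
  Shallow s1(buffer);
  Shallow s2(s1);                       // compiler-generated copy
  assert(s1.text == s2.text);           // same storage is shared

  Deep d1("abc");
  Deep d2(d1);                          // programmer-provided copy
  assert(d1.text() != d2.text());       // separate storage
  d2.text()[0] = 'x';
  assert(std::strcmp(d1.text(), "abc") == 0);   // the original is untouched
}
```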

Like other non-static member functions, overloaded operators take a pointer to the object on which the function is being applied. That is, given the declaration

can be seen as

which can further be seen as

The corresponding function definition has an implicit first formal parameter. This formal parameter named this can be used to refer to the object the message is being sent to. So, given the function definition

the compiler internally makes a function call to

whenever

is used in the code by transforming it into

Note that it is the pointer that is declared to be constant, not the contents of the memory pointed to by the pointer. The programmer can, directly or indirectly, change the object’s memory but cannot change the object that the function is being applied on.
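The role of the implicit parameter can be imitated with an ordinary function that takes the receiver explicitly. This is only a sketch of the idea; Counter, add, and Counter_add are hypothetical names, and a real compiler's internal naming and calling convention will differ.

```cpp
#include <cassert>

struct Counter {
  int count;
  // Member function: the receiver arrives through the implicit this pointer.
  Counter& add(int n) { this->count += n; return *this; }
};

// Roughly what the member function looks like once the implicit parameter
// is written out by hand.
Counter& Counter_add(Counter* const self, int n) {
  self->count += n;     // the object pointed to may be modified...
  // self = 0;          // ...but the pointer itself may not be changed
  return *self;
}

void this_demo() {
  Counter a = {0};
  Counter b = {0};
  a.add(5);             // message sent to a
  Counter_add(&b, 5);   // the same call with the receiver passed explicitly
  assert(a.count == 5);
  assert(a.count == b.count);
}
```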

The following function definition overloads the assignment operator. Such an operator is used to modify the contents of an already existing object with that of another. Unless provided with an overridden version, the compiler, similar to the copy constructor case, will call the default assignment operator of each and every field of the object. Otherwise, it will use one of the functions you provide. As to whether you need to override, the considerations listed in the case of the copy constructor apply and one should take special care to make the right decision.

The reason why we use a reference as the return value [and parameter] type of the function is to facilitate the chaining of assignment operators in the least expensive and easiest way possible. For instance, it is possible to have a statement like

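Such a chained assignment of three objects is carried out from right to left: the rightmost object is first assigned to the middle one, and the middle one is then assigned to the leftmost. That is, the middle object will first be used as the object to which the assignment message is sent and then as the parameter of another assignment message that is sent to the leftmost object.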

The next two lines can be written without the this keyword, because all unqualified references to the object’s memory are assumed to belong to the object pointed to by this. So, it can be rewritten as:

Note also the use of *this. This is another indication of the fact that this is not the receiver object itself but rather a constant pointer to it.
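A minimal sketch of an assignment operator written along these lines follows. The class body is a hypothetical stand-in; the field names re_ and im_ and the accessors are assumptions made for the illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the class under discussion.
class Complex {
  double re_, im_;
public:
  Complex(double re = 0.0, double im = 0.0) : re_(re), im_(im) {}
  Complex& operator=(const Complex& rhs) {
    re_ = rhs.re_;       // shorthand for this->re_ = rhs.re_
    im_ = rhs.im_;
    return *this;        // dereference the pointer to hand back the receiver
  }
  double real() const { return re_; }
  double imag() const { return im_; }
};

void chain_demo() {
  Complex a, b, c(1.0, 2.0);
  a = b = c;                       // parsed as a = (b = c)
  assert(b.real() == 1.0 && b.imag() == 2.0);
  assert(a.real() == 1.0 && a.imag() == 2.0);
  assert(&(a = c) == &a);          // the result refers to the receiver, not to a copy
}
```

Returning a reference rather than an object avoids constructing a temporary copy at every link of the chain.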

Function Overload Resolution
As was mentioned in the annotations before the signature of the default constructor in the header file, function overload resolution is about figuring out the function to be invoked. A function call is resolved in three steps, and the process ends with one of three possible outcomes: success, ambiguity, or no matching function.

Identification of the Candidate Functions
The first step of function overload resolution involves identification of the functions that have the same name as the function being called and are visible at the point of the call. This set of functions is also called the candidate functions. An empty set of candidate functions gives rise to a compile-time error.

Use of types defined in namespaces as argument types and/or importing identifiers from namespaces may lead to an increase in the size of this set.

The function call on the last line will lead to a candidate set of size three. Without the using directive, the function declared inside the namespace will not be visible and therefore will not be included in the candidate functions set. However, replacing the using directive with the following code sequence will again cause this function to be included in the set.
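How namespaces affect the candidate set can be sketched as below. This is not the chapter's original listing: the namespace lib, the type Token, and the describe overloads are hypothetical names invented for the illustration.

```cpp
#include <cassert>

namespace lib {
  struct Token {};
  int describe(Token)  { return 1; }
  int describe(double) { return 2; }
}

int describe(int) { return 0; }

// Only ::describe(int) is visible here, so it is the sole candidate and the
// double argument is converted to int.
int call_without_using(double d) { return describe(d); }

// The using directive makes both lib::describe overloads candidates as well.
int call_with_using(double d) {
  using namespace lib;
  return describe(d);
}

// An argument whose type lives in lib also pulls lib::describe into the set.
int call_with_namespace_argument() { return describe(lib::Token()); }

void candidate_demo() {
  assert(call_without_using(2.5) == 0);
  assert(call_with_using(2.5) == 2);
  assert(call_with_namespace_argument() == 1);
}
```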

As will later be discussed in the Inheritance chapter, depending on the language, the introduction of a new scope may affect the set of candidate functions differently. In Java, for instance, the new set is formed by taking a union of the former set and the set introduced by the new scope; in case of identical signatures, methods of the new scope shadow those found in the former. In C++, however, a function with a name clashing with one found in the former set replaces all function signatures having this name.
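The C++ behavior described above can be sketched with a derived class scope. Base, Hiding, and Merging are hypothetical classes; the using declaration in Merging is one way of getting the union-like behavior back.

```cpp
#include <cassert>

struct Base {
  int f(int)    { return 1; }
  int f(double) { return 2; }
};

// Declaring any f in the derived scope hides every Base::f overload.
struct Hiding : Base {
  int f(const char*) { return 3; }
};

// A using declaration re-introduces the Base overloads into the new scope.
struct Merging : Base {
  using Base::f;
  int f(const char*) { return 3; }
};

void hiding_demo() {
  Hiding h;
  Merging m;
  assert(h.f("x") == 3);
  // h.f(2.5);              // would not compile: Base::f(double) is hidden
  assert(h.Base::f(2.5) == 2);   // still reachable with explicit qualification
  assert(m.f(1) == 1);
  assert(m.f(2.5) == 2);
  assert(m.f("x") == 3);
}
```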

Selection of the Viable Functions
Second step consists of selecting the callable functions from the non-empty set formed in the first phase. This requires elimination of functions that do not match the number of arguments and their types. The set of functions obtained by the end of this phase is called the viable functions. An empty set of viable functions means no matching function and causes a compile-time error to be emitted.

Note that we are not seeking a perfect match here. As far as the number of arguments is concerned, default values increase the size of this set. For instance, a function call with a single argument can be directed not only to a function with a single parameter but also to functions with n parameters all of which, except the first, are guaranteed to have a default value. Similarly, for argument types, conversions that can be applied to the arguments are also considered. As an example of this, an argument of one arithmetic type can be passed to a corresponding parameter of several other arithmetic types.

For programmers from safer languages such as Java or C#, the inclusion of a narrower type in this set may come as a surprise. Being a narrowing conversion, passing an argument of a wider type to a parameter of a narrower type will probably lead to information loss and therefore is deemed to be a violation of the contract by these languages. In order to make this happen one must explicitly cast the argument to the narrower type. It is an entirely different story in C++, however. A C++ compiler will happily consider the missing cast a keystroke-saving activity and accept the function as viable. If this happens to be the one and only viable function, the function call in question will be dispatched to it.
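Both ways in which the viable set grows, default values and implicit narrowing, can be sketched with two hypothetical functions (whole_part and area are invented names):

```cpp
#include <cassert>

// The only function named whole_part: it is viable for a double argument
// because double-to-int is an allowed implicit conversion.
int whole_part(int n) { return n; }

// Viable for calls with one or two arguments thanks to the default value.
int area(int width, int height = 1) { return width * height; }

void viable_demo() {
  double d = 3.9;
  assert(whole_part(d) == 3);   // fraction silently lost, no cast required
  assert(area(4) == 4);         // height falls back to its default
  assert(area(4, 5) == 20);
}
```

Many compilers can be asked to warn about such silent narrowing, but the call is well-formed as far as the language is concerned.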

Argument Conversions
This brings us to the topic of the possible conversions applicable to an argument. In addition to an exact match, a C++ compiler expands the set of viable functions by applying a series of conversions to the arguments. These fall into two different categories: exact match and type conversion.

The exact match conversions are minor ones and can further be treated in four subgroups:


 * 1) lvalue-to-rvalue conversion: Seeing argument passing as a special case of assignment, where argument is assigned to the corresponding parameter,  this conversion basically fetches the value in the argument and copies it into the parameter.
 * 2) Array-to-pointer conversion: An array is passed as a pointer to its first element. Note this and the following conversion are basically a part of the C/C++ language.
 * 3) Function-to-pointer conversion: Similar to the previous item, a function identifier, when passed as an argument, is transformed into a pointer to function.
 * 4) Qualification conversion: Applicable only to pointer types, this conversion transforms a plain pointer type by modifying it with either one or both of the const and volatile modifiers. It should be noted that no type conversion takes place in the case of an argument being passed to a parameter of the same type with one or two of these modifiers.
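The four exact match conversions listed above can be seen at work in a short sketch. The functions length, apply, and twice are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Takes a pointer to const characters.
std::size_t length(const char* s) {
  std::size_t n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Takes a pointer to function.
int apply(int (*fn)(int), int x) { return fn(x); }

int twice(int x) { return 2 * x; }

void exact_match_demo() {
  char word[] = "abc";               // an array of four chars
  // Array-to-pointer (char[4] to char*), then qualification (char* to const char*).
  assert(length(word) == 3);
  // Function-to-pointer: the identifier twice becomes a pointer to the function.
  assert(apply(twice, 21) == 42);
  // Lvalue-to-rvalue: the value stored in n is fetched and copied into x.
  int n = 5;
  assert(twice(n) == 10);
}
```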

The second category of argument conversions, type conversions, can be examined in two groups: promotions and standard conversions. The former group, also called widening conversions, contains conversions that can be performed without information loss. These include the following:


 * 1) An argument of type char, signed char, unsigned char, or short is promoted to int. If the compiler used supports a larger size for int than for short, unsigned short is also promoted to int. Otherwise it is promoted to unsigned int.
 * 2) An argument of type float is promoted to double.
 * 3) An enumeration type is promoted to one of int, unsigned int, long, or unsigned long. The decision is made by choosing the first type in this list that can represent all values in the enumeration.

The second group of type conversions, standard conversions, is divided into five:


 * 1) Integral conversions that are not promotions, including narrowing integral conversions.
 * 2) Floating point conversions that are not promotions, for instance double-to-float.
 * 3) Conversions made between floating point and integral types, in either direction.
 * 4) Conversion of 0 to a pointer type and conversion of any pointer type to void*.
 * 5) Conversions from integral types, floating point types, enumeration types, or pointer types to bool.

Note that this long list of rules contains quite a few details that are due to the "low-level" nature and systems-programming aspect of C++, which were inherited from C. For example, having no type to hold logical values, C takes care of it by adopting a convention used in low-level programming: zero stands for false and anything else is interpreted to be true. Hence the conversion from other types to bool. Similarly, architectures tend to provide better support for word-size data, which in C/C++ is called int. A typical example of this is the amount of adjustment made while pushing data on the hardware stack. Unless the compiler packs them, all data pushed (read it as "all arguments passed") smaller than or equal to the word size (read it as int) will be adjusted to a word boundary. This basically means all such data will be widened to the word size, which explains the first item of promotions. Add to this the environment-dependent size of integral types and the complexity introduced by having two versions (signed and unsigned), and you can understand why it suddenly turns into a nightmare.

In the process of conversion the C++ compiler can apply either one of two sequences. In the first sequence, which is called the standard conversion sequence, it is permitted to apply zero or one exact match conversion (with the exception of qualification conversion), followed by zero or one promotion or standard conversion, which may further be followed by zero or one qualification conversion. The second sequence involves application of a user-defined conversion function, which may be preceded and followed by a standard conversion sequence. That is, the standard conversion sequence may be applied twice, once before and once after the user-defined conversion, but only a single user-defined conversion is applied implicitly.

Finding the Best Match
In the final step of resolving the function call, the C++ compiler picks the viable function with the best match. Two criteria are used in determining the best match: the conversions applied to the arguments of the best match function are no worse than the conversions necessary to call any other viable function; and the conversions on some arguments are better than the conversions necessary for the same arguments when calling the other viable functions. In case there turns out to be no such function, the call is said to be ambiguous and causes a compile-time error.

In finding the best match, the compiler ranks the viable functions obtained in the previous step. According to this ranking, an exact match conversion is better than a promotion and a promotion is better than a standard conversion. A viable function is given the rank of the lowest ranked conversion used in transforming the arguments to the corresponding parameters.
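The ranking can be sketched with three hypothetical overloads of a function named which; each returns a number identifying the overload that was picked.

```cpp
#include <cassert>

int which(int)    { return 1; }
int which(long)   { return 2; }
int which(double) { return 3; }

void best_match_demo() {
  short s = 7;
  float f = 2.5f;
  assert(which(42) == 1);    // exact match
  assert(which(42L) == 2);   // exact match
  assert(which(s) == 1);     // short-to-int is a promotion; the others are standard conversions
  assert(which(f) == 3);     // float-to-double is a promotion; the others are standard conversions
  // which(42u);             // would not compile: all three need a standard conversion, so the call is ambiguous
}
```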

Test Program
The next program, apart from providing examples of the function overload resolution process, shows that in C++ one can create objects in all three data regions. In some object-oriented programming languages, such as Java, objects are always created in the heap. This "limitation" of Java (or, put differently, this "freedom" offered by C++) can be attributed to the language design philosophy. Being a direct descendant of C, C++ provides alternatives and expects the programmer to choose the right one. Also a descendant of C, albeit a more distant one, Java tends to provide a simpler framework with fewer alternatives.

In this case, C++ offers programmers an alternative that does away with using pointers. Like variables of a C struct or objects of a value type in C#, one can directly manipulate objects of classes. In other words, manipulating objects indirectly by means of handles is not the only option. This means less space for the object. However, polymorphism (and therefore object-orientation) is not an option anymore. After all, polymorphism requires that the same message be dispatched to possibly different subprogram definitions depending on the dynamic type of the object, which means we should be able to use the same identifier to refer to objects of different types. This further implies that the memory required for the object indicated by the identifier may change. Since the static data region deals with data of fixed size, we cannot possibly put the object in this part of the program memory. Similarly, since the size of memory allocated on the run-time stack should be known beforehand by the compiler, the run-time stack is also out of the question. We must conjure up a solution where both parties are satisfied: the compiler is given a fixed-size entity, while the variable-size object requirement of inheritance is met. This is accomplished by creating objects on the heap, which is the only place left, and manipulating them through an intermediary. Enter the object handle!

So, enabling polymorphism requires that objects be manipulated indirectly, through pointers or references, which in practice usually means creating them on the heap. That explains why Java, like many other programming languages claiming to be object-oriented, creates objects the way it does. But it doesn't offer any explanation for the lack of creating objects the C++ way. Answering another question (what benefit do we get when we create objects the C++ way?) will provide the explanation: we get faster and object-based (not object-oriented!) solutions. Since polymorphism is out of the question and therefore dynamic dispatch is not required anymore, all our function calls can be statically dispatched, which by the way is the default in C++. But then it starts to be a little confusing with all these paradigms. In addition to procedural programming, thanks to its C heritage, and object-oriented programming, we now have object-based programming. What makes things worse is that the default mode of programming in C++, since the default dispatch type is static, is object-based. In Java, the default dispatch type is dynamic and therefore the default programming paradigm is object-oriented. This means the programmer, expecting to do some object-oriented programming, is not confused by the presence of alternatives; she does not need to tell the compiler that she wants to do object-oriented programming. Add to this the utilization of the object-oriented paradigm in realizing the open-closed principle, and this seems to be a safer choice for producing extensible software.

Complex_Test.cxx

The following line creates an object of class Complex in the static data region. This object will therefore exist throughout the entire execution of the program and its allocation/deallocation will be among the responsibilities of the compiler. This line could also have been written in two alternative forms.

Note that the second form is possible only when we pass one argument to the constructor.

The next four instantiations create four Complex objects on the run-time stack. Each time a subprogram is invoked or a block is entered, objects local to the subprogram/block will be created and allocated on the run-time stack. Upon exit from the subprogram/block, the objects will be automatically deallocated through changing the value of the stack pointer, which points to the topmost frame on the run-time stack. So, the lifetimes of local objects are limited to the block that they are defined in.

Note the fourth object is created through the use of the copy constructor. By virtue of this statement, the new object and the object it is copied from both have the same object-memory configuration. But realize they are not the same object.

The next line might at first look like a compile-time error. After all, there is no function adding a Complex object and an int. As was mentioned in the previous section, the int is converted to a double and the addition of a Complex object with a double is performed. But then you say: I cannot see such a function! Thanks to the constructor with default arguments, this int value converted to double is later passed to this constructor and a Complex object is constructed. Now that the addition of two Complex objects has been defined, the request is fulfilled.

The next line creates an object of class Complex on the free store (heap). This is made known to the compiler by our use of the new operator, which also signals that the created object will be managed by the programmer.

Such a use of the new operator is equivalent to the following pseudo-C++ code:

In other words, the new operator first allocates the area needed for the object and then implicitly calls the appropriate constructor for initialization.
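The two steps can be spelled out by hand with the global allocation function and placement new. This is a sketch of the idea, not the pseudo-code of the original listing; Point is a hypothetical class, and the manual clean-up at the end mirrors what the delete operator does in the opposite order.

```cpp
#include <cassert>
#include <new>      // placement new, ::operator new, ::operator delete

struct Point {
  double x, y;
  Point(double x_, double y_) : x(x_), y(y_) {}
};

void new_demo() {
  // The usual form: allocation and construction in one expression.
  Point* p = new Point(1.0, 2.0);

  // Roughly the same thing in two explicit steps.
  void* raw = ::operator new(sizeof(Point));   // step 1: allocate raw memory
  Point* q = new (raw) Point(1.0, 2.0);        // step 2: run the constructor in place

  assert(p->x == q->x && p->y == q->y);

  delete p;                 // destructor call followed by deallocation
  q->~Point();              // the same two steps by hand: destroy...
  ::operator delete(raw);   // ...then release the memory
}
```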

Upon completion of the next object creation, we will have the partial memory image given below. Observe objects have been created in all three data regions.



One should not mistake a pointer for the storage pointed to by the pointer. Although pointers come into and go out of existence as subprograms are invoked and returned from, the memory area pointed to by the pointers (if it has been allocated in the heap region) may outlive the invocation of the subprograms. This is because such areas are managed by the programmer and she may return them at any time she sees fit. So, let’s repeat it once more: It is not the pointer that is dynamically managed but the area of memory that is pointed to by the pointer.

Assuming left-to-right evaluation order [for pedagogical purposes] next line will be carried out as follows:


 * 1)   is promoted to.
 * 2) Using     is subtracted from , which is now a   value. In the process,   is first converted from   to.
 * 3) Using     is added to the result obtained in step 2. As in the previous step,   is first qualified with.
 * 4) Utilizing the constructor with default argument values, compiler converts   to a   object.
 * 5) Using   subtract , now an object of  , from the result obtained in step 3.
 * 6) Finally, the value obtained in step 5 is assigned, by means of the programmer-provided assignment operator, to the memory region pointed to by the pointer.

The delete operator is used to deallocate heap memory acquired through new. One should make sure that all unused memory is returned to the system for possible reuse.

Applying delete to a pointer is equivalent to first calling the destructor on the object pointed to and then releasing the memory it occupies.

In other words, before the memory area pointed to by the argument is deallocated by delete, other resources used by the object are released in a special function called the destructor. This may include anything that is not managed by the compiler. Examples are operating system resources such as file handles, sockets, semaphores, and so on; database connections managed by a DBMS; or other heap memory reachable from the pointer, which is managed by the programmer.

In C++, this special function is given the name of the class prefixed with a tilde. It can neither return a value nor take any parameters. That’s why it cannot be overloaded. Although we can define multiple class constructors, we can provide a single destructor to be applied to all objects of our class.

As a matter of fact, we may choose not to provide a destructor at all. This is the right decision when the objects of a particular class are known to utilize no outside resources. In other words, if the data members are contained by value (that is, there is no pointer field among the members) and no resource that lies outside the jurisdiction of the compiler is ever acquired, it is not necessary that we provide a destructor. For this reason, we don’t implement a destructor for the current class. All data members are contained by value. That is, we have the relevant information in the object itself, not pointers to some variable-sized information lying somewhere in the heap. The compiler can deal with such fixed-size information by simply freeing the region pointed to by the argument of the delete operator. But when it comes to dealing with variable-sized information or outside resources, the programmer must provide some extra help. And this is what we have the destructor for. Absence of this assistance means wasting precious system resources, which is very likely to lead to a crash. For this reason, one should seriously consider whether a destructor is needed or not.

Definition: Orthodox Canonical Form is a set of functions one should give special treatment in the process of implementing a class. These functions include: default constructor, copy constructor, assignment operator, equality-test operator, and destructor.

It should be underlined that automatic garbage collection does not relieve us from considering the need for a destructor-like function. A garbage collector solves part of the problem: it deals with deallocation of heap memory. We now do not have to think about whether data members are inline or not. Everything pertaining to heap data will be taken care of by the garbage collector. But what about other outside resources? They still need to be handled by the programmer in a destructor-like function, called a finalizer.

Definition: In languages with automatic garbage collection, the implicitly called special function needed for cleaning up the non-heap outside resources utilized by an object is called a finalizer.

As a final note, one must keep in mind that the intimate relation between the destructor and the heap does not mean the destructor is called only upon deallocating a heap object. Even when the object in question does not have any outside resources, the destructor is called, this time implicitly by the compiler-synthesized code, upon exiting a block (for local objects) or at the end of a program (for global or static local objects).
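Both ways of reaching the destructor can be observed with a small counting sketch. Buffer is a hypothetical class that owns heap memory through a pointer field and therefore needs a destructor; the static counter exists only to make the calls visible.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class holding heap memory, so a destructor is required.
class Buffer {
  int* data_;
  Buffer(const Buffer&);             // copying disabled to keep the sketch short
  Buffer& operator=(const Buffer&);
public:
  static int live;                   // number of Buffer objects currently alive
  explicit Buffer(std::size_t n) : data_(new int[n]) { ++live; }
  ~Buffer() { delete[] data_; --live; }   // releases what the compiler cannot
};
int Buffer::live = 0;

void destructor_demo() {
  {
    Buffer local(8);                 // object on the run-time stack
    assert(Buffer::live == 1);
  }                                  // block exit: destructor called implicitly
  assert(Buffer::live == 0);

  Buffer* heap = new Buffer(8);      // object on the heap
  assert(Buffer::live == 1);
  delete heap;                       // destructor called, then memory released
  assert(Buffer::live == 0);
}
```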