Programming Language Concepts Using C and C++/Exception Handling

Exception and error handling in C, if there is any, is done through special return values that cannot be produced by normal completion of the function call. Take printf or one of its friends, for instance. If everything is OK, a call to one of these functions returns the number of characters sent to the output stream. In case of an error, a negative value is returned, a value that could never be returned if everything went OK.

Or, say you implement a function that finds the number of occurrences of a particular item in a container. That would normally mean returning zero or a positive integer. What if the container does not exist and you want to report this to the caller? Returning zero doesn’t help; it means the item does not occur in the container. A positive value will not be of any help, either. How about returning a negative value, such as -1?
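A minimal sketch of this convention, assuming a hypothetical function count_occurrences that works on a plain array of int and treats a NULL pointer as "the container does not exist":

```c
#include <stddef.h>

/* Hypothetical helper: counts how many times `item` occurs in `items`.
 * Returns -1 when the container does not exist (items == NULL), a value
 * that no successful call can produce. */
int count_occurrences(const int *items, size_t length, int item) {
    if (items == NULL)
        return -1;                     /* special value: no container */
    int count = 0;
    for (size_t i = 0; i < length; i++)
        if (items[i] == item)
            count++;
    return count;                      /* zero or a positive integer */
}
```

The caller must remember to compare the result against -1 before using it as a count; nothing in the language enforces that check.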

That answers our worries related to handling exceptional conditions. But writing code to this convention will not be much fun. Consider a code fragment in which the programmer does her best to check the result of each function call.
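A sketch of what such code tends to look like, assuming a hypothetical task (copying the first line of one file into another) and hypothetical error codes; only the standard I/O calls are real:

```c
#include <stdio.h>

/* Hypothetical example: copies the first line of src_name into dst_name,
 * checking every library call. Returns 0 on success and a distinct
 * negative code for each kind of failure. */
int copy_first_line(const char *src_name, const char *dst_name) {
    char line[256];
    FILE *src = fopen(src_name, "r");
    if (src == NULL) {
        return -1;                         /* cannot open the source */
    } else {
        FILE *dst = fopen(dst_name, "w");
        if (dst == NULL) {
            fclose(src);
            return -2;                     /* cannot open the destination */
        } else {
            if (fgets(line, sizeof line, src) == NULL) {
                fclose(src);
                fclose(dst);
                return -3;                 /* nothing to read */
            } else {
                if (fputs(line, dst) == EOF) {
                    fclose(src);
                    fclose(dst);
                    return -4;             /* write failed */
                }
            }
            fclose(src);
            if (fclose(dst) == EOF)
                return -5;                 /* flushing the output failed */
        }
    }
    return 0;
}
```

The useful work is the fgets and fputs pair; everything else is bookkeeping for failures.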





Looks messy, huh? Figuring out the control flow is an unbearable task of plowing through countless ifs and elses. The program is cluttered with error handling code, which makes it a nightmare for the maintainers. It almost makes you think it couldn’t get any worse.

If only we could guarantee a failure-free program, the above code would be cut down to a couple of lines and be much easier to read and understand. Or, if only we could isolate the error handling code from the rest, that would be great. That’s exactly what is done by the exception handling mechanism found in many languages: it isolates the parts of the code that deal with unexpected conditions so that programs can be maintained more easily.

This, however, is not possible in C. In C, you must either take the well-known path [and fill your code with a zillion ifs and elses] or use the setjmp-longjmp pair. In this handout, we will take a look at the latter and provide an introduction to two complementary notions, assertions and signals.

But before we move on, and in order to give some insight into the inner workings of the exception handling mechanism, we’ll say a few words about how exceptions are handled in Win32 operating systems.

Exception Handling in Win32: Structured Exception Handling (SEH)
Fundamental to exception handling in Win32 is the registration of exception handlers. This is done by inserting an exception registration record into a linked structure, the first node of which is pointed to by FS:[0]; as guarded blocks are entered and left, new records are inserted and removed, respectively.

For a simple presentation, compile (and link) the following C program and run the executable.

Exc_Test.c

The next function contains the handler code for EXCEPTION_ACCESS_VIOLATION, which means the thread has tried to access some out-of-reach memory. This can happen as a result of attempting to write to a read-only section of memory or reading from some region without appropriate rights. As this exception is handled by putting a valid address where the offending one was, we return ExceptionContinueExecution, meaning the instruction that generated the exception will be tried once more.

If the thrown exception is not an access violation, control is transferred to the next handler in the list by returning ExceptionContinueSearch.

CONTEXT, not surprisingly a CPU-dependent structure, is made up of fields containing the machine state and is meant to hold the snapshot taken at the time the exception takes place. Complementing this is the CPU-independent EXCEPTION_RECORD structure, which holds information about the most recently raised exception, such as its type, the address of the instruction where it occurred, and so on. When an exception occurs, the operating system pushes these structures, together with a third one containing pointers to each of them, on the stack of the thread that raised the exception.

Here is our second handler function, which deals with the [integer] divide-by-zero exception. Nothing new with the code!

One point worth mentioning, though: handler functions could have been merged into a single one. There is no rule saying that each and every exception must have its own handler function. In our case, the following function would have served the purpose, too.





The following assembly block registers handlers for probable exceptions. It does this by inserting exception registration records (one for access violation and one for divide by zero) at the front of the linked list used to hold information about exception handlers. This [exception registration] structure has two fields: a pointer to the previous structure (that is, the next node in the list) and a pointer to the callback function to be called in case the exception occurs.

This is pretty much the code fragment that a Win32 compiler would produce on entry to a guarded block: it registers the code of the related handler blocks as callbacks by adding them to the head of a list whose first item is pointed to by the value contained in FS:[0]. Upon completion of the guarded block, any handlers registered on entry are removed from the list.

By the time control reaches this point, a partial image of memory related to exception handling will be as given below. The pointer emanating from the last record in the list points at the default handler, which will be called to handle any unhandled exception.



Next, we try to store 1 into the four-byte memory region whose starting address is held in a register. Our attempt will, however, result in an exception, since that register contains zero and address zero is off-limits to processes in Win32 systems.

The next assembly block removes the exception records inserted earlier. This is roughly equivalent to the code a Win32 compiler would generate upon exiting from a guarded block.

Now that our divide-by-zero handler has been removed from the list, the divide-by-zero exception thrown in the following block will be handled by the default handler (a system function), which simply displays the well-known, most annoying message box.


 * cl /w /FeExcTest.exe Exc_Test.c /link /SAFESEH:NO↵
 * ExcTest↵
 * In the handler for INT_DIVIDE_BY_ZERO
 * Cannot handle exceptions other than div. by zero
 * Moving on to the next handler in the list
 * In the handler for ACCESS_VIOLATION
 * Handled access violation exception...


 * Handled divide by zero exception...
 * Will intentionally try dividing by zero once more!


Exception Handling in C Using Microsoft Extensions
Using Microsoft extensions you can write [non-portable] C programs with exception handling, albeit somewhat differently. The following is a simple example of this.

Exc_SEH.c

Following is an example of an exception filter function. An exception filter decides what kind of action is to be taken when an exception occurs. In doing so, it may react differently depending on the type of the exception. For instance, our filter function offers some remedial action in the case of an access violation exception, while for other exception types it defers the decision to the next handler in the list.

Similar to the constructs provided at the programming language level, Microsoft SEH has the notion of a guarded region that is followed by handler code (__except or __finally). One major difference is the number of handlers: in SEH a particular __try can be followed by either an __except or a __finally, and only by one of them.

Before executing the guarded region, the related handler (lines 18 to 22) is registered with the operating system by means of compiler-synthesized code, which is roughly equivalent to lines 39-46 of the previous section. Following this, the guarded region is executed and results in an exception, which is filtered through the expression passed to __except. In our case, the filter decides to go on with execution of the current handler if it’s a divide-by-zero exception; otherwise, it defers the decision to the next handler in the list.

By the time control reaches line 23, this same handler will have been unregistered thanks to the code synthesized by the compiler, which is roughly the same as lines 57-64 of the previous section.

Instead of a plain filter expression, the next __except has a filter function, which serves exactly the same purpose.

Since all user-written handlers have been removed by then, the exception due to the execution of line 29 will be handled by the default handler offered by the operating system, which is responsible for handling all unhandled exceptions.


 * cl /FeExcSEH.exe Exc_SEH.c↵
 * ExcSEH↵

Interface
List.h

The following directive is needed to bring in the type definitions and function prototypes needed to make jumps across function boundaries. Apart from the prototypes for setjmp and longjmp, this header file includes the definition of the buffer type whose instances are used to communicate between the aforementioned functions.

jmp_buf is a machine-dependent buffer used to hold the program state information. What we do is basically fill in this buffer before we execute a block of code that can possibly produce an exception and, in case an exception is raised, use the values found in it to restore the environment that existed prior to the execution of the block.

ListExtended.h ListIterator.h

Implementation
List.c

Déjà vu all over again: A handle pointing to some metadata, which in turn contains a pointer to the container that holds the components of the collection.

This recurring pattern is a good starting point when you design a collection. The handle is there to prevent users from manipulating the data structure directly and is also useful for maintenance, since it takes up a constant amount of memory; the metadata is there to hold the state information about and/or attributes of the structure; and the container is there to physically hold the data.

Observe we make use of a type (struct _LIST) that is yet to be defined. Compiler does not complain a bit about it because we don’t use the type itself but a pointer to the type, which has a known size.

What follows is the definition of a single list node. This definition, which is tightly coupled with the list, could have been moved into the list type definition as shown below.

struct LIST {
  COMPARISON_FUNC compare_component;
  COPY_FUNC copy_component;
  DESTRUCTION_FUNC destroy_component;
  unsigned int size;
  struct _LIST {
    Object info;
    struct _LIST* next;
    struct _LIST* prev;
  }* head;
};

The Java version of the following definition uses the name of the type being defined, while in C this is impossible: a type name cannot be used before its definition is completed. This is not a problem for pointers, though. Whatever the size of the object they point to, their own size does not change. For this reason, whenever we use recursion in the description of a data type, it is likely to be formulated with a pointer.
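The point about pointers to not-yet-defined types can be tried out with a small sketch. The tag struct _LIST follows the text; LIST_META and push_front are hypothetical names, and the node is simplified to hold an int:

```c
#include <stdlib.h>

struct _LIST;                          /* declared here, defined further down */

struct LIST_META {                     /* hypothetical metadata record */
    unsigned int size;
    struct _LIST *head;                /* allowed: a pointer has a known size */
};

struct _LIST {                         /* the node refers to itself via a pointer */
    int info;
    struct _LIST *next;
};

/* Hypothetical helper: puts a new node at the front of the list.
 * Returns 0 on success and -1 if memory cannot be allocated. */
int push_front(struct LIST_META *meta, int value) {
    struct _LIST *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->info = value;
    node->next = meta->head;
    meta->head = node;
    meta->size++;
    return 0;
}
```

Replacing the pointer member next with a member of type struct _LIST itself would be rejected by the compiler, because the size of the structure would then depend on itself.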

All operations applied on a ListIterator object will effectively act on an underlying List object. Since we can have more than one List object in an application and a particular iterator can be used to traverse only one of them, our structure contains, in addition to the iterator-specific fields, a reference to the enclosing List structure.



This is similar to what gets done in the implementation of inner classes in Java: when transformed into Java 1.0 for the purpose of generating Java virtual machine bytecodes, the signature of each non-static inner class constructor is modified so that it receives the enclosing instance as its first argument. As soon as the constructor takes control, this value is stored in a compiler-generated field. For example,

List.java



will be transformed into

List.java



Note the access specifier for the synthesized class: package-friendly, not private. This is due to the fact that top-level classes cannot be private or protected. But then, why not use public instead? The answer is that there can be only one public class in a single file, and that happens to be the List class.

But promoting the class from private to package-friendly means it could be used by classes it was not intended for. The safeguard lies in the synthesized class name: such a name cannot be used in the source code, and this is enforced by the compiler.

longjmp is a function that provides for jumps across function boundaries. In doing so, it makes use of the jmp_buf structure formerly filled in by a call to the setjmp function.

In a way, longjmp can be seen as 'a jump on steroids'. In addition to jumping to some other instruction [probably in a different function], it ensures that the program is in a valid state by restoring the machine state with the values found in the jmp_buf structure. So, the following longjmp will jump to the location where the corresponding setjmp was issued and unwind the runtime stack in the meantime. While doing so, it conveys information about the nature of the exceptional condition through the second argument passed to it, which in this case is supposed to reflect the fact that the list we want to peek in is an empty one.

Note that this unwinding process does not undo any modifications made to memory or the file system; it simply re-establishes the machine state as it was before. Any changes made to the contents of primary and secondary memory are retained. If you need a more complete unwind, you must take charge and do it yourself.
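That longjmp leaves memory contents alone can be seen in a small sketch. All names here (checkpoint, log_entries, risky_step, entries_after_failure) are hypothetical and serve only as illustration:

```c
#include <setjmp.h>

static jmp_buf checkpoint;
static int log_entries = 0;            /* ordinary memory, not part of the saved machine state */

static void risky_step(void) {
    log_entries++;                     /* a side effect made before the failure */
    longjmp(checkpoint, 1);            /* bail out to the setjmp site */
}

/* Returns the number of log entries left behind by the failed step. */
int entries_after_failure(void) {
    log_entries = 0;
    if (setjmp(checkpoint) == 0) {
        risky_step();                  /* never returns normally */
    }
    return log_entries;                /* the increment was not rolled back */
}
```

If the counter had to be restored as well, the code reached after the jump would have to reset it explicitly.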

This type of behavior is similar to that of an exception: when a thrown exception is not handled in the current subprogram, the runtime stack is unwound and control is transferred back to the caller (which is actually a jump across the current function’s boundary) in the hope that it will take care of the exception.

Executing setjmp will fill in the jmp_buf argument passed to it with values that reflect the machine state at the point of its invocation. As it completes, setjmp will return 0. The next step is to call the function that may give rise to an exceptional situation. If everything goes our way, control will return to the statement following the function invocation, which is a return in our case. Otherwise, control will return to the point where setjmp was issued, and the value passed as the second argument of longjmp will appear as the result of that setjmp call.



We can summarize what happens as follows:


 * Record the machine state by a call to setjmp, which returns 0.
 * Call the function that may give rise to an exceptional condition.
 * If everything is OK, return a legitimate value from the function by a return statement. Control will flow as if no special arrangements had been made. Otherwise, leave the function by calling longjmp. Doing so will (in addition to unwinding the runtime stack) return control to the point where setjmp was called. From there, execution continues as if setjmp had just returned, with the value passed as the second argument of longjmp as its result.
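The steps above can be sketched as follows. This is not the List module of this chapter: struct int_list, peek_front, try_peek and the code EXC_LIST_EMPTY are hypothetical stand-ins chosen to keep the example short.

```c
#include <setjmp.h>
#include <stddef.h>

#define EXC_LIST_EMPTY 1               /* hypothetical code; must be nonzero */

/* Hypothetical stand-in for a list: an array plus its length. */
struct int_list {
    const int *items;
    size_t size;
};

/* Returns the first item, or jumps back to the caller's setjmp site. */
static int peek_front(const struct int_list *list, jmp_buf env) {
    if (list->size == 0)
        longjmp(env, EXC_LIST_EMPTY);  /* step 3, exceptional path */
    return list->items[0];             /* step 3, normal path */
}

/* Returns 0 and stores the first item in *out, or returns the code
 * describing what went wrong. */
int try_peek(const struct int_list *list, int *out) {
    jmp_buf env;
    switch (setjmp(env)) {             /* step 1: yields 0 when called directly */
    case 0:
        *out = peek_front(list, env);  /* step 2 */
        return 0;
    case EXC_LIST_EMPTY:               /* reached through longjmp */
        return EXC_LIST_EMPTY;
    default:
        return -1;                     /* unknown code */
    }
}
```

Calling try_peek on an empty list yields EXC_LIST_EMPTY without peek_front ever executing its return statement. The code passed to longjmp should be nonzero; if 0 is passed, setjmp reports 1 instead, so that the two kinds of return stay distinguishable.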

Robustness and Correctness of a Program
In our test program we provide simple examples of the use of two notions complementing the exception concept: the assertion facility and signals. The former is used to get the compiler to insert run-time checks into code and can therefore be seen as a tool for building correct software. The latter is a notification to a process that an event has occurred. Put differently, a signal is a software interrupt, due to some unexpected condition in the computer, operating system, or a process in the system, that is delivered to a process.

Assertions
Assertion facility and exception handling serve similar purposes; they do not serve the same purpose. Although both facilities increase reliability they do so by addressing different aspects: robustness and correctness. While exception handling enables a more robust system by providing an ability to recover from an unexpected condition, assertions help ensure correctness by checking the validity of claims made by the developer.

Definition: Robustness pertains to a system’s ability to reasonably react to a variety of circumstances and possibly unexpected conditions. Correctness pertains to a system’s adherence to a specification that states requirements about what it (system) is expected to do.

For example, a data file missing on disk has nothing to do with the correctness of the implementation. The programmer is better off providing some code to deal with this exceptional condition. On the other hand, an illegal argument value passed to a subprogram is very likely the result of an earlier mistake in the program code, such as sloppy range-checking.
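The two kinds of problems can sit side by side in one function. In the sketch below, first_char_of is a hypothetical helper: a NULL path is treated as a programming mistake and asserted against, whereas a missing file is treated as an expected mishap and reported through the return value.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the first character of the file,
 * or -1 if the file cannot be opened or is empty. */
int first_char_of(const char *path) {
    assert(path != NULL);              /* correctness: a NULL path is a caller bug */

    FILE *file = fopen(path, "r");
    if (file == NULL)
        return -1;                     /* robustness: the file may simply be missing */

    int c = fgetc(file);
    fclose(file);
    return c == EOF ? -1 : c;
}
```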

Signals
Signals serve to complement exceptions in a different sense: while exceptions are due to the unexpected consequences of activities of the current program [and therefore are a programming language concept and can be said to be internal], signals, an operating system concept, are generally external and can be due to hardware, the operating system, or other processes.

The relation between exceptions and signals can be better understood by an example. Consider integer division. If the divisor happens to be zero, the processor will generate a division-by-zero interrupt, which is further relayed by the operating system to the language runtime as a SIGFPE signal. This signal is finally cast by the language runtime into an exception of a suitable type and passed back to the program containing the culprit code, in the hope that it will be caught and handled in this program.

Signals can be examined in three categories: program errors, external events, and explicit requests. An error means the program has performed something invalid and this has been detected by the operating system or the computer. This includes division by zero, dereferencing a null pointer, dereferencing an uninitialized pointer, and so on. An external event has to do with I/O or communication with other processes. Examples of this category are expiration of a timer, termination of a child process, stopping or suspending a process, yielding the user terminal to a background job for input or output, and so on. An explicit request is a library function call that specifically generates a signal, such as abort, which generates SIGABRT, and kill, which sends the signal named in its argument.

Examples of signals include:


 * SIGINT (Interrupt): Upon receiving the special interrupt key, generally Ctrl-C, a SIGINT signal is sent to the process.
 * SIGKILL: This signal immediately terminates the process. There is no way to block, handle, or ignore it. A typical use of this signal and the previous one is to terminate a program that has entered an infinite loop or a program that is taking too much of the system resources.
 * SIGTERM: Similar to SIGKILL, this signal causes program termination. Unlike SIGKILL, however, it is possible to block, handle, or ignore it.
 * SIGSEGV (Segmentation violation): The program is trying to read or write memory that is not allocated for it. This signal may be generated when an array is used with an out-of-bounds index value or an uninitialized pointer with a properly aligned initial value is dereferenced.
 * SIGILL (Illegal Instruction): An illegal or privileged instruction has been encountered. Such a signal may be raised as a result of executing a binary that is meant for a different processor, which can be exemplified by running an executable containing Pentium-4 instructions on an i386. One other cause is attempting to execute a corrupted executable. Finally, trying to execute a privileged instruction (such as those manipulating the system tables) in a user program will also give rise to this signal.
 * SIGBUS: An attempt to access an invalid address has been made. This is probably due to trying to access misaligned data, which may happen when an uninitialized pointer with a random value in it is dereferenced.
 * SIGFPE: A general arithmetic exception has happened. Note that this is not restricted to floating-point numbers, as its name may suggest. It can be integer division-by-zero, overflow, and so on.
 * SIGALRM: This signal is generated whenever a previously set timer expires.
 * SIGUSR1 and SIGUSR2: Set aside for the programmer, these signals can be tailored to meet any requirement.

When a signal is raised (or generated) it becomes pending, which means it is waiting to be delivered to the process. If it is not blocked by the process, this usually takes a very short time, and upon delivery a certain routine is performed to deal with the signal. This is called “handling the signal” and may take the form of simply ignoring it, accepting the default action offered by the system, or executing a user-specified action. If the signal is blocked, meaning its handling is deferred indefinitely, it stays in a (blocked) pending state. The signal can later be unblocked to be handled by the process.
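The pending and blocked states can be observed with the POSIX calls sigprocmask and sigpending. The sketch below assumes a POSIX system and a single-threaded program; the names delivered, on_usr1 and pending_then_delivered are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L        /* ask for the POSIX signal interfaces */
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t delivered = 0;

static void on_usr1(int signo) {
    (void)signo;
    delivered = 1;                     /* note that the handler has run */
}

/* Returns 1 if SIGUSR1 stayed pending while blocked and was delivered
 * only after being unblocked, 0 otherwise. */
int pending_then_delivered(void) {
    sigset_t block, previous, waiting;

    delivered = 0;
    signal(SIGUSR1, on_usr1);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &previous);   /* block SIGUSR1 */

    raise(SIGUSR1);                    /* generated, but blocked: stays pending */
    sigpending(&waiting);
    int was_pending = sigismember(&waiting, SIGUSR1) == 1 && delivered == 0;

    sigprocmask(SIG_SETMASK, &previous, NULL);   /* unblock: the handler runs now */
    return was_pending && delivered == 1;
}
```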



It should be noted that some signals can neither be blocked nor be ignored. As a matter of fact, some (SIGKILL, for instance) cannot even be handled with a user-specified action.

Test Program
List_Test.c

The next line of code makes a claim about our program: the list, which was previously created from an array of four elements, has a size of four. Anything else would mean a semantic error in the constructor or in the function reporting the size, and must be rectified before the List module hits the market.

If our claim is falsified by the program state, assert will abort the program with a diagnostic message written to the standard error stream, which includes the "stringified" version of the claim together with the file name and line number.



Example: Implement in C a scheme that ensures out-of-bounds index values will not be silently ignored.
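One possible scheme, given here only as a sketch, funnels every access through a pair of functions that assert the validity of the index. CAPACITY, cells, set_cell and get_cell are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 4                     /* hypothetical fixed size */

static int cells[CAPACITY];            /* zero-initialized storage */

/* Every access goes through these two functions, so an out-of-bounds
 * index trips the assertion instead of silently touching a neighbour. */
void set_cell(size_t index, int value) {
    assert(index < CAPACITY);
    cells[index] = value;
}

int get_cell(size_t index) {
    assert(index < CAPACITY);
    return cells[index];
}
```

A call such as get_cell(4) aborts the program with a diagnostic naming the failed claim, provided assertions are enabled.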

This, of course, comes at a price: each time an assertion is encountered, extra time is spent verifying the claim, which implies that code full of asserts will run slower. Given that production-quality code is expected to be not only fault-free but also fast, having asserts in a finalized project simply does not make sense. So, what? Are we supposed to remove all of the assertions (probably tens of them) just to put them back in case maintenance is required? Not really! Defining the NDEBUG macro in our file, before assert.h is included, or at the command line as a compile-time switch will solve the problem.
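The effect of NDEBUG can be made visible with a small probe. The function assertions_enabled is hypothetical; it relies on the fact that the expression handed to assert is not evaluated at all when NDEBUG is defined:

```c
#include <assert.h>

/* Reports whether assert() is active in this translation unit.
 * The assignment inside assert() is executed only when NDEBUG is
 * not defined; with NDEBUG the whole assert expands to nothing. */
int assertions_enabled(void) {
    int evaluated = 0;
    assert((evaluated = 1) != 0);
    return evaluated;
}
```

Compiled normally the function returns 1; compiled with -DNDEBUG it returns 0. The same property is the reason an assertion should never contain an expression whose side effects the program depends on.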

Yet another mechanism for handling unexpected conditions: signals. Originally a UNIX concept, signals are used to inform processes (that is, running programs) about the occurrence of [usually] asynchronous events, such as an invalid memory access or an attempt to divide by zero.

Apart from a signal being triggered by the error-detection mechanism of the computer or by an action external to the program, one may also "raise" a signal explicitly by using the raise function. All these cause the relevant signal to be sent to the process. Once a signal is raised, or triggered in some other way, it needs to be handled. This is done by calling the handler of the signal.

That’s all fine, but how do we change the reaction of our program to a particular signal? After all, upon occurrence of the same signal, we may sometimes be able to modify the environment and let the program continue, while at other times we must simply abort the program. We can achieve this by changing the handler function, which we do with the signal function. Associating a new handler with the signal, this function returns a pointer to the previous handler. From this point on, unless another call to signal changes the handler again, all relevant signals will end up being processed in this new function.

In establishing a handler one can pass two special values as the second argument to signal: SIG_IGN and SIG_DFL. The former means the signal passed in the first argument is to be ignored, whereas the latter means the signal will receive its default handling.
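A minimal sketch of installing, using, and replacing a handler with the standard signal and raise functions. The names seen, on_interrupt and demo_handlers are hypothetical, and SIGINT is used only because it is available in every hosted C implementation:

```c
#include <signal.h>

static volatile sig_atomic_t seen = 0;

static void on_interrupt(int signo) {
    (void)signo;
    seen = seen + 1;                   /* count the deliveries */
}

/* Returns how many times the user-written handler ran. */
int demo_handlers(void) {
    void (*previous)(int);

    seen = 0;
    previous = signal(SIGINT, on_interrupt);   /* install; keep the old handler */
    raise(SIGINT);                             /* handled by on_interrupt */

    signal(SIGINT, SIG_IGN);                   /* from now on, ignore SIGINT */
    raise(SIGINT);                             /* has no effect */

    signal(SIGINT, previous);                  /* put the original handler back */
    return seen;
}
```

Passing SIG_DFL instead of SIG_IGN before the second raise would let SIGINT take its default action, which normally terminates the program.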


 * gcc -c List.c -ID:/include↵   # In Cygwin

For creating the executable during the test phase we should use the following command. This will effectively enable the assertion facility.


 * gcc -o Test.exe List_Test.c List.o -lWrappers -ID:/include -LD:/library↵

Having got rid of the semantic errors in the program (or at least hoping so), we can deactivate the assertion facility. This is done by defining the NDEBUG macro.


 * gcc -DNDEBUG -o Test.exe List_Test.c List.o -lWrappers -ID:/include -LD:/library↵

Interface
Stack.h

Implementation
Stack.c

Since by the time control reaches this point we will have made sure the stack is not empty (by checking the underlying list, as in the previous if statement), there is no point in passing a buffer, which is used to convey information about any problem. We therefore pass NULL as the second argument.