C Programming/Preprocessor directives and macros

The preprocessor performs text processing on your C program before it is actually compiled. Every C program is passed through the preprocessor before compilation proper begins. The preprocessor scans the program for specific instructions that it understands, called preprocessor directives. All preprocessor directives begin with the # (hash) symbol. C++ compilers use the same C preprocessor.

The preprocessor is a part of the compiler which performs preliminary operations (conditionally compiling code, including files etc...) to your code before the compiler sees it. These transformations are lexical, meaning that the output of the preprocessor is still text.

Directives
Directives are special instructions directed to the preprocessor (preprocessor directives) or to the compiler (compiler directives). They tell it how to process part or all of your source code, or set some flags on the final object. They are used to make source code easier to write (more portable, for instance) and easier to understand. Directives are handled by the preprocessor, which is either a separate program invoked by the compiler or part of the compiler itself.

#include
C has some features as part of the language and some others as part of a standard library, which is a repository of code that is available alongside every standard-conformant C compiler. When the C compiler compiles your program it usually also links it with the standard C library. For example, on encountering a #include <stdio.h> directive, the preprocessor replaces the directive with the contents of the stdio.h header file.

When you use features from the library, C requires you to declare what you would be using. The first line in the program is a preprocessing directive which should look like this:


 #include <stdio.h>

The above line causes the C declarations which are in the stdio.h header to be included for use in your program. Usually this is implemented by just inserting into your program the contents of a header file called stdio.h, located in a system-dependent location. The location of such files may be described in your compiler's documentation. A list of standard C header files is listed below in the Headers table.

The stdio.h header contains various declarations for input/output (I/O) using an abstraction of I/O mechanisms called streams. For example there is an output stream object called stdout which is used to output text to the standard output, which usually displays the text on the computer screen.

If using angle brackets like the example above, the preprocessor is instructed to search for the include file along the development environment path for the standard includes.


 #include "other.h"

If you use quotation marks (" "), the preprocessor is expected to search in some additional, usually user-defined, locations for the header file, and to fall back to the standard include paths only if it is not found in those additional locations. It is common for this form to include searching in the same directory as the file containing the #include directive.

Headers
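The C90 standard headers are:

 assert.h, ctype.h, errno.h, float.h, limits.h, locale.h, math.h, setjmp.h, signal.h, stdarg.h, stddef.h, stdio.h, stdlib.h, string.h, time.h

Headers added since C90:

 iso646.h, wchar.h, wctype.h (1995 amendment)
 complex.h, fenv.h, inttypes.h, stdbool.h, stdint.h, tgmath.h (C99)
 stdalign.h, stdatomic.h, stdnoreturn.h, threads.h, uchar.h (C11)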

#pragma
The pragma (pragmatic information) directive is part of the standard, but the meaning of any pragma depends on the software implementation of the standard that is used. The #pragma directive provides a way to request special behavior from the compiler. This directive is most useful for programs that are unusually large or that need to take advantage of the capabilities of a particular compiler.

Pragmas are used within the source program.


 #pragma token(s)


#pragma is usually followed by a single token, which represents a command for the compiler to obey. The set of commands that can appear in #pragma directives is different for each compiler; consult the documentation for your compiler to see which commands it allows and what those commands do.

For instance, one of the most widely implemented pragmas, #pragma once, when placed at the beginning of a header file, indicates that the file where it resides will be skipped if it is included several times by the preprocessor.

#define
Each #define preprocessor instruction defines a macro. For example:

 #define PI 3.14159265358979323846 /* pi */

A macro defined with a space immediately after the name is called a constant or literal. A macro defined with a parenthesis immediately after the name is called a function-like macro.

The #define directive is used to define macros. Macros are used by the preprocessor to manipulate the program source code before it is compiled. Because macro definitions are substituted before the compiler acts on the source code, any errors introduced by #define are difficult to trace.

By convention, macros defined using #define are named in uppercase. Although this is not a requirement, it is considered very bad practice to do otherwise, because the uppercase names allow macros to be easily identified when reading the source code. (We mention many other common conventions for using #define in a later chapter, C Programming/Common practices.)

Today, #define is primarily used to handle compiler and platform differences. For example, a define might hold a constant which is the appropriate error code for a system call. The use of #define should therefore be limited to cases where it is really necessary; typedef statements and constant variables can often perform the same functions more safely.

Another feature of the #define command is that it can take arguments, making it rather useful as a pseudo-function creator. Consider the following code:

 #define ABSOLUTE_VALUE( x ) ( ((x) < 0) ? -(x) : (x) )
 ...
 int x = -1;
 while( ABSOLUTE_VALUE( x ) ) {
 ...
 }

It's generally a good idea to use extra parentheses when using complex macros. Notice that in the above example, the variable "x" is always within its own set of parentheses. This way, it will be evaluated in whole, before being compared to 0 or multiplied by -1. Also, the entire macro is surrounded by parentheses, to prevent it from being contaminated by other code. If you're not careful, you run the risk of having the compiler misinterpret your code.

Because of side effects, it is considered a very bad idea to use function-like macros as described above.

 int x = -10;
 int y = ABSOLUTE_VALUE( x++ );

If ABSOLUTE_VALUE were a real function, 'x' would now have the value -9. Because x++ was an argument to a macro, it was evaluated twice (once in the comparison and once in the negation), so 'x' has the value -8.

(#, ##)

The # and ## operators are used with the #define directive. Using # causes the first argument after the # to be returned as a string in quotes. For example, the command

 #define as_string( s ) # s

will make the compiler turn this command

 puts( as_string( Hello World! ) ) ;

into

 puts( "Hello World!" );

Using ## concatenates what's before the ## with what's after it. For example, the command

 #define concatenate( x, y ) x ## y
 ...
 int xy = 10;
 ...

will make the compiler turn

 printf( "%d", concatenate( x, y ));

into

 printf( "%d", xy);

which will, of course, display 10 to standard output.

It is possible to concatenate a macro argument with a constant prefix or suffix to obtain a valid identifier, as in

 #define make_function( name ) int my_ ## name (int foo) {}
 make_function( bar )

which will define a function called my_bar. But it isn't possible to integrate a macro argument into a constant string using the concatenation operator. To obtain such an effect, one can use the ANSI C property that two or more consecutive string constants are treated as a single string constant. Using this property, one can write

 #define eat( what ) puts( "I'm eating " #what " today." )
 eat( fruit )

which the preprocessor will turn into

 puts( "I'm eating " "fruit" " today." )

which in turn will be interpreted by the C parser as a single string constant.

The following trick can be used to turn numeric constants into string literals:

 #define num2str(x) str(x)
 #define str(x) #x
 #define CONST 23

 puts(num2str(CONST));

This is a bit tricky, since it is expanded in two steps. First num2str(CONST) is replaced with str(23), which in turn is replaced with "23". This can be useful in the following example:

 #ifdef DEBUG
 #define debug(msg) fputs(__FILE__ ":" num2str(__LINE__) " - " msg, stderr)
 #else
 #define debug(msg)
 #endif

This will give you a nice debug message including the file and the line where the message was issued. If DEBUG is not defined, however, the debugging message will completely vanish from your code. Be careful not to use this sort of construct with anything that has side effects, since this can lead to bugs that appear and disappear depending on the compilation parameters.

macros
Macros aren't type-checked, and they do not evaluate their arguments. They also do not obey scope properly: the preprocessor simply takes the text passed to the macro and replaces each occurrence of the macro parameter in the macro body with that text (the code is literally copied into the location it was called from).

An example of how to use a macro:

 #include <stdio.h>

 #define SLICES 8
 #define ADD(x) ( (x) / SLICES )

 int main(void)
 {
   int a = 0, b = 10, c = 6;

   a = ADD(b + c);
   printf("%d\n", a);
   return 0;
 }

The result of "a" should be "2" (b + c = 16 -> passed to ADD -> 16 / SLICES -> result is "2").

One of the few situations where inline functions won't work -- so you are pretty much forced to use function-like macros instead -- is to initialize compile time constants (static initialization of structs). This happens when the arguments to the macro are literals that the compiler can optimize to another literal.

#error
The #error directive halts compilation. When one is encountered the standard specifies that the compiler should emit a diagnostic containing the remaining tokens in the directive. This is mostly used for debugging purposes.

Programmers use "#error" inside a conditional block, to immediately halt the compiler when the "#if" or "#ifdef" -- at the beginning of the block -- detects a compile-time problem. Normally the compiler skips the block (and the "#error" directive inside it) and the compilation proceeds.

#warning
Many compilers support a #warning directive. When one is encountered, the compiler emits a diagnostic containing the remaining tokens in the directive.

#undef
The #undef directive undefines a macro. The identifier need not have been previously defined.

#if,#else,#elif,#endif (conditionals)
The #if command checks whether a controlling conditional expression evaluates to zero or nonzero, and excludes or includes a block of code respectively.

The conditional expression could contain any C operator except for the assignment operators, the increment and decrement operators, the address-of operator, and the sizeof operator.

One unique operator used in preprocessing and nowhere else is the defined operator. It returns 1 if the macro name, optionally enclosed in parentheses, is currently defined; 0 if not.

The #endif command ends a block started by #if, #ifdef, or #ifndef.

The #elif command is similar to #if, except that it is used to extract one from a series of blocks of code. E.g.:

 #if /* some expression */
   :
   :
 #elif /* another expression */
   :
   :
 #else
 /* The optional #else block is selected if none of the previous #if or
    #elif blocks are selected */
   :
   :
 #endif /* The end of the #if block */

#ifdef,#ifndef
The #ifdef command is similar to #if, except that the code block following it is selected if a macro name is defined. In this respect,

 #ifdef NAME

is equivalent to

 #if defined NAME

The #ifndef command is similar to #ifdef, except that the test is reversed:

 #ifndef NAME

is equivalent to

 #if !defined NAME

#line
This preprocessor directive is used to set the file name and the line number of the line following the directive to new values. This is used to set the __FILE__ and __LINE__ macros.

Useful Preprocessor Macros for Debugging
ANSI C defines some useful preprocessor macros and variables, also called "magic constants". They include:

 __FILE__ => The name of the current file, as a string literal
 __LINE__ => Current line of the source file, as a numeric literal
 __DATE__ => Date of compilation, as a string
 __TIME__ => Time of compilation, as a string
 __TIMESTAMP__ => Date and time (non-standard)
 __cplusplus => undefined when your C code is being compiled by a C compiler; 199711L when your C code is being compiled by a C++ compiler compliant with the 1998 C++ standard
 __func__ => Current function name of the source file, as a string (part of C99)
 __PRETTY_FUNCTION__ => "decorated" current function name of the source file, as a string (in GCC; non-standard)

Compile-time assertions
Compile-time assertions can help you debug faster than using only run-time assert statements, because the compile-time assertions are all tested at compile time, while it is possible that a test run of a program may fail to exercise some run-time assert statements.

Prior to the C11 standard, some people defined a preprocessor macro to allow compile-time assertions.

The Boost library defines a similar macro, BOOST_STATIC_ASSERT.

Since C11, such macros are obsolete, as _Static_assert and its macro equivalent static_assert are standardized and built into the language.

X-Macros
One little-known usage pattern of the C preprocessor is known as "X-Macros". An X-Macro is a header file or macro. Commonly these use the extension ".def" instead of the traditional ".h". This file contains a list of similar macro calls, which can be referred to as "component macros". The include file is then referenced repeatedly in the following pattern. Here, the include file is "xmacro.def" and it contains a list of component macros of the style "foo(x, y, z)".

The most common usage of X-Macros is to establish a list of C objects and then automatically generate code for each of them. Some implementations also perform any #undefs they need inside the X-Macro, as opposed to expecting the caller to undefine them.

Common sets of objects are a set of global configuration settings, a set of members of a struct, a list of possible XML tags for converting an XML file to a quickly-traversable tree, or the body of an enum declaration; other lists are possible.

Once the X-Macro has been processed to create the list of objects, the component macros can be redefined to generate, for instance, accessor and/or mutator functions. Structure serializing and deserializing are also commonly done.

Here is an example of an X-Macro that establishes a struct and automatically creates serialize/deserialize functions. For simplicity, this example doesn't account for endianness or buffer overflows.

File star.def:

File star_table.c:

Handlers for individual data types may be created and accessed using token concatenation ("##") and quoting ("#") operators. For example, the following might be added to the above code:

Note that you can also avoid creating separate handler functions for each datatype in this example by defining the print format for each supported type, with the additional benefit of reducing the expansion code produced by this header file:

The creation of a separate header file can be avoided by creating a single macro containing what would be the contents of the file. For instance, the above file "star.def" could be replaced with this macro at the beginning of:

File star_table.c:

and then every #include "star.def" could be replaced with a simple invocation of that macro. The rest of the above file would become:

and the print handler could be added as well as:

or as:

A variant which avoids needing to know the members of any expanded sub-macros is to accept the operators as an argument to the list macro:

File star_table.c:

This approach can be dangerous in that the entire macro set is always interpreted as if it was on a single source line, which could encounter compiler limits with complex component macros and/or long member lists.

This technique was reported by Lars Wirzenius in a web page dated January 17, 2000, in which he gives credit to Kenneth Oksanen for "refining and developing" the technique prior to 1997. The other references describe it as a method from at least a decade before the turn of the century.

We discuss X-Macros more in a later section, Serialization and X-Macros.
