User:Rex Jaeschke/sandbox

The Linux trademark is owned by Linus Torvalds.

UNIX® is a registered trademark of The Open Group.

= Prefaces to Editions =

== Edition 1, 1989 (with light editing) ==

[Howard W. Sams, Hayden Books, ISBN 0-672-48428-5, 1989.]

Early in 1986, I was invited to teach a 3-day seminar on portability as it pertained to the C language. The seminar was to be offered in several major cities around the United States. As it happened, the series was cancelled, but by then I had put together a 70-page manuscript intended for use as handouts.

Ever since I came to the C fold, I have been fascinated by the apparent contradiction of C being both a low-level systems-implementation language yet, somehow, also being a portable one. And every time I heard someone speak or write enthusiastically about C's “inherent” portability, I became more uneasy with the observation that either I or a significant part of the C community was missing some major part of the “C picture.” As it happens, I don't think it's me although it does seem that a surprising amount of well-written C code can be ported relatively easily.

Given that I had a base portability document and an acute interest in the C phenomenon generally, and in the C Standard and portability in particular, I embarked on a formal and detailed look at C and portability. The fact that I also make a substantial living from consulting in C and teaching introductory and advanced seminars about it added more weight to my decision to develop a serious manuscript for a 3-day portability seminar. Along the way, I decided the end result was worthy of becoming a book.

At first, I expected to produce about 200 book pages. Then it became 300 and 400, and I finally settled on 425, but only after I decided to cut quite a few appendices, purely for space reasons. As the amount and utility of the material left “on the editing room floor” is substantial, I am looking at ways to distribute that as well, perhaps through a future revision, or a companion volume. In any case, this book does not contain all my findings.

This book attempts to document C-specific issues you may encounter when porting existing code, or when writing code that is to be ported to numerous target environments. I use the term attempt because I don't believe this book provides all the answers, and in many cases, it does not even pretend to do so. For example, if you are porting from one flavor of UNIX to another, this book does not discuss any of the dark corners of that operating system. Nonetheless, I do believe it to be a credible beginning on which future works can be based. It is, as far as I can tell, the first widely published work of more than 20–30 pages that specifically addresses portability as it pertains to C. Because I do not claim to be well versed in more than 3–4 operating system and hardware environments, no doubt I have overlooked some relevant issues. Alternatively, I may have overindulged in various esoteric aspects that may occur only in theory.

Whatever your interest in portability is, I hope this book provides some food for thought, even if only to help convince you that portability is not for you. If that is all this book achieves, it will have been wildly successful. If, on the other hand, it helps you define a porting strategy, or saves you going down a few wrong roads, then, too, I am happy. Whatever your opinion of this text, let me know since only by getting constructive criticism, outside input, and more personal experience can I improve it in future revisions or in a companion volume.

Anyone who has ever written a lengthy document that is to be read by more than a few people knows that after the first two or three reads, you no longer actually read what is written. You simply read what should be there. Therefore, you need technically competent reviewers who can provide constructive criticism. In this regard, the following people made significant contributions by proofing all or major parts of the manuscript: Steve Bartels, Don Bixler, Don Courtney, Dennis Deloria, John Hausman, Bryan Higgs, Gary Jeter, Tom MacDonald, and Sue Meloy. While I implemented many of their suggestions, space and time constraints prohibited me from capitalizing fully on their organizational and other suggestions. But as software vendors say, “We have to leave something to put in the next release.”

Others who have had more than a passing influence on my relatively short, but intense, C career are: P.J. Plauger, Standard C Secretary, ISO C Convener, and President of Whitesmiths Ltd, an international vendor of C and Pascal development tools; Tom Plum, Standard C Vice-Chair, Chairman of Plum Hall, and leading C author; Larry Rasler, formerly the Editor of the Draft Standard C Document, and AT&amp;T's principal member on the Standard C Committee (now of Hewlett-Packard); and Jim Brodie, an independent consultant (formerly of Motorola) who convened the C Standards Committee in mid-1983, and has so ably chaired it to its (hopefully) successful completion in late 1988 or thereabouts. Also, to my colleagues on the Standard C X3J11 Standards Committee, I say thanks for the opportunity to work with you all—without your papers, presentations, and sometimes volatile (pun intended) discussions both in and out of committee, the quality and quantity of material in this book would have been significantly reduced, perhaps to the point of not being sufficient enough for publication.

Rex Jaeschke

== Edition 2, 2022 ==

Fast-forward 32 years, and a lot has happened in the world of C. In particular:

 * C95, C99, C11, and C17 have been produced.
 * C++ has been standardized, and that standard has been revised several times.
 * 16-bit systems are rare, and even 32-bit systems are less common. The mainstream world has moved to 64 bits.
 * C compilers that only support C prior to C89 are unlikely to be common, although code that initially went through them might still be in use.



This revision was the result of my estate planning during which I asked myself the question, “If I take no action, what will happen to my intellectual property when I die?” Presumably, it would be lost! As such, I looked around for a public venue in which to place it, where it could be read, and (hopefully responsibly) kept current.

Once I decided a revision was in order, I got quite ruthless. (I’m a great believer in Strunk and White’s advice, “Less is more!”) I removed all the material that was not related directly to portability. As a result, a great deal of the library-chapter content was cut. Back in 1988, the first C Standard was just about to debut, and there was little definitive text available about the library. As such, I included that in the first edition. However, that is no longer necessary. Also, one can purchase searchable electronic copies of the C (and C++) and related standards.

I made two important decisions regarding potential port targets:


 * To acknowledge that it is okay to want to port code, even if it is not, and never will be, Standard C-compliant!
 * To mention C++: C++ is widely used, and many programmers call C functions from C++, or put C code through a C++ compiler.

Of course, this edition will become outdated; as I write this, the C standards committee is finalizing C23!

The first edition contained an annex that consisted primarily of lists of reserved identifiers in various orders. I chose to not include this annex for several reasons: A very large number of names has been added by the various standard revisions since C89, so it would have required a lot of effort to update the lists, and with C23 on the horizon, more work would be needed to revise that list yet again. In any event, the reviewers couldn’t agree on what form those lists should have to be easy to read while remaining useful.

Finally, thanks much to the reviewers of this edition: Rajan Bhakta, Jim Brodie, Doug Gwyn, David Keaton, Tom MacDonald, Robert Seacord, Fred Tydeman, and Willem Wakker.

Rex Jaeschke

= Future Revision of This Document =

There will be reasons to want to update this document, for example, to do the following:

 * Fix typographic or factual errors
 * Expand on a topic
 * Add details of specific porting scenarios and target hardware and operating systems
 * Add incompatibilities between Standard C and Standard C++
 * Cover future editions of the C and C++ standards
 * Expand on issues relating to the headers added by C99 and later editions, especially those relating to floating-point
 * Add issues regarding the optional IEC 60559 (IEEE 754) floating-point and complex support
 * Add issues regarding the optional extended library
 * Add instances of unspecified, undefined, implementation-defined, and locale-specific behaviors not already mentioned
 * Flesh out the “Intended Audience” section
 * Consider making available downloadable lists of reserved identifiers, possibly organized by header and Standard edition

Regarding specific library functions, entries exist only for those having commentary relating to portability. If such commentary is to be added for a function that is not listed, an entry for it will have to be created first.

If you are adding to this book, please be precise and use the correct terminology, as defined by the C Standard. Only say things once, in the appropriate place, and then point to the definitive statement from other places, as necessary.

= Intended Audience =

Reviewer Willem Wakker wrote: I have flipped through the document and I think it is (or might be) a very useful document, although I am not too sure of the intended audience. Your introduction does not mention an intended audience, and an experienced C programmer probably thinks that she/he does not need this information ('I already know the nitty/gritty details because I am an experienced programmer').

Portability, like security, needs to be taken into account right from the start of a project, and at that early stage there is a need for more overall considerations regarding portability than all the (though useful!) details and pitfalls described in your book. This probably means that the book needs to be on the radar of the more managerial type of people in a project who then can 'force' the programmers to take the good advice into account. And for those managers the book looks far too much like a technical guide, not something that they have to be concerned with. So, maybe, some introductory paragraphs about the concept of, and the need for portability written for the non-technical manager right at the start of the book might be a useful addition.

My response: For the time being, I’m adding this section as a placeholder. However, rather than try to write the content myself, I’ve decided to leave it to readers to flesh it out as they see fit once it gets published.

= Reader Assumptions and Advice =

This book does not attempt to teach introductory, or even advanced, C constructs. Nor is it a tutorial on Standard C. At times, some paragraphs might seem as terse as C itself. While I have attempted to soften such passages, I make no apologies for those that remain. Portability is not something a first time or trainee C programmer embarks on—quite the opposite.

The text is aimed specifically at the language-related aspects of porting C source code. However, it does not provide a recipe for successfully porting a system in any given set of target environments—it merely details many of the problems and situations you may encounter or may need to investigate. This book presumes that you are familiar with the basic constructs of the C language, such as all the operators, statements, and preprocessor directives, and that you are fluent with data and function pointers, and interfacing with the standard run-time library.

Because the C Standard, its accompanying Rationale document, and this text have the same basic organization, having a copy of each is advantageous, although not completely necessary. The Standard can sometimes be challenging to read; the Rationale, however, is much more leisurely paced and readable by the non-linguist. Note though that, having participated in the deliberations of the Standard Committee for 15 years (1984–1999), my vocabulary reflects that of the C Standard. Therefore, a copy of that document will prove especially useful.

Throughout the book, uses of “K&amp;R” refer to the first edition (1978) of Kernighan and Ritchie's book, The C Programming Language.

References to Standard C include all editions, and are used for core facilities that have been present since the first standard, C89. For a facility added in a specific edition, that edition number is used. C90 is not so used, as that is just an ISO repackaging of the ANSI Standard C89.

The history of the standardization of C is as follows:


 * C89 – The first C standard, ANSI X3.159-1989, was produced in 1989 by the U.S. committee X3J11.
 * C90 – The first ISO C standard, ISO/IEC 9899:1990, was produced in 1990 by committee ISO/IEC JTC 1/SC 22/WG 14 in conjunction with the US committee X3J11. C90 was technically equivalent to C89.
 * C95 – An amendment to C90 was produced in 1995 by committee WG 14 in conjunction with the U.S. committee X3J11. The term C95 means “C90 plus that amendment.”
 * C99 – The second edition of the ISO C standard, ISO/IEC 9899:1999, was produced by committee WG14 in conjunction with the U.S. committee INCITS/J11 (formerly X3J11).
 * C11 – The third edition of the ISO C standard, ISO/IEC 9899:2011, was produced by committee WG14 in conjunction with the U.S. committee INCITS/PL22.11 (formerly INCITS/J11).
 * C17 – The fourth edition of the ISO C standard, completed in 2017 and published the following year as ISO/IEC 9899:2018, was produced by committee WG14 in conjunction with the U.S. committee INCITS/PL22.11. This was a maintenance release that included corrections to the standard based on Defect Reports. No new functionality was added.
 * C23 – Planned release of the fifth edition of the ISO C standard.

Some paragraphs are tagged “C++ Consideration.” C++ is widely used, and many programmers call C functions from C++, or put C code through a C++ compiler. However, C++ is not a superset of C, so it is worth understanding the incompatibilities. A common saying in the C++ standards community is “As close as possible to Standard C, but no closer!”

Numerous references to acronyms, abbreviations, and specialized terms are made throughout the book. Most are in common use in the C community today; however, a few that relate directly to portability are shown here (with their definitions taken verbatim from C17):

<ul> <li>

Unspecified behavior – Behavior that results from the use of an unspecified value, or other behavior upon which this document provides two or more possibilities and imposes no further requirements on which is chosen in any instance.

</li> <li>

Undefined behavior – Behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements.

</li> <li>

Implementation-defined behavior – Unspecified behavior where each implementation documents how the choice is made.

</li> <li>

Locale-specific behavior – Behavior that depends on local conventions of nationality, culture, and language that each implementation documents.

</li></ul>

The C Standard contains a more complete list of definitions and, in particular, discusses the criteria for conformance of programs and implementations.

Although this book contains many instances of the four behaviors defined above, it does not contain all of them. The complete list is contained in the “Portability issues” annex of the C Standard.
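As a minimal sketch of how three of these behaviors can surface in ordinary code, consider the following. The function names are hypothetical and chosen only for illustration. Undefined behavior (for example, signed integer overflow) is deliberately not exercised, because by definition nothing useful can be said about the result.

```c
#include <locale.h>
#include <stdio.h>

/* Implementation-defined when value is negative: each implementation
   documents what >> does to a negative signed operand. For
   non-negative values the result is fully defined. */
int shift_right_one(int value)
{
    return value >> 1;
}

/* Locale-specific: the decimal-point character comes from the current
   locale. A program starts in the "C" locale, where it is ".". */
const char *decimal_point_in_use(void)
{
    return localeconv()->decimal_point;
}

static int trace(const char *label, int value)
{
    printf("evaluating %s\n", label);
    return value;
}

static int add(int a, int b)
{
    return a + b;
}

/* Unspecified: the two calls to trace() may run in either order, so
   the order of the printed lines may differ between implementations.
   The sum does not depend on that order. */
int sum_with_trace(void)
{
    return add(trace("left", 1), trace("right", 2));
}
```

A program that only needs the sum is portable here. One that depends on which line is printed first is not.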

While a conforming implementation is required to document implementation-defined behavior, the term “implementation-dependent” is used in this book to refer to some characteristic of an implementation that is not required by Standard C to be documented.

C89 stated, “Certain features are obsolescent, which means that they may be considered for withdrawal in future revisions of the Standard. They are retained in the Standard because of their widespread use, but their use in new implementations (for implementation features) or new programs (for language or library features) is discouraged.” Some editions of Standard C declare certain features obsolescent by deprecating them. According to Wiktionary, to deprecate means “To declare something obsolescent; to recommend against a function, technique, command, etc. that still works but has been replaced.”

From very early in its life, the committee that standardizes C has had a charter, which it has followed (and revised over time). Several items from the original charter are worth mentioning here:

'''Item 2. C code can be portable.''' Although the C language was originally born with the UNIX operating system on the DEC PDP-11, it has since been implemented on a wide variety of computers and operating systems. It has also seen considerable use in cross-compilation of code for embedded systems to be executed in a free-standing environment. The Committee has attempted to specify the language and the library to be as widely implementable as possible, while recognizing that a system must meet certain minimum criteria to be considered a viable host or target for the language.

'''Item 3. C code can be non-portable.''' Although it strove to give programmers the opportunity to write truly portable programs, the Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler;” the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program.

= Introduction =

== Defining Portability ==
According to The Prentice-Hall Standard Glossary of Computer Terminology by Robert A. Edmunds, portability is defined as follows: “Portability: A term related to compatibility. Portability determines the degree to which a program or other software can be moved from one computer system to another.” The key phrase here is “the degree to which a program can be moved.”

From Wikipedia, “In software engineering, porting is the process of adapting software for the purpose of achieving some form of execution in a computing environment that is different from the one that a given program (meant for such execution) was originally designed for (e.g., different CPU, operating system, or third party library). The term is also used when software/hardware is changed to make them usable in different environments. Software is portable when the cost of porting it to a new platform is significantly less than the cost of writing it from scratch. The lower the cost of porting software relative to its implementation cost, the more portable it is said to be.”

We can talk about portability from two points of view: generic and specific. Generically, portability simply means running a program in one or more environments that are somehow different from the one(s) for which it was designed. Because the cost of producing and maintaining software far outweighs that for producing hardware, we have a huge incentive to increase the shelf life of our software beyond the current incarnations of hardware. It simply makes economic sense to do so.

Specific portability involves identifying the individual target environments in which a given program must execute, and clearly stating how those environments differ. Examples of porting scenarios include:

<ul> <li>

Moving from one operating system to another on the same machine.

</li> <li>

Moving from a version of an operating system on one machine to the same operating system version on another machine with a different architecture.

</li> <li>

Moving between variants of the same operating system (such as various flavors of UNIX and Linux) on different machines.

</li> <li>

Moving between two entirely different hardware and operating system environments.

</li> <li>

Moving between systems using different floating-point hardware or emulation software.

</li> <li>

Moving between different compilers on the same system.

</li> <li>

Moving from a Standard C-compliant implementation to one that is not, and vice versa.

</li> <li>

Recompiling code on the same compiler, but using different compiler options.

</li> <li>

Moving from one version of a compiler to another version of the same compiler on the same system.

</li></ul>

The last two scenarios might not be obvious. However, it is possible to encounter problems when taking existing code that compiles without error, runs, and does the job, and running it through a new version of the same compiler or simply with different compile-time options. One reason for potentially unexpected behavior is when implementation-defined behavior changes (such as the signedness of a plain char). Another might be the previous reliance on undefined behavior that just happened to do what the programmer expected (such as the order of evaluation of some expressions).
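The following sketch shows one way to make code insensitive to both hazards. The function names are hypothetical. The signedness of plain char can change with nothing more than a compiler option (GCC, for example, offers -fsigned-char and -funsigned-char), and the order in which two operands are evaluated is not something a program can rely on.

```c
#include <limits.h>

/* Reports the implementation's choice: nonzero if plain char is signed. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Yields a value in the range 0..UCHAR_MAX regardless of whether
   plain char is signed, so table lookups and comparisons behave the
   same way on every target. */
int byte_value(char c)
{
    return (unsigned char)c;
}

static int counter;

static int next_number(void)
{
    return ++counter;
}

/* Writing next_number() - next_number() would leave the order of the
   two calls up to the compiler. Separate statements fix the order. */
int difference_in_fixed_order(void)
{
    int first = next_number();
    int second = next_number();
    return second - first;
}
```

With separate statements, difference_in_fixed_order always returns 1. The single-expression form could yield 1 or -1, depending on which call the compiler happens to evaluate first.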

Note that it is okay to port code between systems that are not Standard C-compliant! For example, early Digital Signal Processing (DSP) chips supported only 32-bit floating-point data and operations, in which case the types float, double, and long double (if the latter two are even supported by the compiler) are mapped to 32 bits. In such cases, meaningful applications can still be ported among members of a DSP-chip family.

Porting is not simply getting a piece of software to work on multiple targets. It also involves doing so with a reasonable (and affordable) amount of resources, in a timely manner, and in such a way that the resulting code will perform adequately. There is little point in porting a system to a target such that when the port is complete, it runs so slowly or uses so many system resources that it is rendered unusable.

Important questions to ask yourself are:

<ul> <li>

Am I porting to or from a Standard C implementation? If so, which standard versions are supported?

</li> <li>

Am I porting code that was designed and written with portability in mind?

</li> <li>

Do I know what all the environments will be up front and how many of them I will actually have available for testing?

</li> <li>

What are my performance requirements regarding speed, memory, and disk efficiency?

</li></ul>
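The first question can be partly answered by the compiler itself. The sketch below uses the predefined macros __STDC__ and __STDC_VERSION__ with the values specified by the respective editions. The function name c_edition is hypothetical, and the sketch assumes the compiler sets these macros honestly. Compilers running in a draft or pre-release mode may report intermediate values, which this sketch groups with the preceding edition.

```c
/* Returns a label for the edition of Standard C that the compiler
   claims to support in its current mode. */
const char *c_edition(void)
{
#if !defined(__STDC__)
    return "not claiming Standard C conformance";
#elif !defined(__STDC_VERSION__)
    return "C89/C90";            /* the macro was introduced by C95 */
#elif __STDC_VERSION__ < 199901L
    return "C95";                /* 199409L */
#elif __STDC_VERSION__ < 201112L
    return "C99";                /* 199901L */
#elif __STDC_VERSION__ < 201710L
    return "C11";                /* 201112L */
#elif __STDC_VERSION__ < 202311L
    return "C17";                /* 201710L */
#else
    return "C23 or later";       /* 202311L */
#endif
}
```

Because the result can change with compiler options, it is worth recording it as part of the build log on every target.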

There is another important porting scenario: compiling C code with a C++ compiler. Even if such ported code does not take advantage of C++’s features, certain extra checking will be done. For example, C++ requires C’s prototype style of function declaration and definition. And, over time, new code that does use C++’s features can be added, or the C code could be called by existing C++ functions. Note that there is not just one C++ standard; so far, we’ve had C++98, C++03, C++11, C++14, C++17, and C++20.
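As an illustration, here is a small piece of C written so that it is also acceptable to a C++ compiler. The function copy_string is hypothetical. The sketch relies on two well-known differences: C++ does not convert void * to other object pointer types implicitly, and C++ needs to be told when a function has C linkage.

```c
#include <stdlib.h>
#include <string.h>

/* Declarations as they might appear in a header shared by C and C++
   callers. __cplusplus is defined only by C++ compilers. */
#ifdef __cplusplus
extern "C" {
#endif

char *copy_string(const char *source);

#ifdef __cplusplus
}
#endif

/* Returns a newly allocated copy of source, or a null pointer if the
   allocation fails. The caller releases the copy with free(). */
char *copy_string(const char *source)
{
    size_t size = strlen(source) + 1;

    /* The cast is unnecessary in C but required by C++. */
    char *copy = (char *)malloc(size);

    if (copy != NULL)
        memcpy(copy, source, size);
    return copy;
}
```

The same care applies to identifiers: names such as new or class are ordinary identifiers in C but keywords in C++, so they are best avoided in code that might one day go through a C++ compiler.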

== Portability is Not New ==
Along with the wide availability of good and cheap C compilers and development tools in the early 1980s, the idea of software portability became popular. So much so, that, to hear some people talk, portability became possible because of C.

In fact, the notion of portability is much older than C, and software was being successfully ported long before C became an idea in Dennis Ritchie's head. In 1959, a small group defined a standard business language called COBOL, and in 1960, two vendors (Remington Rand and RCA) implemented compilers for that language. In December of that year, they conducted an experiment where they exchanged COBOL programs, and according to Jean E. Sammet, a member of the COBOL design team, “… with only a minimum number of modifications primarily due to differences in implementation, the programs were run on both machines.” Regarding COBOL's development of a description for data that is logically machine independent, Sammet wrote in 1969, “[COBOL] does not simultaneously preserve efficiency and compatibility across machines.”

Fortran was also an early player in the portability arena. According to Wikipedia, “… the increasing popularity of FORTRAN spurred competing computer manufacturers to provide FORTRAN compilers for their machines, so that by 1963 over 40 FORTRAN compilers existed. For these reasons, FORTRAN is considered to be the first widely used cross-platform programming language.”

The fact that a program was written in C provides no indication whatsoever as to the effort required to port it. The task may be trivial, difficult, impossible, or uneconomical. Given that a program was written in a language without regard to the possibility of its being ported to some different environment, the ease with which it may actually be ported to that environment probably depends as much on the discipline and idiosyncrasies of its author as on the language itself.

Designing a program to be portable over a range of environments, some of which may not yet be defined, may be difficult, but it is not impossible. It just requires considerable discipline and planning. It requires understanding and controlling (and eliminating where reasonable) the use of features which may give unacceptably different results in your expected different environments. This understanding helps you avoid knowingly (or more likely, unknowingly) counting on non-portable features or characteristics of the program you are writing. In addition, frequently one of the key aims in such a project is not to write a program that will run on any system without modification, but to isolate environment-specific functions, so they may be rewritten for new systems. The major portability considerations are much the same for any language. Only the specific implementation details are determined by the language used.

== The Economics of Portability ==
Two main requirements for being successful at porting are: having the necessary technical expertise and tools for the job, and having management support as well as approval. That said, it must be acknowledged that many projects are implemented by individuals or a small group with no management, yet portability is still desired.

Clearly, one needs to have, or be able to get and keep, good C programmers. The term “good” does not imply guru status alone, or at all, because such staff can often have egos that are difficult to manage. And perhaps the most important attribute required in a successful porting project is discipline, both at the individual and at the group levels.

The issue of management support is often more important, yet it is largely ignored both by the developers and by management itself. Consider the following scenario. Adequate hardware and software are provided for all (or a representative subset) of the specified target environments and the development group religiously runs all its code through all targets at least weekly. Often, it submits a test stream in a batch job every evening.

Six months into the project, management reviews progress and finds that the project is taking more resources than anticipated (doesn't it always?) and decides to narrow the set of targets, at least on a temporary basis. That is, “We have to have something tangible to demonstrate at tradeshows because we have already announced the product” or “The venture capitalists are expecting to see a prototype at the next board meeting.” Whatever the reasons, testing on, and development specifically for, certain targets is suspended, often permanently.

From that point on, the development group must ignore the idiosyncrasies of the dropped machines because they are no longer part of the project, and the company cannot afford the extra resources to consider them seriously. Of course, management's suggestion is typically, “While we don't want you to go out of your way to support the dropped environments, it would be nice if you don't do anything to make it impossible or inefficient for us to pick them up again at some later date.”

Of course, as the project slips even further, competitors announce and/or ship alternative products, or the company falls on hard economic times, other targets may also be dropped, possibly leaving only one because that is all that development and marketing can support. And each time it drops a target, the development group starts to cut corners because it no longer has to worry about those other hardware and/or operating system targets. Ultimately, this decreases the chances of ever starting up on dropped targets at some later date, as all code designed and written since support for those targets was dropped needs to be inspected (assuming, of course, that this code can even be identified) to ascertain the effort required for, and the impact of, resuming support for that target. You may well find that certain design decisions that were made either prohibit or negatively impact reactivating the abandoned project(s).

The end result often is that the product is initially delivered for one target only and is never made available in any other environment. Another situation is to deliver for one target, and then go back and salvage “as much as possible” for one or more other targets. In such cases, the task may be no different from one in which you are porting code that was never designed with portability in mind.

== Measuring Portability ==
How do you know when or if a system has been successfully ported? Is it when the code compiles and links without error? Do results have to be identical? If not, what is close enough? What test cases are sufficient to demonstrate success? In all but the most trivial of cases, you will not be able to test exhaustively/completely every possible input/situation.

Certainly, the code must compile and link without error, but because of implementation-defined behavior, it may be quite possible to get different results from different targets. The legitimate results may even be sufficiently different as to render them useless. For example, floating-point range and precision may vary considerably from one target to the next such that results produced by the most limited floating-point environment are not precise enough. Of course, this is a design question and should be considered well before the system is ported.
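One way to turn that design question into something the compiler checks on every target is sketched below. The requirement of 10 decimal digits and the names REQUIRED_DECIMAL_DIGITS and measurement are assumptions made up for this example. The sketch also assumes C99 or later, where DBL_DIG is usable in an #if directive; C89 guaranteed that only for FLT_RADIX.

```c
#include <float.h>

/* Hypothetical application requirement: results must be good to at
   least this many decimal digits. */
#define REQUIRED_DECIMAL_DIGITS 10

/* Refuse to build on a target whose double is too limited, rather
   than discover the problem in the results after the port. */
#if DBL_DIG < REQUIRED_DECIMAL_DIGITS
#error "double does not provide the precision this program requires"
#endif

typedef double measurement;

/* Number of decimal digits the chosen type can represent reliably. */
int measurement_digits(void)
{
    return DBL_DIG;
}

/* Nonzero if the smaller float type would also meet the requirement. */
int float_would_suffice(void)
{
    return FLT_DIG >= REQUIRED_DECIMAL_DIGITS;
}
```

On a target that maps double to a 32-bit format, the build stops with a clear message instead of silently producing results that are legitimate but useless.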

A general misconception is that exactly the same source code files must be used on all targets such that the files are full of conditionally compiled lines. This need not be the case at all. Certainly, you might require custom headers for some targets. You might also require system-specific code written in C, and possibly in assembler or other languages. Provided such code is isolated in separate modules and the contents of and interfaces to such modules are well documented, this approach need not be a problem.
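A minimal sketch of that kind of isolation follows. All names are hypothetical, and the assumption that every non-Windows target uses / as its path separator is made only to keep the example short. In a real project, each target would have its own source file implementing the interface; here the alternatives are folded into one listing.

```c
#include <stdio.h>

/* --- platform.h (hypothetical): the interface each target implements --- */
char platform_path_separator(void);

/* --- platform_<target>.c (hypothetical): one such file per target.
   A conditional is used here only so the listing is self-contained. --- */
#if defined(_WIN32)
char platform_path_separator(void) { return '\\'; }
#else
char platform_path_separator(void) { return '/'; }
#endif

/* --- portable code: no conditional compilation is needed here --- */

/* Writes dir, the separator, and file into out. Returns nonzero on
   success and zero if the result did not fit in out_size characters. */
int join_path(char *out, size_t out_size, const char *dir, const char *file)
{
    int n = snprintf(out, out_size, "%s%c%s",
                     dir, platform_path_separator(), file);
    return n >= 0 && (size_t)n < out_size;
}
```

The portable part of the program calls join_path and never mentions a specific target. Supporting a new environment then means supplying one new implementation of the small interface.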

If you are using the same data files across multiple targets, you will need to ensure that the data is ported correctly, particularly if it is stored in binary rather than text format, and endian differences are involved. If you do not, you may waste considerable resources looking for non-existent code bugs.
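One common way to sidestep endian differences is to define the byte order of the file format and convert explicitly, instead of writing the in-memory representation of an object. The sketch below assumes a hypothetical file format that stores 32-bit unsigned values most significant byte first. The function names are also hypothetical.

```c
/* Stores the low 32 bits of value into out, most significant byte
   first, independent of the byte order of the machine. */
void store_u32_be(unsigned char out[4], unsigned long value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}

/* Rebuilds the value from four bytes stored most significant first. */
unsigned long load_u32_be(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24)
         | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] << 8)
         |  (unsigned long)in[3];
}
```

Because the functions work with shifts on values rather than with the bytes of an object in memory, the same source produces the same file contents on big-endian and little-endian targets. unsigned long is used because Standard C guarantees it holds at least 32 bits.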

Unless you have adequately defined what your specific portability scenario and requirements are, you cannot tell when you have achieved it. And by definition, if you achieve it, you must be satisfied. If you are not, either your requirements have changed, or your design was flawed. And most importantly, successfully porting a program to some environments is not a reliable indication of the work involved in porting it to yet another target.

== Environmental Issues ==
As pointed out in other sections, some portability issues have little or nothing to do with the implementation language. Rather, such issues are relevant to the hardware and operating system environments on which the program must execute. Some of these issues are hinted at in the main body of this book; they are summarized here, as follows:

<ul> <li>

Mixed-language environments. Certain requirements may be placed on C code that is to call, or be called by, some other language processor.

</li> <li>

Command-line processing. Not only do different command-line processors vary widely in their behavior, but the equivalent of a command-line processor may not even exist for some of your targets.

</li> <li>

Data representation. This is, of course, completely implementation-defined and may vary widely. Not only can the size of an int differ across your targets, but you are not even guaranteed that all bits allocated to an object are used to represent the value of that object. Another significant problem is the ordering of bytes within words and words within long words. Such encoding schemes are referred to as big-endian or little-endian.

</li> <li>

CPU speed. It is a common practice to assume that executing an empty loop n times in a given environment causes a pause of 5 seconds, for example. However, running the same program on a faster or slower machine will invalidate this approach. (The same is true when running it on versions of the same processor having different clock speeds.) Or perhaps the timing is slightly different when more (or fewer) programs are running on the same system. Related issues include the frequency and efficiency of handling timer interrupts, both hardware and software.

</li> <li>

Operating system. If even present (free-standing C does not require an operating system), the principal issues are single- versus multi-tasking and fixed- versus virtual-memory organization. Other issues involve the ability to field synchronous and asynchronous interrupts, the existence of reentrant code, and shared memory. Seemingly simple tasks such as getting the system date and time may be impossible on some systems. Certainly, the granularity of system time measurement varies widely.

</li> <li>

File systems. Whether multiple versions of the same file can coexist, or whether the date and time of creation or last modification are stored, is implementation-dependent. Likewise for the character set permitted in file names, length of the names, and whether or not such names are case-sensitive. And as for device and directory-naming conventions, the variations are as broad as their inventors' imaginations. Consequently, the C Standard says nothing about file systems except for sequential files being accessed by a single user.

</li> <li>

Development support tools. These tools may have a significant effect on the way you write, or are required to write, code for a given system. They include the C translator, linker, object and source librarian, assembler, source-code management system, macro preprocessors, and utility libraries. Examples of restrictions include the casing, significance, and number of external identifiers, perhaps even the size of each object module or the number and size of source modules. Perhaps the overlay linker has significant restrictions on the complexity of the overlay scheme.

</li> <li>

Cross-compilation. In environments where the target is not the system on which the software is being developed, differences in character sets, arithmetic representations, and endianness become important.

</li> <li>

Screen and keyboard devices. The protocols used by these vary widely. While many implement some or all of various ANSI standards, just as many do not, or contain incompatible extensions. Getting a character from the standard input without echoing it, or without needing to press the return or enter key as well, might not be universally possible. The same is true for direct cursor addressing, graphics display, and input devices such as light pens, track balls, and mice.

</li> <li>

Other peripheral interfaces. Your design may call for interactions with printers, plotters, scanners, and modems, among other pieces of equipment. While some de facto standards may exist for each, you may be forced, for one reason or another, to adopt “slightly” incompatible pieces.

</li></ul>

Programmer Portability
In all the discussions on portability, we continually refer to the aspect of moving code from one environment to another. And while this is an important consideration, it is more likely that C programmers will move to a different environment more often than the software they write. For this reason, the author has coined the term programmer portability.

Programmer portability can be defined as the ease with which a C programmer can move from one environment to another. This is an issue important to any C project, not just one that involves code portability. If you adopt certain programming strategies and styles, you can make it much easier and quicker to integrate new team members into the project. Note though that, while you may have formulated a powerful approach, if it is too far from mainstream C practice, it will be difficult or expensive to teach, and it will be hard to convince other C programmers of its merits.

= The Environment =

When a C program is written, consider two primary environments—that in which it is compiled (that is, translated) and that in which it is executed. For the vast majority of C programs, these two environments are likely to be one and the same. However, C is used in an increasing number of situations where the execution environment has properties different from that of the translation environment.

Translation Phases
Prior to C89, C compilers varied in the way in which they recognized and processed tokens. To nail down the order in which source tokens should be processed, Standard C explicitly identifies a set of rules collectively known as the phases of translation. These rules break programs that previously relied on a different order of translation.

Recommendation: Read and understand Standard C’s phases of translation, so you can see if your implementation follows them.

Standard C does not require the preprocessor to be a separate/stand-alone program, although it permits it. For the most part, the preprocessor is permitted to work without knowing the specific properties of the target implementation. (One exception is that Standard C requires preprocessing arithmetic expressions to be computed using a given type; see  Arithmetic.)

Diagnostics
Standard C defines the circumstances in which a conforming implementation is required to issue a diagnostic. The form of the diagnostic is implementation-defined. The Standard makes no statement about information or warning messages such as “Variable  used before being initialized” and “Unreachable code.” These are considered to be quality of implementation issues best left for the marketplace to decide.

Standard C allows extensions provided they do not render a strictly conforming program invalid. A conforming compiler must be able to disable (or diagnose) extensions. Extensions to a conforming compiler are limited to assigning meaning to syntax not given semantics by Standard C, or defining the meaning of otherwise undefined or unspecified behaviors.

Execution Environments
Standard C defines two kinds of execution environments: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment.

The manner and timing of static initialization is unspecified. However, all objects in static storage must be initialized before program startup. For hosted environments, the designated C function is typically called main, although it need not be. With Standard C, function main is called at program startup. A program is not strictly conforming if an entry point other than main is used.

Recommendation: For hosted applications, always use main as the program's entry point unless you have a particularly good reason for not doing so, and you make sure you adequately document it.

Program termination is the return of control to the execution environment.

Freestanding Environment
A freestanding environment runs without the benefit of an operating system, and, as a result, program execution can begin in any manner desired. Although such application environments are somewhat nonportable by definition, much of their code can often be ported (to an upwards-compatible series of device controllers, for example) if designed properly. Even writers of embedded systems need to port to new and different environments.

The name and type of the function called at program startup is implementation-defined as is the method of program termination.

The library facilities (if any) available to a freestanding program are implementation-defined. However, Standard C requires the headers float.h, iso646.h, limits.h, stdalign.h, stdarg.h, stdbool.h, stddef.h, stdint.h, and stdnoreturn.h.

Hosted Environment
Standard C permits main to have either no parameters, as in int main(void), or two, as in int main(int argc, char *argv[]).

(Of course, argv could instead be declared char **argv, and the parameter names argc and argv are arbitrary.)

A common extension is that the function main receives a third argument, char *envp[], in which envp leads to a null pointer-terminated array of pointers to char, each of which points to a string that provides certain information about the environment for this execution of the process. Any program that defines main with more than two parameters is not strictly conforming. That is, it is not maximally portable.

Recommendation: Use the library function getenv instead of the envp parameter in main to access environment variables. Note, however, that the format of the string returned by getenv, and the set of environment variables, are implementation-defined.
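As an illustration of looking up an environment variable through the library rather than through a third parameter to main, here is a minimal sketch. The helper name env_or_default is hypothetical; only getenv itself comes from Standard C.

```c
#include <stdlib.h>

/* Return the value of the named environment variable, or fallback
   if the variable is not defined in this environment. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    return value != NULL ? value : fallback;
}
```

The string returned by getenv should be treated as read-only, and which variable names exist (and what their values look like) remains implementation-defined.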

Some user manuals and books erroneously suggest main be defined as having type void (or some other type) instead of int because many programs rarely, if ever, explicitly return from main (with or without a return value).

Recommendation: Always define function main as having int type and return an appropriate exit code.

Standard C requires that argc be nonnegative. Traditionally, argc is at least one even if argv[0] is set to point to an empty string.

Recommendation: Do not assume that argc is always greater than zero; Standard C permits it to be zero.

Standard C requires that argv[argc] contain the null pointer. This means that the argv array contains argc + 1 elements, not argc. This allows the argv pointer array to be processed without regard to the value of argc.
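Because the argument array ends with a null pointer, it can be walked without consulting the argument count at all. A minimal sketch follows; the function name count_args is hypothetical, and it would be passed the second parameter of main.

```c
#include <stddef.h>

/* Count the entries in an argument vector by walking it until the
   terminating null pointer is reached. */
int count_args(char *argv[])
{
    int n = 0;

    while (argv[n] != NULL)
        ++n;
    return n;
}
```

The loop also behaves correctly when the count is zero, since the first element is then the null pointer.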

Standard C makes no comment about the handling of quoted literals on command lines. Therefore, the ability to handle quoted strings at all, or those containing embedded white space, is implementation-dependent. If the host environment cannot handle command-line arguments containing both alphabetic cases, it must supply text arguments in lowercase.

Recommendation: Make no assumptions about special handling of quoted literals in command-line processing. Such quotes may delimit strings, or they may be considered part of the string in which case,  would result in two arguments   and. The casing of letters might not be preserved, even in the presence of quotes. (Use  (or  ) with command-line arguments before comparing them against a list of valid strings.) Even if quotes are recognized, the method of escaping a quote (so it can be passed in an argument) may vary. Standard C doesn't even require that a command-line environment exist.
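One way to avoid depending on the casing of command-line arguments is to compare them without regard to case. The sketch below is illustrative only; the function name equal_ignoring_case is hypothetical, and the comparison uses tolower from ctype.h, so its notion of case follows the current locale.

```c
#include <ctype.h>

/* Return 1 if the two strings are equal when case is ignored, else 0. */
int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* Convert through unsigned char so negative char values are safe. */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```

A program could apply this to each argument when matching it against its list of valid switches.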

A primary use of command-line arguments is to specify switches that determine the kind of processing to be done by the program being invoked. In a text-processing utility, for example, you may wish to use multi-word switches. In this case, connect the words using an underscore as follows:

and ignore case during switch processing. With care, you can design a command-line argument syntax that is extremely portable. Take care though that you don't need a larger command-line buffer than a system can support. If a program can potentially have many and/or long arguments, you should put them in a configuration file and pass its name as a command-line argument. For example,

allows an unlimited number of arguments to be processed regardless of command-line buffer size.

According to Standard C, argv[0] represents the “program name” (whatever that may translate to for a given implementation). If this is not available, argv[0] must point to an empty string. (Some systems that cannot determine the program's name have argv[0] point to a string such as   or  .)

Recommendation: Don't assume the program's name is available. Even if argv[0] points to the program's name, the name may be as it was specified on the command-line (possibly with case conversion) or it may be the translated name the operating system used to actually locate and load the program. (Full name translation is often useful if you wish to parse the string pointed to by argv[0] to determine certain disk and directory information.)

Standard C requires that the parameters argc and argv, and the strings pointed to by the argv array, be modifiable by the user program and that these not be changed by the implementation while the user program is executing.

Numerous environments support the command-line operators,  , and. In such systems, these characters (and the filenames that accompany them) are handled by the command-line processor (and removed) before it passes off the remaining command-line to the execution environment. Systems that do not handle these operators in such a manner pass them through to the execution environment as part of the command-line where they can be handled or passed through to the application program. Such operators are outside the scope of Standard C.

The above-mentioned operators typically allow redirection of stdin and stdout. Some systems allow stderr to be redirected. Some systems consider stderr to be the same as stdout.

Recommendation: Don't assume universal support for the command-line redirection operators , , and . Redirection of the standard files may be possible from within a program via the freopen library function.

Recommendation: Write error messages to stderr rather than stdout even if both file pointers are treated as the same. This way, you can take advantage of systems that do allow stdout and stderr to be independently redirected.

The method used to invoke main during program startup can vary. Standard C requires that it be done as if the call exit(main(argc, argv)) were used, in which case any value returned from main will be passed on as the program's exit code.

Starting with C99, dropping through the closing brace of main results in an exit code of zero.

Some implementations may restrict exit codes to unsigned integral values or to those values that fit into a byte. Refer to the library function exit for more details. Also, although some systems interpret an exit code of 0 as success, others may not. Standard C requires that 0 mean “success.” It also provides the implementation-defined macros EXIT_SUCCESS and EXIT_FAILURE in stdlib.h.

Recommendation: The range of values, meaning, and format of exit codes are implementation-defined. Even though main returns an int value, that value may be modified, truncated, etc., by the termination code before being handed to the host system. Use EXIT_SUCCESS rather than 0 to indicate a success exit code.
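The recommendations about error output and exit codes can be combined in a small sketch. The function names fail_with_message and run are hypothetical, as is the idea of a have_input flag; the intent is that main would simply return whatever run returns.

```c
#include <stdio.h>
#include <stdlib.h>

/* Report a problem on the standard error stream and hand back the
   portable failure status. */
int fail_with_message(const char *message)
{
    fprintf(stderr, "error: %s\n", message);
    return EXIT_FAILURE;
}

/* Hypothetical top-level routine whose result main could return. */
int run(int have_input)
{
    if (!have_input)
        return fail_with_message("no input given");
    return EXIT_SUCCESS;
}
```

Using the two macros leaves it to each implementation to map success and failure onto whatever values its host environment expects.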

If you are using exit codes to return information from one user program to its parent user program, you are typically free to adopt your own value conventions because the host environment probably won't be processing the exit code directly.

Program Execution
Standard C goes to some lengths to define an abstract machine. At certain specified points in the execution sequence called sequence points, all side-effects of previous evaluations shall be complete, and no side-effects of subsequent evaluations shall have taken place.

One particular problem has been the handling of terminal input and output where some implementations have used buffered I/O while others used unbuffered I/O.

An optimizing compiler is permitted to optimize across sequence points provided it can guarantee the same result as if it had followed them rigorously.

C11 added support for multiple threads of execution. Previously, multi-threaded programs used library functions and/or compiler extensions.

Character Sets
A C program is concerned with two possible character sets: source and execution. The source character set is used to represent the source code program, and the execution character set is available at run time. Most programs execute on the same machine on which they are translated, in which case their source and execution character sets are the same. Cross-compiled programs generally run on a different machine than that used for their development, in which case the source and execution sets might be different.

The characters in the source character set, except as explicitly specified by Standard C, are implementation-defined. The characters in the execution character set (except for the '\0' character) and their values are implementation-defined. The execution character '\0' must be represented by all-bits zero.

The meaning of an unspecified character in the source text, except in a character constant, a string literal, or a comment, is implementation-defined.

While many C programs are translated and execute in an ASCII (and now Unicode) environment, other character sets are in use. As the set of upper- and/or lowercase letters may not be contiguous (as with EBCDIC), care must be taken when writing routines that handle multiple character sets. It is also possible when dealing with non-English letters that they do not have a corresponding upper- or lowercase equivalent. The collating sequence of character sets is also important when using the library function.

Recommendation: If you write code that is specific to a particular character set, either conditionally compile it based on the host character set or document it as being an implementation-specific module. Use the ctype.h functions rather than testing characters against a specific set or range of integers.
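To illustrate classifying characters without assuming a particular character set, the sketch below counts uppercase letters with isupper instead of testing a range such as 'A' through 'Z', which would fail where letters are not contiguous. The function name count_uppercase is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the uppercase letters in a string, as defined by the
   current locale, without assuming any character encoding. */
size_t count_uppercase(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s)
        if (isupper((unsigned char)*s))   /* cast avoids negative char values */
            ++n;
    return n;
}
```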

Trigraph Sequences
In certain environments, some of the required source characters are not available to programmers. This is typically because they are using a machine with a character set that does not include all the necessary punctuation characters. (It may also be because they are using a keyboard that does not have keys for all the necessary punctuation characters.)

To enable the input of characters that are not defined in the ISO 646–1983 Invariant Code Set (which is a subset of the seven-bit ASCII code set), the following trigraph sequences were introduced by C89:

A trigraph is a token consisting of three characters, the first two of which are ??. The three characters collectively are taken to represent the corresponding character in the table above.
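For reference, the nine trigraph sequences defined by Standard C (from C89 through C17) can be tabulated in code. This is only a sketch; the names trigraph, trigraphs, trigraph_count, and trigraph_replacement are hypothetical. Each question mark is written with a backslash escape so that the strings themselves are not subject to trigraph replacement.

```c
#include <stddef.h>
#include <string.h>

struct trigraph {
    const char *sequence;   /* the three-character trigraph */
    char replacement;       /* the single character it stands for */
};

const struct trigraph trigraphs[] = {
    { "\?\?=", '#'  },
    { "\?\?(", '['  },
    { "\?\?/", '\\' },
    { "\?\?)", ']'  },
    { "\?\?'", '^'  },
    { "\?\?<", '{'  },
    { "\?\?!", '|'  },
    { "\?\?>", '}'  },
    { "\?\?-", '~'  },
};

const size_t trigraph_count = sizeof trigraphs / sizeof trigraphs[0];

/* Return the character a trigraph stands for, or '\0' if seq
   is not one of the nine defined sequences. */
char trigraph_replacement(const char *seq)
{
    size_t i;

    for (i = 0; i < trigraph_count; ++i)
        if (strcmp(seq, trigraphs[i].sequence) == 0)
            return trigraphs[i].replacement;
    return '\0';
}
```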

The addition of support for trigraphs in a compiler may change the way existing character constants or string literals are interpreted. For example,

will be treated as if it had been written as

and  will be two, not four.

If such literal strings are intended to be displayed, then the impact of moving to a system supporting trigraphs from one that doesn't will be minimal and overt—the user will see a slightly different output. However, if the program parses a string expecting to find a specific character, such as , it will no longer find it if it has been previously interpreted as part of a trigraph sequence.

Even though the vast majority of C programmers likely will have no use for trigraphs, a conforming implementation is required to support them. Therefore, you need to be aware of their existence so you can understand why seemingly innocent strings are being “misinterpreted.”

Recommendation: Use a search program to check if sequences of ?? occur in existing source. If they do occur in more than a few places, you may wish to search specifically for the trigraph sequences.

Recommendation: To preserve sequences that look like trigraphs but are not intended to be, use the Standard C escape sequence \? to force a literal ? character in a literal string or single-character constant. For example,  is four as is.

Recommendation: If your implementation doesn't support trigraphs, you can protect against them in the future by using the \? sequence now because the backslash is supposed to be ignored if it does not begin a recognized escape sequence.

While some compilers recognize trigraphs, other implementations require the use of a standalone tool to convert code containing trigraphs to code without them.

C95 added digraphs as a mechanism to allow sometimes-unavailable source tokens to have alternate spellings (see Source Tokens). Unlike trigraphs, digraphs are tokens, so they can’t be recognized inside another token, such as a character constant or string literal.

Multibyte characters
C89 introduced the notion of a multibyte character. Certain aspects of the handling of such characters are locale specific. Prior to that, some implementations used double-byte and other approaches to dealing with extended characters.

Character Display Semantics
The handling of certain escape sequences in Standard C involves locale specific or unspecified behavior.

C89 defined the escape sequences \a and \v.

Some systems treat \n as a carriage-return and a new-line, while others treat it as just a new-line.

Signals and Interrupts
Standard C places certain restrictions on the kinds of objects that can be modified by signal handlers. With the exception of the signal function, the Standard C library functions are not guaranteed to be reentrant and they are permitted to modify static data objects.

Environmental Limits
There are a number of environmental constraints on a conforming implementation, as discussed below.

Translation Limits
As of C17, Standard C requires that “The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:

<ul> <li>

127 nesting levels of blocks

</li> <li>

63 nesting levels of conditional inclusion

</li> <li>

12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or void type in a declaration

</li> <li>

63 nesting levels of parenthesized declarators within a full declarator

</li> <li>

63 nesting levels of parenthesized expressions within a full expression

</li> <li>

63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character)

</li> <li>

31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or less is considered 6 characters, each universal character name specifying a short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)

</li> <li>

4095 external identifiers in one translation unit

</li> <li>

511 identifiers with block scope declared in one block

</li> <li>

4095 macro identifiers simultaneously defined in one preprocessing translation unit

</li> <li>

127 parameters in one function definition

</li> <li>

127 arguments in one function call

</li> <li>

127 parameters in one macro definition

</li> <li>

127 arguments in one macro invocation

</li> <li>

4095 characters in a logical source line

</li> <li>

4095 characters in a string literal (after concatenation)

</li> <li>

65535 bytes in an object (in a hosted environment only)

</li> <li>

15 nesting levels for #included files

</li> <li>

1023 case labels for a switch statement (excluding those for any nested switch statements)

</li> <li>

1023 members in a single structure or union

</li> <li>

1023 enumeration constants in a single enumeration

</li> <li>

63 levels of nested structure or union definitions in a single struct-declaration-list”

</li></ul>

These numbers are somewhat misleading. In effect, Standard C does not guarantee any specific support for all combinations of limits.

Numerical Limits
A conforming implementation must document these limits via a series of macros defined in the headers limits.h and float.h. Additional limits are specified in stdint.h, which was added by C99.
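A program can inspect these documented limits rather than assume them. The sketch below prints a handful of the macros from limits.h and float.h; the function name print_limits is hypothetical, and the values shown will differ from one implementation to another.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Print a few of the numerical limits the implementation documents. */
void print_limits(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("INT_MIN  = %d, INT_MAX = %d\n", INT_MIN, INT_MAX);
    printf("LONG_MAX = %ld\n", LONG_MAX);
    printf("FLT_DIG  = %d, DBL_DIG = %d\n", FLT_DIG, DBL_DIG);
    printf("DBL_MAX  = %e\n", DBL_MAX);
}
```

Standard C only guarantees minimum magnitudes for these macros, for example that CHAR_BIT is at least 8 and INT_MAX is at least 32767.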

Starting with C99, the existence of the optional predefined macro __STDC_IEC_559__ indicates support for the IEC 60559 floating-point standard, as described in an annex of the C Standard.

C99 added complex types and their associated arithmetic. Starting with C11, the absence of the optional predefined macro __STDC_NO_COMPLEX__ indicates support for them. Furthermore, the existence of the optional predefined macro __STDC_IEC_559_COMPLEX__ indicates that complex support conforms to IEC 60559, as described in an annex of the C Standard.

See also  and.

= Lexical Elements =

Source Tokens
Standard C requires that when source input is parsed into tokens, each token must be formed from the longest possible sequence of characters. There must be no ambiguity as to what a particular construct means. For example, the text a+++++b must generate a syntax error because the tokens found are a, ++, ++, +, and b, and the (postfix) second ++ operator has an operand that is not an lvalue. Note that a++ + ++b is valid, as the white space causes the tokens to be parsed as a, ++, +, ++, and b. Likewise, for.
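The longest-token rule can be seen in a short sketch that uses two hypothetical int variables, a and b. The function names are likewise hypothetical. The five-plus form is left in a comment because it does not compile.

```c
/* With white space, the tokens are a, ++, +, ++, b. */
int spaced_form(void)
{
    int a = 1, b = 2;

    return a++ + ++b;      /* 1 + 3 */
}

/* Three pluses with no white space are taken as a, ++, +, b. */
int unspaced_three_pluses(void)
{
    int a = 2, b = 3;

    return a+++b;          /* (a++) + b, that is, 2 + 3 */
}

/* The form a+++++b would be taken as a, ++, ++, +, b and rejected,
   because the second ++ is applied to something that is not an lvalue. */
```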

Archaic: Prior to C89, some preprocessors allowed tokens to be created from other tokens. For example:

The intent here is that the macro expands to the single token  rather than the two tokens   and. It relies on the non-Standard C approach of replacing the comment in the macro definition with nothing, rather than a single space. Standard C added the preprocessor token-pasting operator, ##, as a portable solution for achieving the desired behavior.

Prior to C89, some preprocessors allowed string literal tokens to be created during preprocessing. Standard C added the preprocessor stringize operator, #, as a portable solution for achieving the desired behavior.
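The two Standard C preprocessor operators for building tokens and string literals can be sketched as follows. The macro names PASTE and STRINGIZE and the variable counter1 are hypothetical, chosen only for illustration.

```c
#define PASTE(x, y)   x ## y   /* join two tokens into a single token */
#define STRINGIZE(x)  #x       /* turn the argument's spelling into a string literal */

int paste_demo(void)
{
    int counter1 = 42;

    return PASTE(counter, 1);      /* expands to the single token counter1 */
}

const char *stringize_demo(void)
{
    return STRINGIZE(a + b);       /* expands to the string literal "a + b" */
}
```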

Recommendation: Avoid exploiting idiosyncrasies of preprocessors that follow tokenizing rules different from those defined by Standard C.

Keywords
The following tokens are defined as keywords by Standard C:

Archaic: Although enum and void were not defined in K&amp;R, they were supported by various compilers prior to C89.

Standard C does not define or reserve the keyword entry, previously reserved in K&amp;R and by some older C compilers.

C++ Consideration: Standard C++ does not define or reserve the keyword. Nor does it define those keywords beginning with underscore and an uppercase letter. (However, for some, it provides alternate spellings, such as,  ,  , and  .)

Many compilers support extended keywords, some starting with one or two underscores, or with names in the programmer identifier space.

Spelling
K&amp;R and C89 allowed underscores, English upper- and lowercase letters, and decimal digits.

The set of external names allowed by an (older) environment might not include underscore and might not be case-sensitive, in which case, some characters in external names might be mapped to something else.

C99 added the predefined identifier __func__. C99 also added support for Universal character names in identifiers (see Universal character names), as well as any number of implementation-defined extended characters.

C++ Consideration: The following tokens are defined as keywords by Standard C++:

Some of these names are defined as macros in Standard C (such as  in  ). These are discussed elsewhere.

C++ Consideration: Standard C++ gives special meaning to the following identifiers:,  ,  , and.

Recommendation: If there is a possibility that your C code will be run through a C++ compiler, avoid using identifiers that Standard C++ defines as keywords or identifiers with special meaning.

C++ Consideration: According to Standard C++: “Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use” and “Each identifier that begins with an underscore is reserved to the implementation for use as a name in the global namespace.”

Length and Significant-Character Limits
While Standard C places no maximum length on an identifier, the number of characters treated as significant might be limited. Specifically, the length limit for an external name may be more restrictive than that for an internal name (typically due to linker considerations). The number of significant characters in an identifier is implementation-defined. Standard C requires implementations to distinguish at least the first 31 characters in an external identifier, and the first 63 in an internal identifier.

Name Spaces
K&amp;R defined two disjoint categories of identifiers: those associated with ordinary variables, and structure and union members and tags.

Standard C added several new categories of identifier name space. The complete set is labels; structure, union, and enumeration tags; the members of structures and unions (with each structure and union having its own name space); and all other identifiers, called ordinary identifiers.

The identifiers optionally allowed in Standard C function prototypes have their own name space. Their scope is from their name through the end of that prototype declaration. Therefore, the same identifier can be used in different prototypes, but it cannot be used twice in the same prototype.

K&amp;R contained the statement, “Two structures may share a common initial sequence of members; that is, the same member may appear in two different structures if it has the same type in both and if all previous members are the same in both. (Actually, the compiler checks only that a name in two different structures has the same type and offset in both, but if preceding members differ the construction is nonportable.)” Standard C eliminated this restriction by endorsing the separate member-per-structure name space.

Universal Character Names
C99 added support for universal character names. They have the form \uXXXX and \UXXXXXXXX, where X is a hexadecimal digit. They may appear in identifiers, character constants, and string literals.

Constants
Standard C requires that an implementation use at least as much precision as is available in the target execution environment when handling constant expressions. It may use more precision.

Integer Constants
C89 provided the suffix u (and U) to support unsigned constants. These suffixes can be used with decimal, octal, and hexadecimal constants. long constants may be suffixed with l (or L).

C99 added the type long long int, and constants of that type may be suffixed with ll (or LL). C99 also added the type unsigned long long int, and constants of that type may be suffixed with ull (or ULL).

K&amp;R permitted octal constants to contain the digits 8 and 9 (which have octal value 10 and 11, respectively). Standard C does not allow these digits in octal constants.

The type of an integral constant depends on its magnitude, its radix, and the presence of optional suffix characters. This can cause problems. For example, consider a machine on which an int is 16 bits, and twos-complement representation is used. The smallest int value is -32768. However, the type of the expression -32768 is long, not int! There is no such thing as a negative integer constant; instead, we have two tokens: the integer constant 32768 and the unary minus operator. As 32768 is too big to fit in 16 bits, it has type long, and the value is negated. As such, passing -32768 as an argument in a function call without a function prototype in scope might cause an argument/parameter mismatch. If you look at the definition of INT_MIN in limits.h for an implementation on such a machine, you likely will find something like (-32767 - 1). This satisfies the requirement that that macro have the type int.

Regarding radix, on this 16-bit machine, 0xFFFF has type unsigned int while -32768 has type long.

A similar situation occurs with the smallest value for a 32-bit, twos-complement integer, -2147483648, which might have type long or long long instead of int, depending on type mapping.

Recommendation: Explicitly type integral constants (or cast them) when their type is important (e.g., as function call arguments and with the sizeof operator).
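On a C11 or later compiler, the type that an integer constant expression actually receives can be inspected with a generic selection. The following is a sketch under that assumption; the macro TYPE_NAME and the function names are hypothetical. What the second function returns depends on the widths of int and long on the target.

```c
#include <limits.h>

/* Map an expression to the name of its type (C11 generic selection). */
#define TYPE_NAME(x) _Generic((x),                      \
    int: "int", unsigned int: "unsigned int",           \
    long: "long", unsigned long: "unsigned long",       \
    long long: "long long",                             \
    unsigned long long: "unsigned long long",           \
    default: "other")

/* INT_MIN is required to have type int, so this returns 1. */
int int_min_is_int(void)
{
    return _Generic((INT_MIN), int: 1, default: 0);
}

/* The constant 2147483648 is typed first and negated afterwards, so the
   result here varies with the implementation's type sizes. */
const char *type_of_negated_literal(void)
{
    return TYPE_NAME(-2147483648);
}

/* A suffix states the intended type explicitly. */
int suffixed_is_unsigned(void)
{
    return _Generic((1U), unsigned int: 1, default: 0);
}
```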

A similar problem exists when passing a zero constant to a function expecting a pointer (intending this to mean “null pointer”) when no function prototype is in scope. The type of zero is int, whose size/format might not match the parameter’s pointer type. Besides, for machines having pointers that do not look like integers, no implicit conversion is done to compensate for this.

The correct thing to do is to use the NULL library macro, which is most often defined as 0, 0L, or ((void *)0).
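The same concern arises with variable argument lists, where no parameter type is available to drive a conversion. A common defensive idiom, sketched below, is to cast the null pointer to the exact pointer type the callee expects. The function count_strings is hypothetical and exists only to show the idiom.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts its string arguments up to
   a terminating null pointer of type char *. */
int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        ++n;
        s = va_arg(ap, char *);   /* caller must really pass a char * */
    }
    va_end(ap);
    return n;
}
```

A call would look like count_strings("a", "b", (char *)NULL). The cast guarantees that an object of pointer type is passed, whichever way the implementation happens to define the macro.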

Archaic: Prior to C89, different compilers used different rules for typing integer constants. K&R required the following: “A decimal constant whose value exceeds the largest signed machine integer is taken to be long; an octal or hexadecimal constant which exceeds the largest unsigned machine integer is likewise taken to be long.”

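Standard C requires the following rules for typing integer constants: “The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int; unsuffixed octal or hexadecimal: int, unsigned int, long int, unsigned long int; suffixed by the letter u (or U): unsigned int, unsigned long int; suffixed by the letter l (or L): long int, unsigned long int; suffixed by both u (or U) and l (or L): unsigned long int.” C99 added steps for long long int and unsigned long long int.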
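The effect of the suffixes can be checked with sizeof, as in this minimal sketch (the function name is hypothetical, and a C99-or-later compiler is assumed for the LL forms):

```c
/* Returns 1 if each constant has the size of the type its suffix names. */
int suffix_sizes_match(void)
{
    return sizeof(1)    == sizeof(int)
        && sizeof(1U)   == sizeof(unsigned int)
        && sizeof(1L)   == sizeof(long int)
        && sizeof(1UL)  == sizeof(unsigned long int)
        && sizeof(1LL)  == sizeof(long long int)           /* C99 */
        && sizeof(1ULL) == sizeof(unsigned long long int); /* C99 */
}
```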

Some compilers support integer constants expressed in binary (base-2, that is); others allow separators (such as underscore) between digits for all bases. These features are not part of Standard C.

Integer constants beginning with 0 are considered to be octal. The #line preprocessing directive has the form

# line digit-sequence new-line

Note carefully that the syntax does not involve integer-constant. Instead, digit-sequence is interpreted as a decimal integer even if it has one or more leading zeros!
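A minimal sketch of this behavior follows; the function name is hypothetical. Note that the directive also renumbers every source line that comes after it.

```c
/* Returns the line number the preprocessor assigns to the return statement. */
int line_after_directive(void)
{
#line 010
    return __LINE__;   /* numbered 10 (decimal), not 8 (octal) */
}
```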

Floating Constants
The default type of a floating-point constant is double. C89 added support for the type long double along with the floating constant suffix F (or f) for float constants, and L (or l) for long double constants.

Recommendation: Explicitly type floating-point constants (or cast them) when their type is important (e.g., as function call arguments and with the sizeof operator).

C99 added support for writing floating constants using hexadecimal notation.
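As a brief illustration of the notation (the function name is hypothetical): the digits after 0x are hexadecimal, and the mandatory p exponent is a power of two written in decimal.

```c
/* 0x1.8p1 is (1 + 8/16) * 2^1, which is exactly 3.0. */
double hex_float_three(void)
{
    return 0x1.8p1;
}
```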

C99 also added the macro FLT_EVAL_METHOD (in <float.h>), whose value might allow a floating constant to be evaluated to a format whose range and precision is greater than required. For example, a compiler has the freedom to (quietly) treat a constant of type float as if it had type double or even long double, instead.

Enumeration Constants
The names of the values defined for an enumeration are integer constants, and Standard C defines them to be ints.

K&R did not include enumerations.

C++ Consideration: An enumeration constant has the type of its parent enumeration, which is some integral type that can represent all the enumeration constant values defined in the enumeration.

Character Constants
The mapping of characters in the source character set to characters in the execution character set is implementation-defined.

The value of a character constant that contains a character or escape sequence not represented in the execution character set is implementation-defined.

The meaning of an unspecified escape sequence (except for a backslash followed by a lowercase letter) in a character constant or string literal is implementation-defined. Note that unspecified sequences with a lowercase letter are reserved for future use by Standard C. This means that a conforming implementation is quite free to provide semantics for '\E' (for the ASCII Escape character, for example) but it should not do so for '\e'.

Recommendation: Avoid using non-standard escape sequences in character constants.

The value of a character constant that contains more than one character is implementation-defined. On a 32-bit machine, it may be possible to pack four characters into a word using a constant such as 'abcd'. On a 16-bit machine, something like 'ab' might be permitted.

Recommendation: Avoid using multi-character constants, as their internal representation is implementation-defined.

Standard C supports the earlier popular extension of a hexadecimal-form character constant. This commonly has the form '\xh' or '\xhh' where h is a hexadecimal digit.

K&R declared that if the character following the backslash is not one of those specified, the backslash is ignored. Standard C says that the behavior is undefined.

Because Standard C, unlike some older implementations, does not permit the digits 8 and 9 in octal escape sequences, previously supported character constants such as '\078' take on new meaning.

To avoid confusion with trigraphs (which have the form ??x), the character constant '\?' was defined by C89. An existing constant of the form '\?' will now have different meaning.

Recommendation: Because of differing character sets, use graphic representations of a character instead of its internal representation. For example, use 'A' instead of '\101' in ASCII environments.

Some implementations may allow '' to represent the null character; Standard C does not.

K&R did not define the constant '\"', although it clearly is necessary inside of literal strings. In Standard C, the characters '"' and '\"' are equivalent.

C89 added the notion of a wide character constant, which is written just like a character constant, but with a leading L.

C99 added support for Universal character names in character constants. See Universal character names.

C11 added support for wide character constants with prefix u (or U).

Standard C requires that an integer character constant have type int.

C++ Consideration: Standard C++ requires that an integer character constant have type char.

String Literals
Standard C permits string literals having the same spelling to be shared, but does not require that.

On some systems, string literals are stored in read-write memory, on others, in read-only memory. Standard C states that if a program attempts to modify a string literal, the behavior is undefined.

Recommendation: Even if your implementation allows it, do not modify literal strings, because this is counterintuitive to the programmer. Also, do not rely on like strings being shared. If you have code that modifies string literals, change it to a character array initialized to that string and then modify that array. Not only does this not require literals to be modified, but it also allows you to share like strings explicitly by using the same array elsewhere.

Recommendation: It is common to initialize a pointer to char with a string literal. Assuming you are using a C89-or-later compiler, declare the pointer using const char * instead, so any attempt to modify the underlying string will be diagnosed.

C++ Consideration: Standard C++ requires that a string literal be implicitly const-qualified, which means that the following often-used C-idiom is not valid C++:

This must be written instead as
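A minimal sketch of the const-qualified form, which is accepted by both C and C++, follows. The names msg and message_length are hypothetical. The idiom it replaces is the same declaration without const.

```c
#include <string.h>

/* Pointing to const char lets the compiler diagnose any attempt
   to write to the literal through this pointer. */
static const char *msg = "some text";

size_t message_length(void)
{
    return strlen(msg);   /* reading through the pointer is fine */
}
```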

The maximum length of a literal string is implementation-defined, but Standard C requires it to be at least 509 characters.

Because Standard C, unlike some older implementations, does not permit the digits 8 and 9 in octal escape sequences, previously supported string literals such as "\078" take on new meaning.

K&R and Standard C permit a string literal to be continued across multiple source lines using the backslash/new-line convention as follows:

However, this requires that the continuation line begin exactly in the first column. An alternate approach is to use the string concatenation capability provided by C89 (and by some compilers prior to that), as follows:
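Both approaches can be sketched side by side as follows. The array and function names are hypothetical.

```c
#include <string.h>

/* Backslash/new-line splicing: the continuation must start in column 1,
   or the leading blanks become part of the string. */
static const char spliced[] = "abc\
def";

/* Adjacent string literals are concatenated, so indentation is harmless. */
static const char concatenated[] = "abc"
                                   "def";

/* Returns 1 if both spellings produce the same string. */
int same_string(void)
{
    return strcmp(spliced, concatenated) == 0;
}
```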

C89 added the notion of a wide string literal, which is written just like a string literal, but with a leading L (e.g., L"abc").

C99 added support for Universal character names in string literals. See Universal character names.

C11 added support for wide character string literals with prefix u (or U), and for UTF-8 string literals via the prefix u8.

Punctuators
Archaic: Prior to K&R, the compound-assignment operators were written as =op. However, K&R and Standard C write them as op= instead. For example, =+ became +=.

C89 added the ellipsis punctuator, ..., as part of the enhanced notation for function declarations and definitions. It also added the punctuators # and ##, which represent preprocessor-only operators.

C95 added the following digraph punctuators: <:, :>, <%, %>, %:, and %:%:.

Header Names
Standard C defines a grammar for header names. If the characters ', \, ", or /* occur in an #include directive of the form <...>, the behavior is undefined. The same is true for ', \, and /* when using an #include directive of the form "...".

In Standard C, when using the #include "..." form, the text "..." is not considered to be a string literal. In an environment using a hierarchical file system where one needs to use a \ to indicate a different folder/directory level, this backslash is not the beginning of an escape sequence, so should not itself need to be escaped.

Comments
C99 added support for line-oriented comments, which begin with //. Prior to C99, some implementations supported this as an extension.

Neither K&R nor Standard C supports nested comments, although a number of existing implementations do. The need for nested comments is primarily to allow a block of code containing comments to be disabled as follows:

The same effect can be achieved by using a conditional-compilation directive such as #if 0.
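A minimal sketch of disabling a block that itself contains comments, using conditional compilation, follows. The function and variable names are hypothetical.

```c
int counter_value(void)
{
    int count = 1;      /* starting value */
#if 0
    /* This block is skipped entirely, comments and all. */
    count = 100;        /* a comment inside the disabled code */
#endif
    return count;
}
```

Changing the 0 to 1 re-enables the block without touching the comments inside it.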

Standard C requires that during tokenization, a comment be replaced by one space. Some implementations replace them with nothing and, therefore, allow some clever token pasting. See Source Tokens for an example.

= Conversions =

Boolean, Characters, and Integers
Whether a plain char is treated as signed or unsigned is implementation-defined.

At the time C89 was being developed, two different sets of arithmetic conversion rules were in use: unsigned preserving (UP) and value preserving (VP). With UP, if two smaller unsigned types (e.g., unsigned char or unsigned short) are present in an expression, they are widened to unsigned int. That is, the widened value is also unsigned. The VP approach widens such values to signed int (provided they will fit), else it widens them to unsigned int.

While the same result arises from both approaches almost all of the time, there can be a problem in the following situation. Here, we have a binary operator with one operand of type unsigned char (or unsigned short) and an operand of int (or some narrower type). Consider that the program is running on a 16-bit twos-complement machine.

With UP rules, the unsigned char operand will be promoted to unsigned int, as will the int operand, with the result of the addition being an unsigned int. With VP rules, the unsigned char operand will be promoted to int, the type of the other operand, and the two will be added to produce a result of type int. This in itself is not a problem, but if the sum were used as the object of a right-shift (as shown), or as an operand to /, %, <, <=, >, or >=, different results are possible. For example:

UP rules produce:

VP rules produce:

UP rules cause zero bits to replace high-order bits if the expression has unsigned type, whereas the result is implementation-defined if the object is signed (due to arithmetic versus logical shift possibilities). In the second output example above, VP produced sign-bit propagation during the shift producing a quite different result.

Note that the above example only causes concern for certain values of  and , not in all cases. For example, if  were   and   were , the output would be:

UP rules produce:

VP rules produce:

In this case, the high bit (sign bit) of the sum is not set, so both UP and VP produce the same result.

Casts can be used in such mixed-mode arithmetic to ensure that the desired results are achieved regardless of the rule used. For example,

UP rules produce:

VP rules produce:

As demonstrated, the results of the two expressions containing the explicit casts are the same, even though the results are different without them.
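The difference between the two rule sets, and the way casts pin the result down, can be sketched as follows. The function names and operand values are hypothetical. Division is used instead of a right shift because shifting a negative int to the right is itself implementation-defined.

```c
/* Value-preserving (Standard C): uc promotes to int, so the sum is signed. */
int vp_half(unsigned char uc, int i)
{
    return (uc + i) / 2;
}

/* Simulates unsigned-preserving rules by casting both operands
   to unsigned int before the addition. */
unsigned int up_half(unsigned char uc, int i)
{
    return ((unsigned int)uc + (unsigned int)i) / 2;
}
```

With uc equal to 200 and i equal to -300, vp_half divides -100 and returns -50, whereas up_half divides a large positive value. Writing the casts explicitly gives the same answer under either rule set.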

Although Standard C uses VP rules, some widely used compilers prior to C89 used UP rules. Code that relies on UP rules may now give a different result. Specifically, a char, a short int or an int bit field (all of them signed or unsigned) or an enumerated type may be used wherever an int may be used. If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.

Note that with Standard C, the “normal” integral widening rules also apply to bit fields, and that bit fields can be signed as well as unsigned.

C99 added the type _Bool. C99 also allowed the addition of extended integer types.

Floating Types
Standard C states, “When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.”
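A minimal sketch of the truncation rule follows; the function name is hypothetical, and the arguments are assumed to be within the range of int.

```c
/* The fractional part is discarded, so the result moves toward zero
   for both positive and negative values. */
int truncate_to_int(double d)
{
    return (int)d;
}
```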

Standard C states, “When a value of integer type is converted to a real floating type, if the value being converted can be represented exactly in the new type, it is unchanged. If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner. If the value being converted is outside the range of values that can be represented, the behavior is undefined.”

Standard C requires that when a double is truncated to a float, or a long double is truncated to a double or float, if the value being converted cannot be represented, the behavior is undefined. If the value is in the range, but cannot be represented exactly, the truncated result is one of the two nearest representable values; it is implementation-defined as to which one of the two is chosen.

Note that by using function prototypes an implementation may allow a float to be passed by value to a function without its first being widened to a double. However, even though such narrow-type preservation is permitted by Standard C, it is not required.

Complex Types
C99 added the type _Complex and its corresponding conversion rules, and the header <complex.h>.

Usual Arithmetic Conversions
These were changed by Standard C to accommodate the VP rules described in Boolean, Characters, and Integers. Expressions may also be evaluated in a “wider” mode than is actually necessary, to permit more efficient use of the hardware. Expressions may also be evaluated in a “narrower” type, provided they give the same result as if they were done in the “wide” mode.

If operands of a binary operator have different arithmetic types, this results in the promotion of one or both operands. The conversion rules defined by Standard C are similar to those defined by K&R except that the VP rules are accommodated, some new types have been added, and narrow-type arithmetic is allowed without widening.

Pointers
C89 introduced the concept of a pointer to void, written as void *. Such a pointer can be converted to a pointer to an object of any type without using a cast. An object pointer can be converted to a pointer to void and back again without loss of information.
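A minimal sketch of the round trip follows; the function and variable names are hypothetical.

```c
/* Returns 1 if a pointer survives conversion to void * and back. */
int round_trip_ok(void)
{
    int value = 42;
    void *vp = &value;    /* object pointer to void *: no cast needed */
    int *ip = vp;         /* void * back to int *: no cast needed in C */
    return ip == &value && *ip == 42;
}
```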

C++ Consideration: Standard C++ requires a cast when assigning a void pointer to a pointer to an object type.

Object pointers need not all have the same size. For example, a pointer to char need not be the same size as a pointer to int.

While a pointer to an object of one type can be converted to a pointer to a different object type, if the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined.

While an int and data pointers often occupy the same size storage, the two types are quite distinct, and nothing portable can be said about interchanging the two except that zero may be assigned or compared to a pointer. Note that this null pointer concept does not require the null pointer value to be “all-bits-zero,” although it may be implemented as such. All that Standard C requires is that (void *) 0 represent an address that will never equal the address of an object or function. In an expression that compares a pointer with zero, the zero is converted to the type of the pointer before being compared with it.

Conversion of an integer to any pointer type results in implementation-defined behavior. Likewise, for conversion in the other direction except that if the result cannot be represented in the integer type, the behavior is undefined.

A function pointer is quite distinct from a data pointer, and no assumptions should be made about the relative sizes of the two. The format and size of a function pointer may be quite different from that of a data pointer.

Standard C requires an explicit cast when a pointer to a function returning one type is assigned to a pointer to a function returning a different type. Standard C is even more restrictive when copying function pointers because of function prototypes. Now, the attributes of a pointer to a function not only involve that function's return type but also its argument list. While a pointer to a function of one type may be converted to a pointer to a function of another type and back again, if a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.

= Expressions =

Order of Evaluation and Sequence Points
According to Standard C, the order in which expressions are evaluated is unspecified except for the function call operator, the logical OR operator ||, the logical AND operator &&, the comma operator, and the conditional operator ?:. While the precedence table defines operator precedence and associativity, these can be overridden by grouping parentheses. However, according to K&R, the commutative and associative binary operators (*, +, &, |, and ^) may be arbitrarily rearranged even if grouping parentheses are present. (Note that for &, |, and ^, the ordering is unimportant because the same result is always obtained.) However, Standard C requires grouping parentheses to be honored in all expressions.

With K&R (but not Standard C) rules, even though you may write the following:

the expression may be evaluated as

or even

This can cause overflow on intermediate values if the expression is evaluated one way versus another. To force a specific order of evaluation, break up the expression into multiple statements and use intermediate variables, as follows:
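A minimal sketch of the intermediate-variable technique follows. The function and variable names are hypothetical.

```c
/* Forces (b + c) to be computed first, then added to a.
   The separate statement leaves the compiler no freedom to regroup. */
long add_in_order(long a, long b, long c)
{
    long partial = b + c;
    return a + partial;
}
```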

These examples cause a problem only in “boundary” conditions for integer types and even then, only on some machines. For example, integer arithmetic on a twos-complement machine is usually “well behaved.” (However, some machines raise an interrupt when integer overflow occurs and presumably, this should be avoided.)

Recommendation: If you are concerned about the order of evaluation of expressions that associate and commute, break them into separate expressions such that you can control the order. Find out the properties of integer arithmetic overflow for your target systems and see if they affect such expressions.

The potential for overflow and loss of precision errors is much higher with floating-point operands where it is impossible to represent accurately certain real numbers in a finite space. Some mathematical laws that do not always hold true using finite representation are:
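One such law is the associativity of addition, (a + b) + c = a + (b + c). The following minimal sketch shows it failing; it assumes that double is an IEEE 754 64-bit format, and the function name is hypothetical.

```c
/* Returns 1 if (a + b) + c differs from a + (b + c) for the given operands.
   volatile keeps each intermediate result rounded to double. */
int addition_not_associative(double a, double b, double c)
{
    volatile double left_inner = a + b;
    volatile double right_inner = b + c;
    volatile double left = left_inner + c;
    volatile double right = a + right_inner;
    return left != right;
}
```

With a equal to 1e20, b equal to -1e20, and c equal to 1.0, the left grouping yields 1.0, while in the right grouping the 1.0 is lost when it is added to -1e20, and the result is 0.0.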

When expressions involve side effects, the order of evaluation may be important. For example,

Here, either operand may be evaluated before the other. While the value of the expression might be the same in either case, if the operands produce side effects, this may not be true.

The order in which side effects take place is unspecified. For example, the following are expressions with unpredictable outcomes:

In each line above, which of the subexpressions containing the modified variable is evaluated first is unspecified.

Recommendation: Even if you can determine how your compiler evaluates expressions that contain side effects, don't rely on this being true for future releases of the same product. It may even vary for the same compiler given different circumstances. For example, by changing the source code in other, possibly unrelated, ways, you may change the optimizer's view of the world such that it generates different code for the same expression. The compiler writer is under no obligation whatsoever to support predictable behavior because the behavior is allowed to be undefined.

C89 introduced the notion of full expressions and sequence points, as follows: “A full expression is an expression that is not part of another expression, nor part of a declarator or abstract declarator. There is also an implicit full expression in which the non-constant size expressions for a variably modified type are evaluated; within that full expression, the evaluation of different size expressions are unsequenced with respect to one another. There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.”

Recommendation: Make sure you can identify all the sequence points in your code.

The results of bitwise operations (using ~, <<, >>, &, ^, and |) on signed types are inherently implementation-defined.

Recommendation: As the outcome of bitwise operations depends on the representation of integral types, you should determine the nature of shift and bit-masking operations, particularly for signed types.

The properties of floating-point arithmetic are implementation-defined. Bear in mind, too, that there may be differences between results obtained with software emulation and hardware execution. Also, a machine may have several different floating-point formats, any one of which might be able to be selected via a compile-time switch.

Recommendation: When using floating-point data types in expressions, identify the size, range, and representation of each such type. Also, determine if there are differences between floating-point emulation in software and the results produced by floating-point hardware. See if you can determine whether floating-point hardware is available at run-time.

Regarding floating-point expression evaluation, C99 added the following: “A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.”

If an arithmetic operation is invalid (e.g., division by zero) or produces a result that cannot be represented in the space provided (e.g., overflow or underflow), the result is undefined.

Primary Expressions
A parenthesized expression is a primary expression. C89 required support for at least 32 nesting levels of parenthesized expressions within a full expression. C99 increased that to 63.

A generic selection operation is a primary expression. This operator was introduced by C11, and involves the keyword _Generic.

Array Subscripting
The format of an array reference is a[b] where a and b are expressions. One of these expressions must have type pointer to some type (other than void), while the other expression must be of integral type. Neither K&R nor Standard C requires a to be the pointer expression and b to be the integer expression, even though that is almost always the way a subscript expression is written. Specifically, a[b] can also be written as b[a], which may be surprising to many people, including C veterans.
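The interchangeability of the two operands can be sketched as follows; the function and array names are hypothetical.

```c
/* Returns 1 if both spellings of the subscript select the same element. */
int subscript_forms_agree(void)
{
    int table[3] = { 10, 20, 30 };
    return table[1] == 1[table];   /* both mean *(table + 1) */
}
```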

C does not require that the integral expression in a subscript have an unsigned value—it may be signed. For example,

Recommendation: For any given object A defined to be an array, never subscript A with a value other than 0 through n-1, where n is the maximum number of elements defined to be in A.

Recommendation: It is OK to use a negative subscript with a pointer expression provided that the expression maps into a predictable place.

The following example demonstrates the technique of having arrays begin at any arbitrary subscript. (Note though that this technique is not supported by Standard C and might not work on some implementations—those running on segmented memory architectures may cause it to fail because not all pointer arithmetic behaves in a “wraparound” manner.)

By making  point to ,   has subscripts 1 to 5. It is irrelevant that no space has been allocated for the element  because we never try to access that element. All we have done is invent a pointer expression that points to the location where  would be if it existed. Then when we have an expression, which equals   or  , it gives  ,  , and finally  , which is the same as. That is,  and   are interchangeable, and   through   map into the array .

The use of the pointer  takes the same idea further and allows   to be used like an array with subscripts ranging from 1983 to 1987. The same idea would allow an array to have subscripts -1004 to -1000, simply by initializing a pointer to.

This works on some “well-behaved” machines having a linear address space. Here, address arithmetic is unsigned so that subtracting 10 from address 6 gives, not -4, but a large unsigned address. That is, the address arithmetic “wraps around” at both the high and the low end. While this may not be the case on every conceivable machine, it certainly works on many common ones.

Standard C says that if the result of an arithmetic operation on a pointer points inside an array or to the (non-existent) element one beyond the end, it’s OK. Otherwise, the behavior is undefined; that is, subtracting an offset from a pointer and then adding the same offset back need not result in the original pointer!

It is possible to portably calculate the size of each dimension of an array by knowing only the number of dimensions.

is an array of four elements, so  divided by   is 4. Note that the type of  is not   , it is. That is,  is a pointer to an array of four  s and   is.

Similarly,  is an array of three elements each of which is an array of four  s. And finally,   is an array of two elements, each of which is an array of three elements, each of which is an array of four  s.
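The calculation can be sketched for a three-dimensional array as follows. The array name cube, its dimensions, and the function names are hypothetical.

```c
#include <stddef.h>

static int cube[2][3][4];   /* 2 x 3 x 4 array of int */

/* Each dimension is the size of one level divided by the size of
   an element at the next level down. */
size_t dim1(void) { return sizeof(cube) / sizeof(cube[0]); }
size_t dim2(void) { return sizeof(cube[0]) / sizeof(cube[0][0]); }
size_t dim3(void) { return sizeof(cube[0][0]) / sizeof(cube[0][0][0]); }
```

Only the number of dimensions is written into the expressions; the bounds themselves can change without the functions being edited.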

Function Calls
C99 required that a function declaration be in scope for each call to that function.

If a function call has no function prototype declarator in scope, and the number of arguments, or their types, after the default conversions do not match those of the formal parameters, the behavior is undefined.

If a function that accepts a variable number of arguments is called, and no prototype declarator with the ellipsis notation is in scope, the behavior is undefined.

Recommendation: Whenever you use variable length argument lists in functions, document it thoroughly and use the <stdarg.h> (or <varargs.h>) header as appropriate. Always declare such functions using a prototype with the appropriate ellipsis notation before you call them.

The order in which function arguments are evaluated is unspecified. For example,

contains an unsafe argument list; either argument may be evaluated before the other.

Now consider an extension of that example, which involves an array of function pointers:

Not only is the order in which the arguments are evaluated unspecified, so too is the point at which the expression designating the function is evaluated relative to them. Specifically, we can’t say for sure which element of the array is used! What Standard C does guarantee is that there is a sequence point at the point of the function call; that is, after all three expressions have been evaluated.

Recommendation: Never rely on the order of evaluation of the arguments in a function call or the expression designating the function being called.

A function that has not been explicitly declared is treated as though it were declared with class extern and as returning type int.

Recommendation: When porting, take care that the headers in the target environment contain the necessary function declarations; otherwise, function calls will be interpreted as returning ints, whereas they would not be if a function prototype were in scope. For example, Standard C declares atof and atoi (and the malloc family) in <stdlib.h>.

Problems can occur when porting code that uses integral constants as function arguments. For example,

This program works properly on a machine with 32-bit ints. But on a 16-bit machine, the actual argument passed will be a long int, while the called function will be expecting an int, two quite different types.

Recommendation: Take care when passing integral constants as function arguments because the type of such a constant depends on its magnitude and the limits of the current implementation. This kind of problem may be difficult to find if the constant is hidden in a macro (such as ). Use casts to make sure argument types match, or call functions in the presence of a prototype.

Standard C permits structures and unions to be passed by value. However, the maximum size of a structure or union that may be passed by value is implementation-defined.

C89 required that an implementation allow at least 31 arguments in a function call. C99 increased that to 127. K&R placed no minimum limit.

Standard C permits pointers to functions to be used to invoke functions using either (*fp)() or fp(). The latter format makes the call look like a normal function call, although presumably it will cause less sophisticated source cross-reference utilities to assume that fp is a function name rather than a function pointer.
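Both call forms can be sketched as follows; the names twice, fp, and call_both_ways are hypothetical.

```c
static int twice(int x)
{
    return 2 * x;
}

/* Calls the same function through a pointer using both notations. */
int call_both_ways(void)
{
    int (*fp)(int) = twice;
    return (*fp)(4) + fp(5);   /* explicit dereference, then call-like form */
}
```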

Recommendation: When invoking a function via a pointer, use the format (*fp)() rather than fp(), because the latter is a Standard C invention.

Recommendation: A function prototype can be used to alter the argument widening and passing mechanisms used when a function is called. Make sure that the same prototype is in scope for all calls as well as the definition.

Recommendation: Standard C requires that a strictly conforming program always have a prototype in scope (with a trailing ...) when calling a function with a variable number of arguments. Therefore, when using the printf and scanf family of routines, always #include <stdio.h>. If you don't, the behavior is undefined.

While C supports recursion, it is unspecified as to how many levels any function can recurse before stack (or other) resources might be exhausted.

Structure and Union Members
Due to the addition in C89 of structure (and union) argument passing and returning by value and structure (and union) assignment, structure (and union) expressions can exist.

K&R stated that in p->m, p may be either a pointer to a structure (or union) or an absolute machine address. Standard C requires that each structure and union have its own member name space. This requires that the first operand of the . or -> operators must have type structure (or union) or pointer to structure (or union), respectively.

On some machines, the hardware I/O page is mapped into physical memory, so device registers look just like regular memory to any task that can map to this area. To access an offset—a structure member named, for example—from a specific physical address, previously you could use an expression of the form

With each structure and union now having its own member name space, the  member can no longer be accessed in this way. Instead, the physical address must be converted to a structure pointer, so the offset reference is unambiguous, as follows:

When a union is accessed using a member other than that used to store the immediately previous value, the result is implementation-defined. No assumptions can be made about the degree of overlap of members in a union unless a union contains several structures, each of which has the same initial member sequence. In this special case, members in the common sequence of any of the structures can be inspected provided that the union currently contains one of those structures. For example,

If the union currently contains a structure of type  or , the particular type being stored can reliably be determined by inspecting either   or. Both members are guaranteed to map to the same area.
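The common-initial-sequence guarantee can be sketched as follows. All of the type, member, and function names here are hypothetical.

```c
/* Two record types that share a common initial member, kind. */
struct circle { int kind; double radius; };
struct square { int kind; double side; };

union shape {
    struct circle c;
    struct square s;
};

enum { KIND_CIRCLE = 1, KIND_SQUARE = 2 };

/* Inspects the common initial member through one of the structures. */
int stored_kind(const union shape *u)
{
    return u->c.kind;   /* u->s.kind would give the same answer */
}

/* Stores a square, then identifies it via the circle's kind member. */
int demo_kind(void)
{
    union shape u;
    u.s.kind = KIND_SQUARE;
    u.s.side = 2.0;
    return stored_kind(&u);
}
```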

Standard C says, “Accessing a member of an atomic structure or union object results in undefined behavior.”

Postfix Increment and Decrement Operators
Some (very old) implementations considered the result of post-increment and post-decrement operator expressions to be modifiable lvalues. This is not recognized by Standard C. Therefore, an expression such as (i++)++ should generate an error.

Compound Literals
C99 added support for compound literals.

C++ Consideration: Standard C++ does not support compound literals.

Prefix Increment and Decrement Operators
Some (very old) implementations considered the result of pre-increment and pre-decrement operator expressions to be modifiable lvalues. This is not recognized by Standard C. Therefore, an expression such as ++(++i) should generate an error.

Address and Indirection Operators
If an invalid array reference (one with a subscript “out of range”), null pointer dereference, or dereference to an object declared with automatic storage duration in a terminated block occurs or if allocated space that has been freed is accessed, the behavior is undefined. Note that depending on how the null pointer is implemented, dereferencing it might cause catastrophic results. For example, on one implementation, an attempt to access a location within the first 512 bytes of an image generates a fatal “access violation.”

In Standard C, the use of the &amp; operator with function names is superfluous.

When the passing of structures and unions by value was added to the language, using &amp; with a structure or union name was no longer superfluous: its absence means “value” and its presence means “pointer to.”

Some implementations accept &amp;bit-field and return the address of the object in which the bit-field is packed. This is not permitted by K&amp;R nor supported by Standard C.

Some implementations allow &amp;register-variable, in which case the register class is ignored. This is not permitted by K&amp;R nor supported by Standard C.

Some implementations allow you to take the address of a constant expression under special circumstances, such as in function argument lists. This is not permitted by K&amp;R nor supported by Standard C.

Dereferencing a pointer may cause a fatal run-time error if the pointer was cast from some other pointer type and alignment criteria were violated. For example, consider a 16-bit machine that requires all scalar objects other than char be aligned on word boundaries. As such, if you cast a char pointer containing an odd address to an int pointer, and you dereference the int pointer, a fatal “odd address” trap error will result.

Unary Arithmetic Operators
The unary plus operator was a C89 invention.

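Note carefully that when using twos-complement representation for negative integers, negating INT_MIN typically results, quietly, in the same value, INT_MIN; there simply is no positive equivalent of that value! (Likewise for LONG_MIN and LLONG_MIN.) Strictly speaking, the overflow makes the behavior undefined, so no particular result can be relied on.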

The sizeof Operator
Until C99, the result of sizeof was a compile-time constant. However, starting with C99, if the operand is a variable-length array, the operand is evaluated at runtime.

What is the type of the result produced by sizeof? It would seem reasonable that one could use sizeof to find the size of the largest object an implementation supports, which could be an array of char that is very large, or perhaps an array with a very large number of large structs, for example. Certainly, it seems reasonable that sizeof produce an unsigned integer result, but which?

In very old implementations the type of sizeof was int (which is signed). C89 stated, “its type (an unsigned integral type) is size_t defined in the &lt;stddef.h&gt; header.” (See Common Definitions for more information.)

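So how then to display the result using printf? Consider four cases, where type is some arbitrary data type: (1) cast sizeof(type) to unsigned int and print it with %u; (2) cast it to unsigned long and print it with %lu; (3) cast it to unsigned long long and print it with %llu; and (4) pass it uncast and print it with %zu.

Case 1 is portable for sizes up to UINT_MAX, which is at least 65535; case 2 is portable for sizes up to ULONG_MAX, which is at least 4294967295; case 3 is portable for sizes up to ULLONG_MAX, which is at least 18446744073709551615; and case 4 is maximally portable, provided your implementation supports the length modifier z (introduced in C99).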
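The four cases might be written as in the following sketch. The typedef name type (standing in for the arbitrary data type) and the helper name format_size are assumptions made for illustration. The sketch uses snprintf (itself a C99 addition) rather than printf so the formatted text can be examined; the conversion specifications are the same for both.

```c
#include <stdio.h>

typedef double type;    /* hypothetical stand-in for an arbitrary type */

/* Formats sizeof(type) into buf using the conversion for the given case. */
int format_size(char *buf, size_t len, int which)
{
    switch (which) {
    case 1: return snprintf(buf, len, "%u", (unsigned int)sizeof(type));
    case 2: return snprintf(buf, len, "%lu", (unsigned long)sizeof(type));
    case 3: return snprintf(buf, len, "%llu", (unsigned long long)sizeof(type));
    case 4: return snprintf(buf, len, "%zu", sizeof(type));
    }
    return -1;          /* unknown case number */
}
```

In each of the first three cases the cast is what makes the argument match its conversion specification; without it, the type actually passed would depend on how the implementation defines size_t.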

Recommendation: Always use a prototype when calling functions that expect size_t type arguments so that the arguments you supply can be implicitly converted by the prototype if necessary. However, because the function prototype for printf contains an ellipsis for the trailing arguments, no implicit conversion can be specified for them.

The _Alignof Operator
This was added by C11, which stated, “its type (an unsigned integral type) is size_t defined in the &lt;stddef.h&gt; header.”

The header &lt;stdalign.h&gt; contains a macro called alignof that expands to _Alignof.

C++ Consideration: The equivalent (but different) keyword added in C++11 is alignof, which Standard C defines as a macro in &lt;stdalign.h&gt;.

Cast Operators
The result of casting a pointer to an integer or vice versa (except for the value zero) is implementation-defined as is the result of casting one pointer type to a pointer type of more strict alignment.

For a detailed discussion on the conversions allowed between dissimilar data pointers and dissimilar function pointers, refer to Pointers.

C11 added the restriction that a pointer type cannot be converted to any floating type, and vice versa.

Explicit casting may be necessary to get the right answer because of “unsigned preserving” versus “value preserving” conversion rules (see Boolean, Characters, and Integers).

A number of the Standard C library functions return void * values, and this is reflected in their corresponding prototypes. As void * is compatible with all other data pointer types, you will not need to explicitly cast the value returned.

C++ Consideration: Converting from void * to a data pointer type requires an explicit cast.

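Using an elaborate series of casts, it is possible to write a “fairly” portable expression that will produce the offset (in bytes) of a specific member of a structure. However, this might not work on some implementations, particularly those running on word architectures. A typical approach casts 0 to a pointer to the structure type, takes the address of the member through that pointer, and converts the resulting address to an integer.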

Recommendation: Standard C provides the macro offsetof (in &lt;stddef.h&gt;) to portably find the offset of a member within a structure. This macro should be used instead of any home-grown mechanism, where possible.
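A minimal sketch of the recommended approach follows. The structure record and the two helper functions are hypothetical names used only for illustration.

```c
#include <stddef.h>

struct record {          /* hypothetical structure */
    char tag;
    int count;
    double value;
};

/* Byte offsets of members, obtained portably via offsetof. */
size_t count_offset(void) { return offsetof(struct record, count); }
size_t value_offset(void) { return offsetof(struct record, value); }
```

The actual offsets depend on the implementation's padding and alignment rules, which is exactly why they should be computed rather than assumed.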

Do not assume that zero cast to a pointer type results in a value that has all-bits zero. However, a pointer with the value 0 (produced either by assignment or casting) must compare equal to zero.

Multiplicative Operators
According to Standard C, integer and floating-point division can result in undefined behavior. C89 introduced some implementation-defined behavior in the integer-division case, but that was removed in C99.

Additive Operators
If an integer is added to or subtracted from a pointer that is not pointing to a member of an array object (or to the non-existent element one beyond the end), the result is undefined. Standard C permits an integer to be subtracted from a pointer pointing to the element immediately beyond the last element in an array, provided the resultant address maps into the same array.

The length of the integer required to hold the difference between two pointers to members of the same array is implementation-defined. Standard C provides the type synonym ptrdiff_t to represent the type of such a value. This signed integral type is defined in &lt;stddef.h&gt;.
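For instance, pointer subtraction within one array might be wrapped as follows; element_distance is a hypothetical helper name.

```c
#include <stddef.h>

/* Number of elements between two pointers into the same array.
   Both pointers must point into (or one past the end of) that array. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    return last - first;
}
```

When printing such a value with printf, the C99 length modifier t (as in %td) matches ptrdiff_t.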

Bitwise Shift Operators
The result of a shift by a negative number or by an amount greater than or equal to the width in bits of the expression being shifted is undefined.

For a left shift, if the left-operand is signed and has a nonnegative value, and left-operand × 2<sup>right-operand</sup> is not representable in the result type, the behavior is undefined.

For a right shift, if the left-operand is signed and has a negative value, the resulting value is implementation-defined.
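One defensive approach is to shift only unsigned values and to check the shift count first. The sketch below assumes unsigned int has no padding bits, so that its width equals sizeof(unsigned int) * CHAR_BIT; the names UINT_WIDTH_GUESS and shift_left_checked are hypothetical, and returning 0 for an oversized count is a design choice, not something the Standard prescribes.

```c
#include <limits.h>

/* Width in bits of unsigned int, assuming it has no padding bits. */
#define UINT_WIDTH_GUESS (sizeof(unsigned int) * CHAR_BIT)

unsigned int shift_left_checked(unsigned int value, unsigned int count)
{
    if (count >= UINT_WIDTH_GUESS)
        return 0u;              /* chosen result for an oversized count */
    return value << count;      /* well defined: unsigned, count in range */
}
```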

The widening rules of unsigned preserving (UP) and value preserving (VP) can cause different results with the &gt;&gt; operator. With UP rules, right-shifting the affected expression by one place is the same as dividing by 2, whereas with VP, it might not be, because the type of the expression to be shifted is signed.

Relational Operators
If you compare pointers that are not pointing to the same aggregate, the result is undefined. “Same aggregate” means members of the same structure or elements in the same array. Notwithstanding this, Standard C endorses the widespread practice of allowing a pointer to be incremented one place beyond an object.

The widening rules of unsigned preserving and value preserving can cause different results with the &lt;, &lt;=, &gt;, and &gt;= operators.

Equality Operators
A pointer may be compared to 0. However, the behavior is implementation-defined when a nonzero integral value is compared to a pointer.

Structures and unions may not be compared except by member. Depending on the presence of, and contents of, holes, structures might be able to be compared for equality using the library function memcmp.

Take care when using these operators with floating-point operands because most floating-point values can be stored only approximately.
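A common alternative to testing floating-point values for exact equality is to compare them within a relative tolerance. The following is only a sketch; the function name nearly_equal and the choice of a relative (rather than absolute) tolerance are assumptions, and a suitable tolerance depends on the application.

```c
/* Returns nonzero if a and b differ by no more than rel_tol times
   the larger of their magnitudes. */
int nearly_equal(double a, double b, double rel_tol)
{
    double diff  = a > b ? a - b : b - a;
    double abs_a = a < 0 ? -a : a;
    double abs_b = b < 0 ? -b : b;
    double scale = abs_a > abs_b ? abs_a : abs_b;

    return diff <= rel_tol * scale;
}
```

For example, nearly_equal(0.1 + 0.2, 0.3, 1e-9) is true even on implementations where the two operands are not exactly equal.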

Bitwise AND Operator
By their very nature, the values of bit masks might depend on the size/representation of integers.

Bitwise Exclusive OR Operator
By their very nature, the values of bit masks might depend on the size/representation of integers.

Bitwise Inclusive OR Operator
By their very nature, the values of bit masks might depend on the size/representation of integers.

Logical AND Operator
Standard C defines a sequence point between the evaluations of the first and second operands.

Logical OR Operator
Standard C defines a sequence point between the evaluations of the first and second operands.

Conditional Operator
Standard C defines a sequence point between the evaluation of the first operand and the evaluation of the second or third operand (whichever is evaluated).

Simple Assignment
Assigning a zero-valued integer constant expression to any pointer type is portable, but assigning any other arithmetic value is not.

The effect of assigning one pointer type to a more strictly aligned pointer type is implementation-defined.

Standard C requires an explicit cast to assign a pointer of one object type to a pointer of another object type. (A cast is not needed when assigning to or from a void pointer.)

Standard C permits a structure (or union) to be assigned only to a like typed structure (or union).

If an object is assigned to an overlapping object, the result is undefined. (This might be done with different members of a union, for example.) To assign one member of a union to another, go through a temporary variable.
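Going through a temporary might look like the following sketch, in which the union number and the function widen_in_place are hypothetical names.

```c
union number {          /* hypothetical union with overlapping members */
    int i;
    long l;
};

/* Replaces the int currently stored in the union with the same value as a long. */
void widen_in_place(union number *n)
{
    int temp = n->i;    /* copy the value out of the union first */
    n->l = temp;        /* then store it into the overlapping member */
}
```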

Compound Assignment
Assignment operators of the (very old) form =op (such as =+ and =-) are not supported by Standard C. (K&amp;R hinted that they were already archaic back in 1978.)

The result of an expression such as x[i] = x[i++] + 10 is unpredictable because the order of evaluation of the operands is not specified. This can be resolved by using a compound assignment operator, as in x[i++] += 10, because you are guaranteed that the left-hand operand is evaluated only once.
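The compound-assignment form can be sketched as follows; bump_and_advance and its parameter names are hypothetical.

```c
/* Adds delta to the element at *index and then advances *index. */
void bump_and_advance(int *array, int *index, int delta)
{
    /* The left-hand operand, including its side effect, is evaluated once. */
    array[(*index)++] += delta;
}
```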

Comma Operator
Standard C defines a sequence point between the evaluations of the first and second operands.

Constant Expressions
Static initializer expressions are permitted to be evaluated during program startup rather than at compile-time.

The translation environment must use at least as much precision as the execution environment. If it uses more, a static value initialized at compile-time may have a different value than if it were initialized during startup on the target machine.

C89 introduced,    , and   integral constants. C99 introduced signed/unsigned    integral constants.

C99 added support for floating-point constants with binary exponents.

Standard C permits an implementation to support forms of constant expression beyond those defined by the standard (to accommodate other/extended types). However, compilers differ in their treatment of those constant expressions: some are treated as integer constant expressions.

= Declarations =

Perhaps the biggest impact C89 had on the C language was in the area of declarations. New type-related keywords were added along with terminology to classify them. The most significant aspect, from the programmer's viewpoint, was the adaptation of function prototypes (that is, new-style function declarations) from C++.

Ordering of Declaration Elements
A declaration may contain one or more of the following elements: storage-class specifier, type specifier, type qualifier, function specifier, and alignment specifier. Standard C permits these to be in any order; however, it does require any identifier list to come at the right end. As such, a declaration such as static const unsigned long int x = 123; can be rewritten as int long unsigned const static x = 123; or in any other combination, so long as the declarator and its initializer come at the end. Similarly, a declaration such as typedef unsigned long int uli; can be rewritten as int long unsigned typedef uli;.

Some older compilers might require a specific order. It is debatable whether K&amp;R permitted arbitrary ordering of type specifiers. The grammar on page 192 of K&amp;R indicates that they are supported, but on page 193, it states, “the following [type specifier] combinations are acceptable: short int, long int, unsigned int, and long float.” It is unclear whether this should be taken as explicitly disallowing int short, int long, etc.

Position within a Block
Prior to C99, at block scope, all declarations were required to precede all statements. However, that restriction was lifted in C99, which allowed them to be interspersed. C++ also allows this.

The auto Storage Class
It is rare to see auto actually used in code, as Standard C local variables without an explicit storage class default to auto storage class.

The method used to allocate, and the amount of storage available for, automatic variables is up to the implementation. Implementations that use a stack (or other) approach may place limits on the amount of space available for auto objects. For example, 16-bit machines may limit the stack to 64 KB or, if the entire address space is 64 KB, the sum of code, static data, and stack might be 64 KB. In that case, as the size of the code or static data grows, the stack size decreases, perhaps to the point where sufficient auto space cannot be allocated.

Some implementations check for the possibility of stack overflow when each function is entered. That is, they check the amount of stack space available before allocating that required for the function. And if insufficient space is available, they terminate the program. Some implementations actually call a function to perform the check, in which case, each time you call one of your functions having automatic class variables, you are implicitly calling another function as well.

Recommendation: On implementations that “probe the stack” each time a function is called, there might be a compile-time switch allowing such checking to be disabled. While the disabling of such checking can increase the amount of stack available, possibly to the extent of allowing a program to run when it wouldn't otherwise, it is strongly suggested you not do so during testing.

Consider a function that declares four automatic variables, one of which is an array of 10 elements. The location in memory of these four variables relative to each other is unspecified and can change between compilations on the same or different systems. However, we are guaranteed that the 10 elements of the array are contiguous, with addresses in ascending order.

Recommendation: Do not rely on an implementation to have a particular auto space allocation scheme. In particular, don't rely on auto variables being allocated space in exactly the same order in which they are declared.

C++ Consideration: C++11 discontinued support for auto as a storage class specifier, and gave it new semantics, as a type specifier.

The register Storage Class
The register storage-class is a hint to the implementation to place the object where it can be accessed as “fast as possible.” Such a location is typically a machine register. The number of register objects that can actually be placed in registers and the set of supported types are implementation-defined. An object with storage class register that cannot be stored in a register, for whatever reason, is treated as though it had storage class auto. Standard C permits any data declaration to have this storage class. It also allows this storage class to be applied to function parameters.

K&amp;R stated “... only variables of certain types will be stored in registers; on the PDP-11, they are int, char and pointer.”

Recommendation: Given the advances in compiler optimization technology, the value of the register storage class on hosted implementations has largely evaporated. (In fact, this was predicted in K&amp;R, which stated, “... future improvements in code generation may render [register declarations] unnecessary.”) It is, therefore, suggested you not use them at all, unless you can prove they are providing some value for one or more of your target implementations.

Standard C does not permit an implementation to widen the allocation space for a variable with the register storage class. That is, a register char cannot be treated as if it were register int. It must behave in all ways as a char, even if it is stored in a register whose size is wider than a char. (Some implementations can actually store more than one register char object in the same register.)

C++ Consideration: While support for this storage class existed through C++14, its use was deprecated. In C++17, the keyword register is unused, but it is reserved for future use (presumably with different semantics).

The static Storage Class
A problem can occur when trying to forward-reference a static function from inside another function.

Suppose a function contains a block scope declaration in which another function is declared to be static. This allows the first function to call the static function rather than any extern function by the same name. Standard C does not permit such declarations. It does, however, allow function declarations with file scope to have storage class static.
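The permitted file-scope form can be sketched as follows; the function names helper and twice_plus_one are hypothetical.

```c
static int helper(int);     /* file-scope declaration with internal linkage */

int twice_plus_one(int x)
{
    return helper(x) + 1;   /* calls the static function defined below */
}

static int helper(int x)
{
    return 2 * x;
}
```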

Recommendation: Do not use the static storage class on a block scope function declaration even if your compiler allows it.

The _Thread_local Storage Class
This was added by C11. (See &lt;threads.h&gt;.)

C++ Consideration: The equivalent (but different) keyword added in C++11 is thread_local, which Standard C defines as a macro in &lt;threads.h&gt;.

Here’s how to determine if your compiler supports thread-local storage duration:
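One possible check is sketched below. It assumes that a C11-or-later compiler that does not define the conditionally defined macro __STDC_NO_THREADS__ also supports _Thread_local; treating that macro as a proxy is an assumption of this sketch. The names HAVE_C11_THREAD_LOCAL, per_thread_counter, and bump_counter are hypothetical.

```c
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_THREADS__)
#define HAVE_C11_THREAD_LOCAL 1
_Thread_local int per_thread_counter = 0;   /* one instance per thread */
#else
#define HAVE_C11_THREAD_LOCAL 0
int per_thread_counter = 0;                 /* fallback: shared by all threads */
#endif

/* Increments the counter belonging to the calling thread (or the
   single shared counter when thread-local storage is unavailable). */
int bump_counter(void)
{
    return ++per_thread_counter;
}
```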

Type Specifiers
C89 added the following keywords for use in type specifiers: enum, signed, and void. These gave rise to the following base-type declarations:

C89 also added support for the following new type declarations (some implementations had already supported    and    ):

C99 added support for the following:

Standard C states that whether a plain char (one without the signed or unsigned modifier) is treated as signed or unsigned is implementation-defined.

While K&amp;R permitted long float to be a synonym for double, this practice is not supported by Standard C.

Prior to C99, a type specifier could be omitted with int being assumed; for example, in a file-scope declaration such as static i;.

C99 prohibited this.

C99 added support for a Boolean type via the type specifier _Bool. (See &lt;stdbool.h&gt;, which includes a workaround if this header is not available.)

C++ Consideration: The equivalent (but different) keyword in Standard C++ is bool, which Standard C defines as a macro in &lt;stdbool.h&gt;.

C99 added the type specifier _Complex, which gave rise to the types float _Complex, double _Complex, and long double _Complex. (See &lt;complex.h&gt;.)

C11 added the type specifier _Atomic, but made it optional; see the conditionally defined macro __STDC_NO_ATOMICS__ in Conditionally Defined Standard Macros. (See &lt;stdatomic.h&gt;.)

Representation, Size, and Alignment
The macros in &lt;limits.h&gt; and &lt;float.h&gt; define the minimum range and precision for the arithmetic types. Standard C requires the following:

<ul> <li>

_Bool – large enough to store the values 0 and 1

</li> <li>

char – at least 8 bits

</li> <li>

short int – at least 16 bits

</li> <li>

int – at least 16 bits

</li> <li>

long int – at least 32 bits

</li> <li>

long long int – at least 64 bits

</li> <li>

The range and precision of float must be less than or equal to that for double, which in turn, must be less than or equal to that for long double. All three types could have the same size and representation, all different, or some overlap thereof.

</li></ul>

For integer values, a conforming implementation is permitted to use ones-complement, twos-complement, or signed magnitude representation. The minimum limits for signed integer types allow ones-complement. Although an implementation having 32-bit ints and using twos-complement can conform by defining INT_MIN to have the value -2147483647, it is not unreasonable to expect it would instead use a value of -2147483648, to accurately reflect the type’s twos-complement nature.

Type float is often represented using 32-bit single precision, type double as 64-bit double precision, and type long double also as 64-bit double precision. However, on systems having a separate, extended precision, long double may be mapped to 80 or 128 bits.

Note carefully that it may be unreasonable to expect identical results from floating-point computations from a program even when it runs on multiple processors having the same size and representation of floating-point types (such as occurs with multiple IEEE-based systems). For example, on early Intel floating-point processors, all calculations were done in 80-bit extended mode, which can result in different values than if two doubles were added using strict (64-bit) double mode. Rounding modes also come into play.

Recommendation: With regard to floating-point calculations, set reasonable expectations for the reproducibility of results across different floating-point hardware and software libraries.

Standard C does not require that sizeof be recognized as an operator in preprocessor #if arithmetic expressions.

Conditionally compiling based on machine word-size is common. A directive that tests sizeof(int) against 2, for example, assumes, perhaps, that if it isn't running on a 16-bit system, it's on a 32-bit machine. To achieve the same result, you must now use something like a comparison of INT_MAX (from &lt;limits.h&gt;) against the expected limit.
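A sketch of such a test, using only macros from &lt;limits.h&gt;, follows. The names INT_BITS_GUESS and int_width_guess are hypothetical, and the third branch is a reminder that the 16-bit and 32-bit cases are not the only possibilities.

```c
#include <limits.h>

#if INT_MAX == 32767
#define INT_BITS_GUESS 16
#elif INT_MAX == 2147483647
#define INT_BITS_GUESS 32
#else
#define INT_BITS_GUESS 0    /* some other width; handle explicitly */
#endif

/* Reports the width selected at compile time (0 if neither 16 nor 32). */
int int_width_guess(void)
{
    return INT_BITS_GUESS;
}
```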

The sizeof compile-time operator reports the number of chars-worth of memory that are occupied by an object of some given data type. If we multiply this by the &lt;limits.h&gt; macro CHAR_BIT we find the number of bits allocated. However, not all bits allocated for an object need be used to represent that object’s value! Following are some examples that demonstrate this:

Case 1: Early machines from Cray Research used a 64-bit, word-addressable architecture. When a short int was declared, although 64 bits were allocated (sizeof(short) resulted in 8), only 24 or 32 bits were actually used to represent the short’s value.

Case 2: Intel floating-point processors support 32-bit single precision, 64-bit double precision, and 80-bit extended precision. As such, a compiler targeting that machine might map float, double, and long double, respectively, to these three representations. If so, one might assume that sizeof(long double) would be 10, and that might be true. However, for performance reasons, a compiler might choose to align such objects on 32-bit boundaries, resulting in 12 bytes being allocated, with two of them going unused.

Case 3: During the deliberations of C89, the issue arose as to whether integer types required a binary representation, and the committee decided that they did. As such, the description was written something like “… each physically adjacent bit represents the next-highest power of two.” However, a committee member reported that his company had a 16-bit processor on which when two 16-bit words were used to represent a long int, the high bit of the low word was not used. Essentially, there was a 1-bit hole in the middle, and shifting left or right took that into account! (Even though 31 bits is insufficient to represent a long int in Standard C, the implementation in question was a viable one for applications targeting its intended market.)

Case 4: For alignment purposes, holes (unused bits, that is) might occur in a structure between fields or after the final field, and inside containers of bitfields.

Recommendation: Do not assume or hard code the size of any object type; obtain that size using sizeof, and use the macros in &lt;limits.h&gt;, &lt;float.h&gt;, and &lt;stdint.h&gt;, as appropriate.

Recommendation: Do not assume that the unused bits allocated to an object have a specific/predictable value.

Although it might be common for all data and function pointer types to have the same size and representation, which might also be that of an integer type, that is not required by Standard C. Some machines used addresses that look like signed integers, in which case, address zero is in the middle of the address space. (On such machines, the null pointer likely will not have a value of “all-bits-zero.”) On some segmented-memory architectures, both near (16-bit) and far (32-bit) pointers might be supported. What Standard C requires is that any data pointer value can be converted to the type void * and back without loss; it makes no such guarantee for function pointers, which can only be converted reliably to other function pointer types and back.

Recommendation: Unless you have a very specialized application, assume that every pointer type has a unique representation, which is different to that of any integer type, and don’t assume that the null value for any pointer type has a value of “all-bits-zero.”

Some programs inspect and perhaps manipulate the bits in an object by creating a union of it with some integer type. Obviously, this relies on implementation-defined behavior. According to Standard C, with respect to such type punning, “One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence, and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible.”

Structure and Union Specifiers
While K&amp;R did not restrict the types that could be used with bit-fields, C89 allowed int, unsigned int, and signed int only, and stated, “Whether the high-order bit position of a “plain” int bit-field is treated as a sign bit is implementation-defined.”

C99 states, “A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.” C11 added, “It is implementation-defined whether atomic types are permitted.”

K&amp;R required that consecutive bit-fields be packed into machine integers and that they not span word boundaries. Standard C declares that the container object in which bit-fields are packed is implementation-defined. The same is true for whether bit-fields span container boundaries. Standard C lets the order of allocation of bit-fields within a container be implementation-defined.

Standard C permits bit-fields to exist in unions without their first being declared as part of a structure.
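A union whose members are themselves bit-fields might look like the following sketch; the names flags, low3, low5, and store_low3 are hypothetical.

```c
union flags {               /* hypothetical union of bit-fields */
    unsigned int low3 : 3;  /* each member is declared directly as a bit-field */
    unsigned int low5 : 5;
};

/* Stores v in the 3-bit member and returns what was actually kept. */
unsigned int store_low3(unsigned int v)
{
    union flags f;

    f.low3 = v;             /* an unsigned bit-field keeps v modulo 8 */
    return f.low3;
}
```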

K&amp;R required that all the members in a union “begin at offset 0.” Standard C spells it out even more precisely by saying that a pointer to a union, suitably cast, points to each member, and vice versa. (If any of the members is a bit-field, the pointer points to the container in which that bit-field resides.)

C99 added support for flexible array members. C11 added support for anonymous structures and anonymous unions.

Enumeration Specifiers
Enumerated types were not part of K&amp;R; however, some compilers implemented them well before C89 existed.

According to the C Standard, “Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration.”

Recommendation: Do not assume that an enumerated type is represented as an int; it may be any integral type.

Note that Standard C requires enumeration constants to have type int. Therefore, the type of an enumerated data object need not be the same as that of its members.

C99 added support for a trailing comma after an enumerator list, as in a declaration such as enum color { red, green, blue, };.

C++ Consideration: Standard C++ extends enumerated types by allowing them to be given a specific base type (representation, that is), and by restricting the scope of an enumeration’s constants to just that enumeration type.

Atomic Type Specifiers
These are not permitted unless the implementation supports atomic types, which is determined by testing whether the conditionally defined macro __STDC_NO_ATOMICS__ is defined; if it is (as the integer constant 1), atomic types are not supported. (See &lt;stdatomic.h&gt;.)

The _Atomic type specifier has the form _Atomic ( type-name ), and is not to be confused with the _Atomic type qualifier (Type Qualifiers), which involves just that name only.

C++ Consideration: C++ does not support _Atomic. However, it does define header &lt;atomic&gt;, which gives access to various kinds of atomic-related support.

Type Qualifiers
C89 added the const type qualifier, borrowing it from C++. C89 also added the volatile type qualifier.

Attempting to modify a const object by means of a pointer to a type without the const qualifier results in undefined behavior.

Recommendation: Do not attempt to modify a const object by means of a pointer to a type without the const qualifier.

Attempting to reference a volatile object by a pointer to a type without the volatile qualifier results in undefined behavior.

Recommendation: Do not access a volatile object by means of a pointer to a type without the volatile qualifier.

C99 added the restrict type qualifier, and applied it to various library functions, as appropriate.

C++ Consideration: C++ does not support restrict.

C11 added the type qualifier _Atomic, which is not to be confused with the _Atomic type specifier (Atomic Type Specifiers).

Function Specifiers
C99 added the function specifier inline. This is a suggestion to the compiler, and the extent to which that hint is followed is implementation-defined. Prior to C99, some compilers supported this capability via an extension keyword such as __inline.

Standard C permits both an inline definition and an external definition for a function, in which case, it is unspecified whether a call to the function uses the inline definition or the external definition.

C11 added the function specifier _Noreturn. It also provided the header &lt;stdnoreturn.h&gt;, which contains a macro called noreturn that expands to _Noreturn.

C++ Consideration: The equivalent (but different) approach to _Noreturn added in C++11 is the attribute [[noreturn]].

Alignment Specifier
C11 added support for alignment specifiers using the keyword _Alignas.

The header &lt;stdalign.h&gt; (see Alignment) contains a macro called alignas that expands to _Alignas.

C++ Consideration: The equivalent (but different) keyword added in C++11 is alignas.

Standard C states that, “If declarations of an object in different translation units have different alignment specifiers, the behavior is undefined.”

General Information
Both K&amp;R and Standard C treat a declarator in parentheses as equivalent to one without. For example, a function can be declared with its name parenthesized, as in void (f)(int); which is equivalent to void f(int);.

The parenthesized form may be used to hide the function declaration from a macro with arguments that has the same name as that function.
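The following sketch shows a function-like macro and a function sharing one name. The names square, call_macro, and call_function are hypothetical. Because a function-like macro is expanded only when its name is followed by a left parenthesis, parenthesizing the name suppresses the expansion.

```c
#define square(x) ((x) * (x))   /* function-like macro */

int (square)(int x)             /* not expanded: the name is followed by ) */
{
    return x * x;
}

int call_macro(int v)    { return square(v); }      /* expands the macro */
int call_function(int v) { return (square)(v); }    /* calls the function */
```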

Standard C requires that a declaration support at least 12 pointer, array, and function derived declarators modifying a base type. For example, a declaration such as int (*(*p)[10])(void) has four modifiers. K&amp;R gave no limit except to say that multiple type modifiers may be present. (The original Ritchie compiler supported only six type modifiers in a declarator.)

Standard C requires that an array dimension have a positive, nonzero value. That is, an array may not have size zero, as permitted by some implementations.

Array Declarators
Standard C permits array declarations to be incomplete by omitting the size information, as in declarations such as extern int i[]; and int (*pi)[];.

However, the use of such objects is restricted until size information is made available. For example, with those declarations, sizeof(i) and sizeof(*pi) are unknown and should generate an error.

C99 added the ability to have type qualifiers and the keyword static in a declaration of a function parameter with an array type.
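As an illustration, the keyword static inside the brackets tells the compiler that the caller passes at least that many elements. The function name sum4 is hypothetical, and the sketch assumes a C99-or-later compiler.

```c
/* The caller promises that values points to at least 4 elements. */
int sum4(const int values[static 4])
{
    return values[0] + values[1] + values[2] + values[3];
}
```

Passing a null pointer or a shorter array to such a function results in undefined behavior, so the promise is the caller's responsibility.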

C++ Consideration: Standard C++ does not support these things in array-type declarators.

C99 added support for variable-length arrays (VLAs) and required such support. However, C11 made VLAs conditional; see the conditionally defined macro __STDC_NO_VLA__ in Conditionally Defined Standard Macros.

C++ Consideration: Standard C++ does not support variable-length arrays.

Calling Non-C Functions
Some implementations allow a fortran type specifier (extension) to be used in a function declaration to indicate that function linkage suitable for Fortran (call by reference) is to be generated or that different representations for external names are to be generated. Others provide pascal and cdecl keywords for calling Pascal and C routines, respectively. Standard C does not provide any external linkage mechanism.

C++ Consideration: Standard C++ defined an extern "C" linkage.

Function Prototypes
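Borrowing from C++, C89 introduced a new way of declaring and defining a function, which places the parameter information inside the parameter list. This approach uses what is colloquially called a function prototype. For example, what used to be written with only the parameter names inside the parentheses, and their types declared between the closing parenthesis and the function body, can now be written with each parameter's type and name together inside the parentheses.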
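The two styles can be compared in the following sketch, in which the function scale and its parameters are hypothetical. The old-style form is shown only in a comment, so the block compiles the same way regardless of how a given compiler treats old-style definitions.

```c
/* Old style (K&R), shown only as a comment:
 *
 *     int scale(count, factor)
 *     int count;
 *     double factor;
 *     { return (int)(count * factor); }
 */

int scale(int count, double factor);    /* prototype declaration */

int scale(int count, double factor)     /* prototype-style definition */
{
    return (int)(count * factor);
}
```

With the prototype in scope, a call such as scale(3, 2) has its second argument converted from int to double automatically.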

Standard C continues to support the old style.

C++ Consideration: Standard C++ requires function prototype notation.

While you may well have production source code using the old-style of function definitions, these can co-exist with new-style function prototypes. The only potential catch is with narrow types. For example, an old-style definition having parameters of type char, short, or float would expect arguments passed in their wider forms, int, int, and double, respectively, which might not be the case if a prototype were in scope containing the narrow types.

Recommendation: Whenever possible, use function prototypes, as they can make sure functions are called with the correct argument types. Prototypes can also perform conversion of arguments. For example, calling a function that takes a pointer parameter without a prototype in scope, and passing in 0, does not cause that zero to be converted to the pointer type, which on some systems, might cause a problem.

Standard C requires that all calls to functions having a variable argument list be made only in the presence of a prototype. Specifically, the well-known “hello, world” program from K&amp;R, whose main function calls printf without any declaration of printf in scope, is not a conforming program.

The reason for this is that, in the absence of a prototype, the compiler is permitted to assume that the number of arguments is fixed. Therefore, it may use registers or some other (presumably) more efficient method of passing arguments than it would otherwise use. Clearly, the printf function is expecting a variable argument list. Typically, it would not be able to communicate properly with code calling it, if the calling code were compiled with the fixed list assumption. To correct that program, you must either #include &lt;stdio.h&gt; (the preferred approach) or explicitly write a prototype for printf (including the trailing ellipsis) prior to the function's use. [The function main should also be given an explicit return type of int.]

Recommendation: Always have a prototype in scope when calling a function having a variable argument list. Make sure the prototype contains the ellipsis notation.
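A minimal sketch of this recommendation follows, assuming a hypothetical variable-argument function named sum_ints. Its prototype, including the ellipsis, is declared before any call is made.

```c
#include <stdarg.h>

/* Prototype with the trailing ellipsis; every caller should see this. */
int sum_ints(int count, ...);

/* Adds up 'count' arguments, each of which must be an int. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) is then compiled with the knowledge that the argument list is variable.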

It is permitted to have dummy identifier names in prototype declarators; however, using them can cause problems, as the following program demonstrates:

Although the scope of the identifier  in the prototype begins at its declaration and ends at the end of the prototype, that name is seen by the preprocessor. Consequently, it is replaced by the constant 10, thus generating a syntax error. Even worse, if the macro  were defined as , the prototype would be quietly changed from having a parameter of   to one of a pointer to an. A similar problem can occur if your implementation's standard headers use dummy names that are part of the programmer's name space (i.e., without leading underscores.)

Recommendation: If you must put identifiers in prototypes, name them so that they won't conflict with macro names. Such conflicts can be avoided if you always spell macros in uppercase and all other identifiers in lowercase.
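One way to follow this naming convention is sketched below. The macro LIMIT and the function clamp_to_limit are hypothetical names used only for this illustration.

```c
#define LIMIT 10   /* macro name spelled in uppercase */

/* The parameter name is lowercase, so the macro LIMIT cannot replace it. */
int clamp_to_limit(int value);

/* Parameter names may also be omitted from a declaration altogether. */
int clamp_to_limit(int);

int clamp_to_limit(int value)
{
    return value > LIMIT ? LIMIT : value;
}
```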

The declaration int f(); tells the compiler that f is a function returning int, but provides no information about the number and types of its parameters. On the other hand, int f(void); indicates that there are no parameters.

C++ Consideration: In C++, the declarations int f(); and int f(void); are equivalent.

Initialization
If the value of an uninitialized object that has automatic storage duration is used before a value is assigned, the behavior is undefined.

External and  variables not explicitly initialized are assigned the value of 0, cast to their type. (This may differ from the area allocated by , which is initialized to all-bits-zero.)

K&amp;R did not allow automatic arrays, structures, and unions to be initialized. Standard C does, however, provided the initializing expressions in any initializer list are constant expressions, and no variable-length arrays are involved. An automatic structure or union can also be initialized with a (nonconstant) expression of the same type.

Standard C permits a union to be initialized explicitly. The value is stored in the union by casting it to the type of the first member specified, so member declaration order can be important! Using this rule, we see that if a  or external union is not explicitly initialized, it contains 0 cast into the first member (which may not result in all-bits-zero, as stated above).
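The first-member rule can be sketched as follows, assuming a hypothetical union named number whose first member is an int.

```c
union number {
    int   i;   /* first member: receives a brace-enclosed initializer */
    float f;
};

/* No explicit initializer: the first member, i, is set to 0. */
static union number zeroed;

/* The value 7 is stored in i, the first member. */
static union number seven = { 7 };

int first_member_of_zeroed(void) { return zeroed.i; }
int first_member_of_seven(void)  { return seven.i; }
```

Reordering the members so that f came first would cause the same initializer to be stored as a float instead.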

Standard C permits automatic structures and unions to have initializers that are struct or union valued expressions.

Standard C permits bit-fields to be initialized. For example,

K&amp;R and Standard C require that the number of expressions in an initializer be less than or equal to the number expected, but never more. There is one case, however, where it is possible to specify implicitly one too many, yet not get a compilation error. For example,

Here, the array text is initialized with the characters,  ,  ,  , and   and does not contain a trailing .
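A sketch of this special case follows, assuming the hypothetical contents abcd (the characters used in the original listing may differ). The helper text_matches is likewise an illustrative name.

```c
#include <string.h>

/* "abcd" needs five chars with its terminating null character; the
   array has room for only four, so the null character is dropped. */
static const char text[4] = "abcd";

int text_matches(void)
{
    /* Compare exactly four bytes, because text is not a C string. */
    return sizeof text == 4 && memcmp(text, "abcd", 4) == 0;
}
```

Because text lacks a terminator, it should not be passed to functions that expect a null-terminated string. Note also that C++ rejects this initialization.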

Some implementations allow a trailing comma in an initialization list. This practice is endorsed by Standard C, and was permitted by K&amp;R.

C99 added support for designated initializers.
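A brief sketch of both designator forms follows; the type point and the objects p and table are hypothetical names.

```c
struct point { int x, y, z; };

/* Member designators: y is not named, so it is initialized to zero. */
static const struct point p = { .z = 3, .x = 1 };

/* Array designators: elements 1 and 4 are set, the rest are zero. */
static const int table[6] = { [4] = 40, [1] = 10 };

int point_sum(void)
{
    return p.x + p.y + p.z;
}

int table_sum(void)
{
    int total = 0;
    for (int i = 0; i < 6; ++i)
        total += table[i];
    return total;
}
```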

C++ Consideration: Standard C++ did not support designated initializers until C++20, which added a restricted form of them.

Static Assertions
C11 added support for static assertions. It also added to the header   a macro called   that expands to.
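A minimal sketch of both spellings follows. The conditions tested are arbitrary examples chosen for this illustration, and static_checks_passed is a hypothetical helper that exists only to show that the file compiled.

```c
#include <assert.h>
#include <limits.h>

/* Evaluated at compile time; a false condition stops the compilation. */
static_assert(CHAR_BIT >= 8, "char must have at least 8 bits");

/* The C11 keyword spelling of the same facility. */
_Static_assert(sizeof(long) >= sizeof(int), "long is narrower than int");

/* Reaching run time at all means both assertions held. */
int static_checks_passed(void)
{
    return 1;
}
```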

= External Definitions =

Matching External Definitions and Their Declarations
While K&amp;R defined a model to define and reference external objects, numerous other models were also employed, and this led to some confusion. These models are described in subordinate sections below.

Standard C adopted a model that is a combination of the strict ref/def and initialization models. This approach was taken to accommodate as wide a range of environments and existing implementations as possible.

Standard C states that if an identifier with external linkage has incompatible declarations in two source files, the behavior is undefined.

Some implementations cause object modules to be loaded into an executable image simply if one or more of the external identifiers defined in them are declared in user code yet are not actually used. Standard C states that if an identifier with external linkage is not used in an expression, then there need be no external definition for it. That is, you can't force an object to be loaded simply by declaring it!

The Strict ref/def Model
With this model, the declaration of  may occur once and only once without the keyword. All other references to that external must have the keyword. This is the model specified by K&amp;R.

The Relaxed ref/def Model
In this case, neither declaration of  includes the   keyword. If the identifier is declared (somewhere) with the  class, a defining instance must occur elsewhere in the program. If the identifier is declared with an initializer, one and only one declaration must occur with an initializer in the program. This model is widely used in UNIX-like environments. Programs that adopt this model conform to Standard C, but are not maximally portable.

The Common Model
In this model, all declarations of the external variable  may optionally contain the keyword. This model is intended to mimic that of Fortran's  blocks.

The Initializer Model
Here, the defining instance is that containing an explicit initializer (even if that initializer is the default value).

Tentative Object Definitions
Standard C introduced the notion of tentative object definitions. That is, a declaration may be a definition depending on what follows it. For example,

Here, the first references of  and   are tentative definitions. If they were not followed by a declaration for the same identifier containing an initializer list, these tentative definitions would be treated as definitions. However, as shown, they are followed by such declarations, so they are treated as declarations. The purpose of this is to allow two mutually referential variables to be initialized to point to each other.
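The mutually referential case can be sketched as follows, assuming a hypothetical struct node and two objects named first and second (the names in the original listing may differ).

```c
struct node {
    struct node *next;
    int value;
};

/* Tentative definitions: no initializers yet. */
static struct node first;
static struct node second;

/* Definitions with initializers; each one refers to the other. */
static struct node first  = { &second, 1 };
static struct node second = { &first,  2 };

int ring_is_closed(void)
{
    return first.next == &second
        && second.next == &first
        && first.next->next->value == 1;
}
```

Without the tentative definition of second, the initializer of first could not take its address. C++ does not have tentative definitions, so this fragment is specific to C.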

= Statements =

Labeled Statements
K&amp;R had labels and “ordinary” identifiers sharing the same namespace. That is, a label name was hidden if an identifier with the same name was declared in a subordinate block. For example,

would generate a compilation error because the target of the  statement is an identifier declared to be an   variable, not a label.

In Standard C, labels have their own namespace allowing the above example to be compiled without error.
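The separate label namespace can be sketched as follows. The function count_to and the name next are hypothetical; next is used both as a label and as a variable.

```c
int count_to(int n)
{
    int i = 0;

next:                       /* a label named next */
    if (i < n) {
        int next = i + 1;   /* an ordinary identifier, also named next */
        i = next;
        goto next;          /* refers to the label, not the variable */
    }
    return i;
}
```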

K&amp;R specified that the length of significance in an internal identifier (such as a label) was eight characters.

Standard C requires at least 63 characters of significance in an internal identifier, such as a label.

Compound Statement (Block)
Prior to C99, all declarations within a block had to precede all statements. However, starting with C99, the two can be interspersed.

C++ Consideration: C++ allows declarations and statements to be interspersed.

A  or   can be used to jump into a block. While doing so is portable, whether any “bypassed” automatic variables in the block are initialized predictably is not.

K&amp;R permitted blocks to nest, but it gave no indication as to how deeply.

C89 required compound statements to nest to at least 15 levels. C99 increased this to 127.

Expression and Null Statements
Consider the following example that uses the  type qualifier (added by C89):

Optimizers must tread very carefully when dealing with objects having the  qualifier, because they can make no assumptions about the current state of such an object. In the simplest case, an implementation might evaluate every expression containing a  expression simply because doing so might generate an action visible in the environment. For example, the statement  could generate code to access. That is, it might place the address of  on the bus such that it can be seen by hardware waiting to synchronize on such an access. Note that even if an implementation does this, it should not generate code for the statement  because   is not itself   and evaluating the expression   does not involve accessing a   object.

Recommendation: Do not rely on expression statements such as,  , and   to generate code. Even if  is a   object, it is not guaranteed that the   object would be accessed as a result.

Selection Statements
Regarding nested limits on selection statements, see Compound Statement.

The Statement
As the controlling expression is a full expression, there is a sequence point immediately following it.

The Statement
K&amp;R required that the controlling expression and each  constant expression have type.

Standard C requires that the controlling expression have some integral type. Each  expression must also be of integral type, and each expression’s value is converted to the type of the controlling expression, if necessary.

As Standard C supports enumerated data types (which are represented by an integral type), it permits their use in  expressions and in   constant expressions. (Enumerated types are not defined in K&amp;R.) Some implementations have a notation for specifying a range of values for a  constant expression. Note that because several different and incompatible syntaxes are in use, this feature is not supported by Standard C.

Standard C permits a character constant to contain multiple characters, as in  and. Character constants are permitted in  constant expressions.

Recommendation: As the internal representation of multi-character character constants is implementation-defined, they should not be used in  constant expressions.

K&amp;R did not specify the maximum number of  values permitted in a   statement. C89 required support for at least 257 s for each   statement. C99 increased that to 1023.

As the controlling expression is a full expression, there is a sequence point immediately following it.

Refer to Compound Statement for a discussion of transferring into compound statements within  statements.

Iteration Statements
The controlling expressions in while, do, and for statements may contain expressions of the form expr1 == expr2. If expr1 and expr2 are floating-point expressions, exact equality may be difficult or impossible to achieve because of the implementation-defined nature of floating-point representation, rounding, and so on.

Recommendation: If the controlling expressions in,   and   statements contain floating-point expressions, note that the results of floating-point equality tests are implementation-defined. It may be more desirable to have something like expr1   expr2    rather than expr1   expr2, for example.
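One possible tolerance-based test is sketched below. The function names and the tolerance of 1e-9 are assumptions made for this illustration; a suitable tolerance depends on the magnitudes involved in a given program.

```c
/* Returns 1 when a and b differ by no more than tolerance. */
int nearly_equal(double a, double b, double tolerance)
{
    double difference = a - b;

    if (difference < 0.0)
        difference = -difference;
    return difference <= tolerance;
}

/* Counts how many additions of 0.1 are needed to come within the
   tolerance of 1.0; the cap of 100 guards against a runaway loop. */
int steps_to_one(void)
{
    int steps = 0;
    double x = 0.0;

    while (!nearly_equal(x, 1.0, 1e-9) && steps < 100) {
        x += 0.1;
        ++steps;
    }
    return steps;
}
```

A loop that instead waited for x to compare exactly equal to 1.0 might never terminate, because repeated addition of 0.1 need not produce exactly 1.0.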

Some programs contain “idle” loops; that is, loops that are intended to simply pass time, perhaps as a crude approximation of actual wall-clock time. For example:

To address the utility of such constructs, C11 added the following: “An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a  statement) its expression-3 [the expression evaluated after each iteration], may be assumed by the implementation to terminate.” In lay terms, this means that the compiler can throw away the whole loop, provided it implements any other side effects that loop contains (in this case, making sure   finishes up with the value 1000001).

Recommendation: Don’t use “idle” loops to simply pass time. Even if such loops are not optimized away, their execution time is very much dependent on factors like task/thread priority and processor speed.

C89 guaranteed at least 15 levels of nesting of selection control structures, iteration control structures, and compound statements. C99 increased this to 127. K&amp;R did not specify a minimum limit.

The Statement
As the controlling expression is a full expression, there is a sequence point immediately following it.

The Statement
As the controlling expression is a full expression, there is a sequence point immediately following it.

The Statement
As the three expressions are full expressions, there is a sequence point immediately following each one.

C99 added support for the first part of a  to be a declaration, as in       , rather than requiring   to already be defined.
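A short sketch of this form follows, using a hypothetical function named sum_to.

```c
int sum_to(int n)
{
    int total = 0;

    /* i is declared in the loop header and is visible only in the loop. */
    for (int i = 1; i <= n; ++i)
        total += i;

    return total;
}
```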

C++ Consideration: Standard C++ supports this C99 feature as well. C++ differs from C regarding the scope of any variables declared in a  statement.

The Statement
Refer to Labeled Statements for a discussion of the implications of a separate label namespace. Refer to Compound Statement to learn about the ramifications of jumping into compound statements.

The Statement
When the form  expression  is used, as expression is a full expression, there is a sequence point immediately following it.

If the value of a function call is used, but no value is returned, the result is undefined, except (since C99) for main, which has an implicit return 0.

Standard C supports the  function type, which allows the compiler to ensure that a   function has no return value. K&amp;R did not include the  type.

Standard C supports the ability to return structures and unions by value. It places no constraint on the size of the object being returned, although the size of such objects that can be passed to a function by value may be limited. K&amp;R did not include the returning of structures and unions by value.

K&amp;R (pp 68 and 70) shows the general form of the  statement to be  expression  yet the formal definition on page 203 shows   expression. This may appear to be a contradiction. Page 203 is correct—the parentheses are not part of the syntax; they are merely redundant grouping parentheses and are part of the expression. The confusion comes from the fact that most (if not all) of the examples using  in K&amp;R have the returned value within parentheses. From a style point of view, the parentheses can be useful as they help to separate the  keyword from the expression, and they clearly delimit the expression if it is rather complex. However, they are never needed. (Note that in the second edition of K&amp;R, the parentheses have been removed from the examples, and, often,  is terminated using.

C99 added the following constraint: “A  statement without an expression shall only appear in a function whose return type is  .”

= The Preprocessor =

According to the original C Standard Rationale document (which was written as C89 was developed), “Perhaps the most undesirable diversity among existing C implementations can be found in preprocessing. Admittedly a distinct and primitive language superimposed upon C, the preprocessing commands accreted over time, with little central direction, and with even less precision in their documentation.”

Preprocessor versus Compiler
Many C compilers involve multiple passes, the first of which often contains the preprocessor. Using this knowledge, a compiler can often take short cuts by arranging information to be shared between the preprocessor and the various phases of the compiler proper. While this may be a useful feature for a particular implementation, you should keep in mind that other implementations may use completely separate, and noncooperating, programs for the preprocessor and the compiler.

Recommendation: Keep the ideas of preprocessing and compilation separate. One possible problem when you fail to do this will be demonstrated when the  operator is used as discussed later.

Although C is a free-format language, the preprocessor need not be because, strictly speaking, it is not part of the C language. The language and the preprocessor each have their own grammars, constraints, and semantics. Both are defined by Standard C.

The Directive Name Format
A preprocessing directive always begins with a  character. However, not all preprocessors require the  and the directive name to be a single token. That is, the  prefix may be separated from the directive name by spaces and/or horizontal tabs.

K&amp;R shows the  as part of the directive name, with no intervening white space. No statement is made as to whether such white space is permitted.

Standard C permits an arbitrary number of horizontal tabs and spaces between the  and the directive name, which are considered to be separate preprocessing tokens.

Start Position of Directives
Many preprocessors permit directives to be preceded by white space allowing indenting of nested directives. Less flexible preprocessors require the  character to be the first character of a source line.

K&amp;R states that “Lines beginning with  communicate with this preprocessor.” No definition for “beginning with” is given.

Standard C permits an arbitrary amount of white space before the  character. This white space is not restricted to horizontal tabs and spaces—any white space is allowed.

White Space Within Directives
Standard C requires that all white space appearing between the directive name and the directive's terminating new-line be horizontal tabs and/or spaces.

K&amp;R makes no statement about the validity or nature of such embedded white space.

If you use at least one white space character to separate tokens in a directive, the actual number of such characters (and the mix of tabs and spaces) is almost always immaterial to the preprocessor. An exception has to do with benign redefinition of macros using the  directive. This is discussed later in this chapter.

Macro Expansion Within a Directive
According to Standard C, “The preprocessing tokens within a preprocessing directive are not subject to macro expansion unless otherwise stated. [For] example, in

the sequence of preprocessing tokens on the second line is not a preprocessing directive, because it does not begin with a  at the start of translation phase 4 (see Phases of Translation), even though it will do so after the macro   has been replaced.”

Directive Continuation Lines
K&amp;R declared that macro definitions (with and without arguments) could be continued across multiple source lines if all lines to be continued contained a backslash immediately preceding the terminating new-line.

Standard C has generalized this notion and permits any token (not just those seen by the preprocessor, but by the language as well) to be split up/continued using the backslash/new-line sequence.
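As a hedged illustration of this generalization, the sketch below splits an identifier and an integer constant across lines. The function name answer is hypothetical. Each continuation must start in the first column, because any leading white space would end the token.

```c
/* The identifier answer is split by a backslash/new-line pair. */
int ans\
wer(void)
{
    /* The constant 42 is split in the same way. */
    return 4\
2;
}
```

Splitting tokens like this is legal but hard to read; it is shown here only to demonstrate that splicing happens before tokens are formed.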

In the following case, the second source line starting with  does not begin a macro definition directive because it is a continuation line and the , therefore, is preceded by other than spaces and/or horizontal tabs.

Trailing Tokens
Strictly speaking, the preprocessor should diagnose any tokens in excess of those expected. However, some implementations process only the tokens they expect, then ignore any tokens remaining on the directive line. If this is the case, the source line

(which seems to indicate that a new-line was somehow omitted, perhaps lost during conversion for porting) would cause the header to be included. However, the macro definition will be ignored. Another example is

In this case, the file is never opened regardless of whether  is defined.

K&amp;R gives no indication as to what should happen in these cases.

Standard C requires a diagnostic if excess tokens are present.

Comments in Directives
Delimited comments are treated as a single space so they can occur anywhere that white space can. As all, or various kinds of, white space can occur in preprocessing directives, so too can delimited comments. For example, in the directives

each delimited comment is replaced by a single space during preprocessing. While the first two directives should port without error, the last three have leading horizontal white space, something not universally accepted, as noted earlier.

Of course, such delimited comments can occur between directive tokens.

Note that delimited comments can be continued indefinitely across source lines without requiring backslash/new-line terminators.

Line-oriented comments may also be used with directives.

Phases of Translation
Standard C contains a detailed discussion of the manner and order in which source text is translated into tokens for processing by the compiler. Prior to C89 there were no hard and fast rules governing this area, allowing code such as the following to be interpreted in different ways by different preprocessors:

The intent here, perhaps, is to disable the  function call by having   become the start of a comment whenever   is not defined. As one programmer put it, “To define  as   we need to fool the preprocessor, because it detects comments before doing anything else. To do this, we place the asterisk on a continuation line. As the preprocessor doesn't see the token, everything works as expected. It works fine with C compilers in UNIX environments.”

But does the preprocessor detect comments before doing anything else? As the answer to this question varies by implementation, let's look at what Standard C says. The phases of translation, as they affect the preprocessor, follow:

<ul> <li>

Backslash/new-line pairs are removed so that continuation lines are spliced together.

</li> <li>

The source is broken into preprocessing tokens and sequences of white space characters (including comments).

</li> <li>

Each comment is replaced with a space character. However, whether consecutive white space characters are compressed to one such character is implementation-defined.

</li> <li>

Preprocessing directives are executed, and macro invocations are expanded. For each header included here, the steps outlined above are followed over again.

</li></ul>

A Standard C compiler, therefore, is obliged to diagnose an error when given the previous code because the  directive will be included in the comment started on the macro definition line.

Some implementations expand macros before looking for preprocessor commands thus accepting the following code:

This is disallowed by Standard C.

Inspecting Preprocessor Output
Some implementations have a preprocessor separate from the compiler, in which case, an intermediate text file is produced. Other implementations, which combine the preprocessor and compiler, have a listing option that allows the final effect of all directives to appear in the compilation listing file. They may also allow intermediate expansions of macros whose definitions contain other macros to be listed. Note that some implementations are not able to preserve comments or white space when saving the intermediate code because comments may already have been reduced to spaces prior to the preprocessor directives being processed. This is the case with Standard C's phases of translation.

Recommendation: Determine which of your implementations allow the output of the preprocessor to be saved. One particularly useful quality assurance step is to compare the output text files produced by each of your preprocessors. This allows you to check whether they expand macros and conditionally include code in the correct manner. So, when you transport a source file to a new environment, you may also wish to transport the preprocessed version of that file.

Source File Inclusion
The  directive is used to treat the contents of the named header as if it were in-line as part of the source file being processed. A header need not correspond exactly to a text file (or be of the same name), although it often does.

Standard C requires that a header contain complete tokens. Specifically, you may not put only the start or finish of a comment, string literal, or character constant in a header. A header must also end with a new-line. This means that you cannot paste tokens together across s.

To help avoid name conflicts between standard headers and programmer code, Standard C requires implementers to begin their identifiers with two underscores or an underscore and an uppercase letter. The index in K&amp;R contains only three macros—, , and. It also lists some 20–30 library functions. No others are mentioned or required. Standard C, on the other hand, contains hundreds of reserved identifiers, most of which are macros or library function names. Add to that the system-related identifiers used by your compiler and those identifiers used by any third-party libraries, and you have a potential for naming conflicts.

Recommendation: For each of your target environments, generate a reserved identifier list in sorted alphabetical order by header, and across headers. Use this list of identifiers for two purposes: names to stay away from when inventing your own identifier names and to find the union of all sets so you know what names are common and can be used meaningfully in common code. Note that just because a macro of the same name appears in different environments does not mean it is used for the same purpose. For names you invent, use some unique prefix (not leading underscores), suffix, or naming style so the likelihood of conflict is reduced.

Directive Format
K&amp;R and Standard C define the following two forms of the  directive. For directives of the form

header-name

K&amp;R stated, “the header is searched for first in the directory of the original source file, then in a sequence of standard places.” Standard C states that the header is searched for in an implementation-defined manner.

K&amp;R and Standard C require that only the implementation-defined standard places are searched for directives of the form

header-name

C89 added a third form,

identifier

provided  ultimately translates to the form   or. As a macro name is an identifier, this format allows a header name to be constructed or otherwise defined either using the token-pasting preprocessing operator  or by defining the macro on the compiler command line. Many compilers support a command-line argument of the form  or , which is equivalent to having     in the source being compiled.

If your target compilers support the  (or  ) option discussed above and the     format, you can specify a header's full device/directory path name at compilation-time rather than hard-code that information into the   directive.
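A minimal sketch of the macro form of inclusion follows. HEADER_NAME is a hypothetical macro name; the fallback definition here stands in for a definition that could instead be supplied on the compiler command line.

```c
/* Use a default only if the build did not already define HEADER_NAME. */
#ifndef HEADER_NAME
#define HEADER_NAME <string.h>
#endif

/* The macro is expanded, and the result names the header to include. */
#include HEADER_NAME

/* Hypothetical helper that relies on a declaration from that header. */
size_t name_length(const char *s)
{
    return strlen(s);
}
```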

One technique to help isolate hard-coded header locations follows. A master header contains

Now, if this header is included in another header, these macro names can be used as follows:

If you move the code to another system or you move the headers to a different location on the same system, you simply modify the  header and recompile all modules that include it.

Header Names
The format and spelling of the header name in the  and   formats is implementation-dependent.

A peculiar problem occurs with file systems that use the backslash to separate subdirectory and file names in file pathnames. The completely qualified name of a DOS disk file has the following format:

The problem arises with directory and filenames such as

Here, either the directory or the file name (or both) begin with the sequence , where x is a recognizable special character sequence within a C literal string. The problem then becomes, “How do I name this header when including it?”

According to Standard C, although a header written as  looks like a string literal, it is not! As such, its contents must be taken verbatim.

Recommendation: If possible, avoid embedding file system device, directory, and subdirectory information in header names.

When creating a new header and headers map directly to file names on your system, keep in mind the limits on file naming across systems. For example, some file systems are case-sensitive, in which case STDIO.H, stdio.h, and Stdio.h could be three separate files.

C89 stated, “The implementation may ignore the distinctions of alphabetical case and restrict the mapping to six significant characters before the period.” C99 increased the number of significant characters to eight.

Nested Headers
A header may contain  directives. The level of header nesting permitted is implementation-defined. K&amp;R states that headers may be nested, but gives no minimum requirement. C89 required support for at least eight levels of header nesting; C99 increased this to 15.

Recommendation: If headers are designed properly, they should be able to be included multiple times and in any order. That is, each header should be made self-sufficient by having it include any headers it relies on. Put only related things in a header and restrict nesting to three, or at most four, levels. Use  wrappers around the contents of a header so they are not included more than once in the same scope.
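One common way to write such a wrapper is sketched below. The header name shapes.h, the guard macro SHAPES_H, and the struct are all hypothetical. To keep the sketch in a single file, the header's contents are pasted twice to mimic a double inclusion.

```c
/* First "inclusion" of the hypothetical header shapes.h. */
#ifndef SHAPES_H
#define SHAPES_H
struct shape { int sides; };
#endif

/* Second "inclusion": SHAPES_H is now defined, so this is skipped. */
#ifndef SHAPES_H
#define SHAPES_H
struct shape { int sides; };
#endif

int triangle_sides(void)
{
    struct shape triangle = { 3 };
    return triangle.sides;
}
```

Without the wrapper, the second copy would redefine struct shape and the translation unit would fail to compile.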

Path Specifiers
K&amp;R and Standard C provide only two (main) mechanisms to specify header location search paths, namely,  and. Sometimes it is necessary or desirable to have more than this, or perhaps for testing purposes, you temporarily want to use some other location instead. Many implementations allow one or more include search paths to be specified as command-line arguments at compile-time. For example,

tells the preprocessor to first search for  format headers using   then   and   and, finally, in some system default location. The lack of this feature or the support of less than the required number of paths may cause problems when porting code. Even though your compiler may support a sufficient number of these arguments, the maximum size of the command-line buffer may be such that it will be too small to accommodate a number of verbose path specifiers.

Recommendation: If this capability is present in all your implementations, check the number of paths supported by each.

Modification of Standard Headers
Recommendation: Do not modify standard headers by adding to them definitions, declarations, or other s. Instead, create your own miscellaneous or local header and include that in all the relevant places. When upgrading to new compiler versions or when moving to different compilers, no extra work need be done beyond making that local header available as before.

Macro Replacement
The  directive is used to associate a string definition with a macro name. As a macro name is an identifier, it is subject to the same naming constraints as other identifiers. K&amp;R required eight characters of significance; C89 required 31, and C99 increased this to 63.

Recommendation: Use the lowest common denominator length of significance for macro names.

Recommendation: When spelling macro names, follow the most common convention and use only uppercase letters, digits, and underscores.

Standard C requires that tokens specified as a part of a macro definition must be well formed (i.e., complete). Therefore, a macro definition cannot contain just the start or end part of a comment, literal string, or character constant.

Some compilers allow partial tokens in macros such that when the macro is expanded, it is pasted to the token preceding and/or following it.

Recommendation: Avoid having partial tokens in macro definitions.

The definition of a macro might contain an arithmetic expression such as

The preprocessor does not recognize this as an expression, but rather as a sequence of tokens that it substitutes wherever the macro is called. It is not permitted to treat the definition of  as if it were

Preprocessor arithmetic comes into play only with the conditional inclusion directive. However, the original definition of  above will be treated as an expression if the following code is used:

This will be expanded to

then

An implementation might place a limit on the size of a macro's definition.

Recommendation: If you plan on having macros whose definitions are longer than 80 characters, test your environments to see what their limits are.

Macros with Arguments
A macro with arguments has the general form

K&amp;R does not state the maximum number of arguments allowed.

C89 required support for at least 31 arguments; C99 increased this to 127.

Recommendation: If you plan on using macros with more than four or five arguments, check the limits of your target implementations.

While no white space is permitted between the macro name and the left parenthesis that begins the argument list in a macro definition, this constraint is not present in macro calls.

There is no requirement that all of the arguments specified in a macro definition argument list must appear within that macro's definition.

C99 added support for macros with a variable number of arguments (via ellipsis notation and the special identifier __VA_ARGS__).
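As a hedged sketch of the C99 facility, the following macro forwards its trailing arguments to snprintf. FORMAT_INTO and format_demo are hypothetical names chosen for this illustration.

```c
#include <stdio.h>

/* Hypothetical macro: the ellipsis collects the trailing arguments, and
   __VA_ARGS__ reproduces them in the replacement list. */
#define FORMAT_INTO(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Formats "answer=42" into buf and returns the number of characters
   snprintf reports. */
int format_demo(char *buf, size_t size)
{
    return FORMAT_INTO(buf, size, "%s=%d", "answer", 42);
}
```

Code that must also compile with C89-only implementations cannot use this notation.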

Rescanning Macro Names
A macro definition can refer to another macro, in which case, that definition is rescanned, as necessary.

Standard C requires the definition of a macro to be “turned off” for the duration of the expansion of that macro so that “recursive death” is not suffered. That is, a macro name that appears within its own definition is not re-expanded. This allows the name of a macro to be passed as an argument to another macro.
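The following sketch shows a macro whose name appears in its own definition. The identifier counter is an assumption made for illustration.

```c
/* Hypothetical object that the macro below deliberately shadows. */
static int counter = 5;

/* The inner occurrence of counter is not re-expanded, so this does not
   recurse; it refers to the object declared above. */
#define counter (counter + 1)

/* Expands to: return (counter + 1); */
int read_counter(void) { return counter; }
```

An implementation that did not follow the Standard C rule could expand such a definition without end.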

Replacement Within String Literals and Character Constants
Some implementations allow macro arguments within string literals and character constants to be replaced as follows:

Then the macro call

is expanded to

On implementations that do not allow this, the macro would be expanded to

K&amp;R states that “text inside a string or character constant is not subject to replacement.”

Standard C does not support the replacement of macro arguments within strings and character constants. However, it does supply the stringize operator, # (a C89 addition), so that the same effect can be achieved. For example,

expands to

and because Standard C permits adjacent strings to be concatenated, this becomes
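A minimal sketch of the stringize approach follows. The names FORMAT_INT and format_expr are assumptions made for illustration, not the example from the original text.

```c
#include <stdio.h>

/* Hypothetical macro: #expr yields the argument's spelling as a string
   literal, which is then concatenated with the adjacent literal. */
#define FORMAT_INT(buf, size, expr) \
    snprintf((buf), (size), #expr " = %d", (expr))

/* Writes "total + 1 = 8" into buf. */
int format_expr(char *buf, size_t size)
{
    int total = 7;
    return FORMAT_INT(buf, size, total + 1);
}
```

Because the argument is stringized rather than substituted inside a literal, this works on any Standard C implementation.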

Command-Line Macro Definition
Many compilers allow macros to be defined using a command-line argument of the form  or , which is equivalent to having       in the source being compiled. Some compilers allow macros with arguments to be defined in this manner.

The size of the command-line buffer, or the number of command-line arguments, may be such that there is insufficient room to specify all the required macro definitions, particularly if you use this mechanism to specify identifiers used in numerous  directives.

Recommendation: Determine whether this capability is present in all your implementations. At least five or six identifiers should be supported provided you keep the length of each to a minimum. (Note that if you use 31-character identifier names, you might exceed your command-line buffer size.)

Macro Redefinition
Many implementations permit an existing macro to be redefined without its first being removed with #undef. The purpose of this (generally) is to allow the same macro definition to occur in multiple headers, all of which are included in the same scope. However, if one or more of the definitions is not the same as the others, a serious problem can occur. For example,

causes the first part of the code to be compiled using zero as the value for  and the last part using zero. This can cause serious problems when using  because the size of the object passed to   might not be the same as that expected by.

Standard C allows a macro to be redefined provided the definitions are the same. This is known as benign redefinition. Just what does “the same” mean? Basically, it requires that the macro definitions be spelled EXACTLY the same, and depending on how white space between tokens is processed, multiple consecutive white space characters may be significant. For example,

Macros 1 and 2 are the same. Macros 3 and 4 might also be the same as 1 and 2 depending on the handling of the white space. Macro 5 would definitely be flagged as an error. Note that this does not solve the problem of having different definitions for the same macro that are not in the same scope.

Recommendation: It is legitimate to have a macro defined exactly the same in multiple places (typically in headers). In fact, this idea is encouraged for reasons stated elsewhere. However, avoid having different definitions for the same macro. As the use of multiple, consecutive white space characters may result in different spellings (as in macros 3 and 4 above), you should separate tokens by only one white space character, and the character you use should be consistent. As horizontal tabs may be converted to spaces, space separators are suggested.

By macro redefinition, we mean either the redefinition of a macro without arguments to a macro of the same name, also without arguments, or the redefinition of a macro with arguments with the same macro name with the same number and spelling of arguments.

Recommendation: Even if your implementation allows it, do not redefine a macro without arguments to one with arguments or vice versa. This is not supported by Standard C.

Predefined Standard Macros
Standard C specifies the following predefined macros:

<ul> <li>

__DATE__ C89 – Date of compilation

</li> <li>

__FILE__ C89 – Name of the source file being compiled; however, no mention is made as to whether this name is a fully qualified path name

</li> <li>

__LINE__ C89 – Current line number in the source file being compiled

</li> <li>

__STDC__ C89 – Has the value 1 if the compiler conforms to some edition of Standard C (see __STDC_VERSION__). Don’t assume that the presence of this name implies conformance; that requires the value 1. An implementation might define this macro as 0 to indicate “not quite conforming,” or as 2 to indicate “contains extensions.” To determine if a compiler complies with C89, check that __STDC__ is defined to 1 and that __STDC_VERSION__ is not defined

</li> <li>

__STDC_HOSTED__ C99 – Indicates if the implementation is hosted or free-standing

</li> <li>

__STDC_VERSION__ C95 – The Standard C edition to which this compiler conforms (see __STDC__), as follows: C95 199409L, C99 199901L, C11 201112L, and C17 201710L.

</li> <li>

__TIME__ C89 – Time of compilation

</li></ul>
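As a hedged sketch of how the edition-related macros might be tested, the function below reports which edition the compiler claims. The function names and the marker value 1L for C89 are assumptions made for illustration.

```c
/* Returns the value of __STDC_VERSION__ when it is defined, the
   hypothetical marker 1L when only __STDC__ is 1 (C89), and 0L when
   conformance is not claimed. */
long c_edition(void)
{
#if defined(__STDC_VERSION__)
    return __STDC_VERSION__;    /* for example 199901L or 201112L */
#elif defined(__STDC__) && __STDC__ == 1
    return 1L;
#else
    return 0L;
#endif
}

/* __FILE__ and __LINE__ expand at the point of use. */
const char *source_file(void) { return __FILE__; }
int source_line(void) { return __LINE__; }
```

Note that the sketch tests the value of __STDC__, not merely whether it is defined.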

Attempting to #define or #undef any of these predefined names results in undefined behavior.

Macro names beginning with __STDC_ are reserved for future standardization.

K&amp;R contained no predefined macros. __LINE__ and __FILE__ were available in some implementations prior to C89, as were __DATE__ and __TIME__; however, the date string format varied.

Standard C requires that “any other predefined macro names begin with a leading underscore followed by an uppercase letter or a second underscore.” It also prohibits the definition of the macro __cplusplus (either predefined or in a standard header).

C++ Consideration: Standard C++ predefines __cplusplus, which expands much like __STDC_VERSION__ by encoding a version number. Also, whether a standard-conforming C++ implementation predefines __STDC__ or __STDC_VERSION__ is implementation-defined.

Macros defined via a compiler command line option are not considered to be predefined macros, even though conceptually they are defined prior to the source being processed.

Except for the macros specified by Standard C, all other predefined macro names are implementation-defined. There is no established set of names, but the GNU C compiler provides a large and rich set that other implementations may well emulate.

A conforming implementation might conditionally define other macros (see Conditionally Defined Standard Macros).

Conditionally Defined Standard Macros
Standard C permits, but does not require, the following macros to also be predefined:

<ul> <li>

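__STDC_ANALYZABLE__ C11

</li> <li>

__STDC_IEC_559__ C99

</li> <li>

__STDC_IEC_559_COMPLEX__ C99 – An implementation that defines __STDC_NO_COMPLEX__ must not also define __STDC_IEC_559_COMPLEX__

</li> <li>

__STDC_ISO_10646__ C99 – Also defined by Standard C++

</li> <li>

__STDC_LIB_EXT1__ C11

</li> <li>

__STDC_MB_MIGHT_NEQ_WC__ C11

</li> <li>

__STDC_NO_ATOMICS__ C11

</li> <li>

__STDC_NO_COMPLEX__ C11

</li> <li>

__STDC_NO_THREADS__ C11

</li> <li>

__STDC_NO_VLA__ C11

</li> <li>

__STDC_UTF_16__ C11

</li> <li>

__STDC_UTF_32__ C11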

</li></ul>

Macro Definition Limit
The maximum number of entries that can fit in an implementation's preprocessor symbol table may vary considerably as can the amount of total string space available for macro definitions.

C89 required that at least 1024 (4095 in C99 and later) macro identifiers be able to be simultaneously defined in a source file (including all included headers). While this guarantee may allow that many macros, a conforming implementation might require that each macro definition be restricted in length. It certainly does not guarantee that many macro definitions of unlimited length and complexity.

K&amp;R makes no statement about the limit on the number or size of concurrent macro definitions.

Recommendation: If you expect to have a large number (greater than a few hundred) of concurrent macro definitions, write a program that can generate test headers containing macros of arbitrary number and complexity, to see what each of your implementations can handle. There is also some incentive to include only those headers that need be included and to modularize headers such that they contain only related material. It is perfectly acceptable to have the same macro definition in multiple headers. For example, some implementers define  in several headers just so the whole of   need not be preprocessed for one macro name.

Stacking Macro Definitions
Some implementations allow the stacking of macros. That is, if a macro name is in scope and a macro of the same name is defined, the second definition hides the first. If the second definition is removed, the first definition is back in scope again. For example,

Standard C does not permit the stacking of macro definitions.

K&amp;R states that the use of #undef “causes the identifier's preprocessor definition to be forgotten,” presumably forgotten completely.

The Stringize Operator
This was a C89 invention.

C99 added support for empty macro arguments, which each result in the string "".

The order of evaluation of # and ## operators is unspecified.

The Token-Pasting Operator
This was a C89 invention. It allows a macro expansion to construct a token that can then be rescanned. For example, with a macro definition of

the macro call

generates the code

A common solution to this problem prior to Standard C follows:

Here, instead of the definition being  (because the comment is replaced with a space), the implementation made it , thus forming a new token that was then rescanned. This practice is not supported by either K&amp;R or Standard C. The Standard C approach to this is

where the spaces around the ## operator are optional.

Standard C specifies that in        , the order of evaluation is implementation-defined.

An interesting situation exists in the following example:

Here, the macro expands to text that looks perhaps as if it should generate a syntax error because 1000 is not an lvalue. However, Standard C resolves this with its “phases of translation” by requiring that the preprocessing tokens - and - retain their meaning when handed off to the compiler. That is, the two minus signs are not recognized as the autodecrement token, --, even though they are adjacent in the expanded text stream. A non-Standard implementation, however, might rescan the text producing a different token sequence by pasting the two - tokens together.

Recommendation: To avoid such macro definitions being misinterpreted, surround them with parentheses.

The order of evaluation of # and ## operators is unspecified.
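The sketch below illustrates both points from this section: token pasting with ##, and parenthesizing a negative-valued macro so that adjacent minus signs cannot be misread. PASTE, NEG_LIMIT, and the variable names are assumptions made for illustration.

```c
/* Hypothetical macros for illustration only. */
#define PASTE(a, b) a ## b     /* forms one new token from two */
#define NEG_LIMIT (-1000)      /* parenthesized, per the recommendation */

/* PASTE(value, 1) becomes the identifier value1, and so on. */
int paste_demo(void)
{
    int value1 = 10, value2 = 20;
    return PASTE(value, 1) + PASTE(value, 2);
}

/* Expands to -(-1000), so no implementation can see a -- token here. */
int negate_limit(void) { return -NEG_LIMIT; }
```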

Redefining Keywords
Some implementations (including Standard C) permit C language keywords to be redefined. For example,

Recommendation: Do not gratuitously redefine language keywords.

The #undef Directive
The #undef directive can be used to remove a library macro to get access to a real function. If a macro version does not exist, Standard C requires the #undef to be ignored because a nonexistent macro can be the subject of an #undef without error.

Refer to Stacking Macro Definitions for a discussion of the use of #undef in stacked macro implementations.

Standard C does not permit the predefined standard macros (Predefined Standard Macros) to be the subject of #undef.

Conditional Inclusion
This capability is one of the most powerful parts of a C environment available for writing code that is to run on different target systems.

Recommendation: Make as much use as possible of conditional inclusion directives. This is made easier if you have, or you establish, a meaningful set of macros that distinguish one target environment from another. See  and   for details of host characteristics.

Arithmetic
The target of an #if directive is a constant expression that is tested against the value 0.

Some implementations allow the sizeof operator to be used in the constant expression as follows:

Strictly speaking, the preprocessor is a macro processor and string substitution program and need not have any knowledge about data types or the C language. Remember that sizeof is a C language compile-time operator, and at this stage, we are preprocessing, not compiling.

K&amp;R uses the same definition of constant expression for the preprocessor as it does for the language, thus implying that sizeof is permitted here. No mention is made of the use of casts in constant expressions (even in the language).

Standard C requires that the constant expression not contain a cast or an enumeration constant. With Standard C, whether the sizeof operator is supported in this context is implementation-defined. That is, while it is permitted, it is not guaranteed. Note that if an enumeration constant were present, it would be treated as an unknown macro and, as such, would default to a value of 0.

Recommendation: Do not use sizeof, casts, or enumeration constants in conditional constant expressions. To get around the inability to use sizeof, you may be able to determine certain attributes of your environment by using the header &lt;limits.h&gt;.

C89 states that, “… the controlling constant expression which is evaluated according to the rules of using arithmetic that has at least the ranges specified in Numerical Limits, except that int and unsigned int act as if they have the same representation as, respectively, long and unsigned long.”

C99 changed this to, “… the controlling constant expression, which is evaluated according to the rules of 6.6, except that all signed integer types and all unsigned integer types act as if they have the same representation as, respectively, the types intmax_t and uintmax_t defined in the header &lt;stdint.h&gt;.”

Floating-point constants are not permitted.

Recommendation: Do not rely on underflow or overflow because arithmetic properties vary widely on ones- and twos-complement and packed-decimal machines. Do not use the right-shift operator if signed operands are present because the result is implementation-defined when the sign bit is set.

A character constant can legitimately be part of a constant expression (where it is treated as an integer). Character constants can contain any arbitrary bit pattern (by using octal or hexadecimal escape sequences). Some implementations support character constants whose value is negative (for example, constants that have their high bit set).

Standard C states that whether or not a single-character character constant may have a negative value is implementation-defined. K&amp;R makes no statement.

Some implementations support multi-character constants, as does Standard C.

Recommendation: Do not use character constants whose value may be negative. Also, because the order and meaning of characters in multi-character constants is implementation-defined, do not use them in #if constant expressions.

In Standard C, if the constant expression contains a macro name that is not currently defined, the macro is treated as if it were defined with the value 0. The macro name is only interpreted that way; it does not actually become defined with that value.

K&amp;R makes no provision for this case.

Recommendation: Do not use the fact that undefined macros evaluate to 0 in constant expressions. If a macro definition is omitted, either from a header or from the command line at compile-time, then using this default rule results in its being erroneously interpreted as being defined with the value 0. It is not always practical to test if a macro is defined first before using it. However, for macros expected to be defined on the command line, it is worth the check because it is very easy to omit the macro definition if you are typing the compile command line manually. To further help avoid such problems use command procedures or scripts to compile code, particularly when numerous and lengthy include paths and macros are present on the command line.
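One way to follow this recommendation is to test explicitly whether a command-line macro is defined before using its value. In this sketch, TARGET_WORD_BITS, the default of 32, and the word_t type are all assumptions made for illustration. A stricter variant would issue a diagnostic instead of supplying a default.

```c
/* TARGET_WORD_BITS is a hypothetical macro normally supplied on the
   compile command line. */
#ifndef TARGET_WORD_BITS
#define TARGET_WORD_BITS 32   /* explicit default instead of a silent 0 */
#endif

/* The macro is now known to be defined, so this test is meaningful. */
#if TARGET_WORD_BITS >= 32
typedef int word_t;
#else
typedef long word_t;
#endif

int target_word_bits(void) { return TARGET_WORD_BITS; }
```

Without the first test, omitting the macro from the command line would silently select the second typedef.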

It is possible for the constant expression to produce an error, for example, if division by 0 is encountered. (This is possible if a macro name used as a denominator has not been defined and defaults to 0.) Some implementations may flag this as an error, while others won't. Some may continue processing, assuming the value of the whole expression is 0.

Recommendation: Do not assume your implementation will generate an error if it determines the #if constant expression contains a mathematical error.

K&amp;R does not include the unary ! operator in the operators permitted within constant expressions. This is generally considered to be either an oversight or a typographical error.

The defined Operator
Sometimes it is necessary to have nested conditional inclusion constructs such as

This is supported by both K&amp;R and Standard C. Standard C (and some implementations prior to C89) provides the defined preprocessor unary operator to make this construct more elegant. For example,

Standard C essentially reserves the identifier defined; it may not be used elsewhere as a macro name.

Recommendation: Do not use the defined operator unless all your environments support it.

The #elif Directive
The following cumbersome construct is also commonly used in writing portable code. It is supported by K&amp;R and Standard C.

The #elif directive simplifies nested #if directives greatly as follows.

Recommendation: Do not use the #elif directive unless all your environments support it.
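A minimal sketch combining the defined operator with #elif follows. The macros TARGET_VAX, TARGET_VMS, and TARGET_PDP11 are hypothetical, and the sketch assumes none of them is defined elsewhere.

```c
/* Hypothetical target-selection macros. */
#if defined(TARGET_VAX) && defined(TARGET_VMS)
#define TARGET_NAME "vax-vms"
#elif defined(TARGET_PDP11)
#define TARGET_NAME "pdp11"
#else
#define TARGET_NAME "generic"
#endif

/* Returns the name selected at preprocessing time. */
const char *target_name(void) { return TARGET_NAME; }
```

The same selection written with nested #ifdef and #else directives would need one extra level of nesting for each alternative.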

Nested Conditional Directives
Standard C guarantees at least eight levels of nesting.

K&amp;R states that these directives may be nested but gives no guaranteed minimum.

Recommendation: Use no more than two or three levels of nesting with conditional directives unless all of your implementations permit more.

Line Control
The syntax of the #line directive is (ultimately) one of the following:

where the line number and filename are used to update the __LINE__ and __FILE__ predefined macros, respectively.

Standard C allows either a macro name or string literal in place of the filename. It also permits a macro in place of the line number, provided its value is a decimal-digit sequence (in which any leading zero is redundant and does not mean “octal”). In fact, any preprocessing token may follow #line provided that after macro expansion, one of the two forms is present.

Implementations differ in the value of __LINE__ if it is used in an item (preprocessor directive or macro invocation) that spans more than one physical line.
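As a sketch of the effect of line control, the directive below renumbers the lines that follow it and changes the presumed file name. The line number 500 and the name generated.c are arbitrary values chosen for illustration.

```c
/* The source line immediately after the directive is numbered 500. */
#line 500 "generated.c"
int line_after(void) { return __LINE__; }
const char *file_after(void) { return __FILE__; }
```

Each function is kept on a single physical line so that the result does not depend on how an implementation numbers multi-line items.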

The Null Directive
Standard C permits a null directive of the form

This directive has no effect and is typically found only in machine-generated code. While it has existed in implementations for many years, it was not defined in K&amp;R.

The #pragma Directive
The #pragma directive was a C89 invention. The intent of this directive is to provide a mechanism for implementations to extend the preprocessor's syntax. This is possible because the preprocessor ignores any pragma it does not recognize. The syntax and semantics of a #pragma directive are implementation-defined, although the general format is

Possible uses of pragmas are to control compilation listing pagination and line format, to enable and disable optimization, and to activate and deactivate lint-like checking. Implementers can invent pragmas for whatever purpose they desire.

A pragma directive of the form

is reserved for use by Standard C, such as the pragmas FP_CONTRACT, FENV_ACCESS, and CX_LIMITED_RANGE (all added by C99).

Pragma operator
C99 added this unary, preprocessor-only operator, which has the following form:

The #error Directive
This is a C89 invention. Its format is

and it causes the implementation to generate a diagnostic message that includes the token sequence specified.

One possible use is to report on macros you expected to be defined, but which were found not to be. For example, you are porting code containing variable-length arrays (or threading), but the conditionally defined macro (Conditionally Defined Standard Macros) __STDC_NO_VLA__ (or __STDC_NO_THREADS__) is defined.
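The variable-length array case might be handled as in the following sketch. The message text and the function sum_of_squares are assumptions made for illustration, and the code assumes a C99 or later compiler.

```c
#include <stddef.h>

/* Stop translation early if the implementation says it lacks VLAs. */
#if defined(__STDC_NO_VLA__)
#error "This code requires variable-length arrays"
#endif

/* Sums the squares 1*1 through n*n, using a variable-length array as
   scratch space. n must be greater than zero. */
long sum_of_squares(size_t n)
{
    long squares[n];
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        squares[i] = (long)((i + 1) * (i + 1));
        total += squares[i];
    }
    return total;
}
```

Placing the test near the top of the file means the diagnostic appears before any less obvious error caused by the missing feature.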

Non-Standard Directives
Some implementations accept other preprocessor directives. As these extensions typically relate to the implementation's specific environment, they have little or no utility in other environments. Therefore, they must be identified in code that is to be ported and implemented in some other way, if at all.

= Library Introduction =

Definition of Terms
With K&amp;R, a character had type char and a string was an array of char. C89 introduced the notion of multibyte strings and shift sequences, along with wide characters (having type wchar_t) and wide strings (having type wchar_t *). The C89 library also included functions to process all of these, and subsequent editions of the standard added more headers and functions.

Prior to C89, the C library operated in a so-called “USA-English” mode, in which, for example, the decimal point used by printf was a period. C89 introduced the notion of a locale such that the traditional C environment up until that time is defined by the "C" locale; it also defined the header &lt;locale.h&gt;. The behavior of some Standard C library functions is affected by the current locale; that is, they are locale-specific.

Required Contents
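C89 defined the following headers: &lt;assert.h&gt;, &lt;ctype.h&gt;, &lt;errno.h&gt;, &lt;float.h&gt;, &lt;limits.h&gt;, &lt;locale.h&gt;, &lt;math.h&gt;, &lt;setjmp.h&gt;, &lt;signal.h&gt;, &lt;stdarg.h&gt;, &lt;stddef.h&gt;, &lt;stdio.h&gt;, &lt;stdlib.h&gt;, &lt;string.h&gt;, and &lt;time.h&gt;.

C95 added &lt;iso646.h&gt;, &lt;wchar.h&gt;, and &lt;wctype.h&gt;.

C99 added &lt;complex.h&gt;, &lt;fenv.h&gt;, &lt;inttypes.h&gt;, &lt;stdbool.h&gt;, &lt;stdint.h&gt;, and &lt;tgmath.h&gt;.

C11 added &lt;stdalign.h&gt;, &lt;stdatomic.h&gt;, &lt;stdnoreturn.h&gt;, &lt;threads.h&gt;, and &lt;uchar.h&gt;.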

C17 added no new headers.

These headers are defined as having lowercase names and must be correctly located by a conforming implementation using the above spellings. Although some file systems support mixed-case file names, you should not spell the standard header names in any other way than defined by the standard.

Each header is self-contained. That is, it contains all the declarations and definitions needed to invoke the routines declared within it. That said, a header does not necessarily contain all the macro definitions whose value can be returned by its functions. For example,  in   could return a value of , and   may be stored in  ; yet these macros are not defined in. To use them, both  and   must be included as well.

In order to be self-contained, several headers define the same names (such as  and  ).

All functions in the Standard C library are declared using function prototypes.

Standard headers may be included in any order and multiple times in the same scope without producing ill effects. The one exception is &lt;assert.h&gt;, which if included multiple times, can behave differently depending on the existence of the macro NDEBUG.

To be strictly conforming, Standard C prohibits a program from including a standard header from inside an external declaration or definition. This means you should not include a standard header from within a function because a function is an external definition.

Many of the prototypes for the standard library functions contain keywords and derived types invented or adopted by C89. These include,  ,  , and. Where these are applied to functions that have existed for a number of years, they remain compatible with calls to those functions in pre-C89 times.

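Standard C requires that a hosted C implementation support all the standard headers defined for that edition of the standard. In C89, a freestanding implementation needed to provide only &lt;float.h&gt;, &lt;limits.h&gt;, &lt;stdarg.h&gt;, and &lt;stddef.h&gt;. C95 added &lt;iso646.h&gt;. C99 added &lt;stdbool.h&gt; and &lt;stdint.h&gt;. C11 added &lt;stdalign.h&gt; and &lt;stdnoreturn.h&gt;. C17 added no new requirements.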

Optional Contents
C11 added an annex called “Bounds-checking interfaces” that “specifies a series of optional extensions that can be useful in the mitigation of security vulnerabilities in programs, and comprise new functions, macros, and types declared or defined in existing standard headers.”

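If an implementation defines the macro __STDC_LIB_EXT1__, it must provide all the optional extensions from that annex. These extensions apply to the following headers: &lt;errno.h&gt;, &lt;stddef.h&gt;, &lt;stdint.h&gt;, &lt;stdio.h&gt;, &lt;stdlib.h&gt;, &lt;string.h&gt;, &lt;time.h&gt;, and &lt;wchar.h&gt;.

An implementation that defines the macro __STDC_LIB_EXT1__ allows the associated library extensions to be excluded by having a program define __STDC_WANT_LIB_EXT1__ to be 0 prior to including a standard header containing such extensions. If instead __STDC_WANT_LIB_EXT1__ is defined to be 1, those extensions are enabled.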

Reserved Identifiers
All external identifiers declared in the standard headers are reserved, whether or not their associated header is referenced. That is, don't presume that just because you never include  that you can safely define your own external function called. Note that macros and  names are not included in this reservation because they are not external names.

External identifiers that begin with an underscore are reserved. All other library identifiers should begin with two underscores or an underscore followed by an uppercase letter.

Use of Library Functions
Not all the Standard library routines validate their input arguments. In such cases, if you pass in an invalid argument, the behavior is undefined.

An implementation is permitted to implement any required routine as a macro, defined in the appropriate header, provided that macro expands “safely.” That is, no ill effects should be observed if arguments with side-effects are used. If you include a standard header, you should not explicitly declare any routine you plan to use from that header because any macro version of that routine defined in the header will cause your declaration to be expanded (probably incorrectly or producing syntax errors).

You should take care when using the address of a library routine because it may currently be defined as a macro. Therefore, you should #undef that name first or reference it using (name) instead of just name. Note that it is possible to call both a macro and a function version of the same routine in the same scope without first having to use #undef.

When using a library routine, it is strongly suggested you include the appropriate header. If you choose not to, you should explicitly declare the function yourself using prototype notation, especially for routines such as printf, that take variable argument lists. The reason for doing this is that the compiler may pass arguments using a different mechanism when prototypes are used than when they are not. For example, with the correct prototype in scope, the compiler knows exactly how many arguments are expected and their types. And for fixed length argument lists, it may choose to pass the first two or three (or so) arguments in registers instead of on the stack. Therefore, if you compile your code without prototypes and the library is compiled with them, the linkage may fail.
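A hedged sketch of the parenthesized form follows, using isdigit as the library routine. The helper names and the char_test_fn typedef are assumptions made for illustration.

```c
#include <ctype.h>

typedef int (*char_test_fn)(int);   /* hypothetical helper type */

/* Parenthesizing the name prevents any function-like macro of the same
   name from being expanded, so the real function is called. */
int is_digit_via_function(int c) { return (isdigit)(c) != 0; }

/* The same form yields the function's address. */
char_test_fn digit_tester(void) { return (isdigit); }

/* Counts the characters in s for which test returns nonzero. */
int count_matching(const char *s, char_test_fn test)
{
    int n = 0;
    for (; *s != '\0'; ++s)
        if (test((unsigned char)*s))
            ++n;
    return n;
}
```

The parenthesized form leaves any macro version in place for later calls that want it, which #undef would not.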

Non-Standard Headers
The de facto Standard C library originally provided with UNIX systems contains both general-purpose and operating system-specific routines. Almost all the general-purpose ones were adopted by C89, while most of the operating system-specific ones were picked up by the IEEE POSIX Committee. A few were wanted by both or neither groups, and were divided up amicably between the two groups. It should be noted that a (very) few macros and functions are defined and declared differently by both groups. In particular, their versions of  are not identical. However, it is the intent of both standards groups that a C program be able to be both ISO C- and POSIX-conforming at the same time.

Numerous commonly provided headers are not included in the Standard C library. These include,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  , and.

Other headers whose names were not adopted by C89 have all or some of their capabilities made available through various Standard C headers. These include,  , and  , which were reborn in, or combined with,  ,  , and  , respectively.

= &lt;assert.h&gt; – Diagnostics =

C11 added support for static assertions (Static Assertions) part of which involved adding to this header a macro called static_assert.

C++ Consideration: The equivalent Standard C++ header is &lt;cassert&gt;.

The assert Macro
Standard C requires that assert be implemented as a macro, not as an actual function. If, however, that macro definition is removed with #undef to access an actual function, the behavior is undefined.

The format of the message output is implementation-defined. However, Standard C intends that the expression used as the argument to assert be output in its text form (as it exists in the source code) along with the source filename and line number (represented by __FILE__ and __LINE__, respectively) of the invocation of the failing assert. Specifically, any macro names in the expression should appear in the message as written in the source, not in their expanded form.

C89 required the argument passed to assert have type int. However, C99 broadened that to be any scalar type.

As assert is a macro, take care not to give it expressions that have side-effects—you cannot rely on the macro evaluating your expression only once.
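One way to keep side effects out of an assertion is to perform the operation first and assert only on its result, as in this sketch. The functions next_id and checked_next_id are hypothetical. Keeping the call outside the macro also means it still happens when assertions are disabled by defining NDEBUG.

```c
#include <assert.h>

static int calls = 0;

/* Hypothetical function with a side effect: it advances a counter. */
static int next_id(void) { return ++calls; }

int checked_next_id(void)
{
    int id = next_id();   /* side effect happens exactly once */
    assert(id > 0);       /* the assertion only inspects the result */
    return id;
}
```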

= &lt;complex.h&gt; – Complex Arithmetic =

C99 added this header. C11 made support for complex types and operations optional.

The absence of the optional predefined macro __STDC_NO_COMPLEX__ indicates support for complex types and their associated arithmetic. Furthermore, the existence of the optional predefined macro __STDC_IEC_559_COMPLEX__ indicates that complex support conforms to IEC 60559, as described in an annex of the C Standard.

The following function names are reserved for possible future use by Standard C in this header: cerf, cerfc, cexp2, cexpm1, clog10, clog1p, clog2, clgamma, ctgamma, and those names with a suffix of f and l.

C++ Consideration: The equivalent Standard C++ header is &lt;ccomplex&gt;. Note that C++17 deprecated this header.

= &lt;ctype.h&gt; – Character Handling =

In Standard C, all the functions made available via &lt;ctype.h&gt; take an int argument. However, the value passed must either be representable in an unsigned char or be the macro EOF. If an argument has any other value, the behavior is undefined.

C89 introduced the notion of a locale. By default, a C program runs in the "C" locale unless the setlocale function has been called (or the implementation's normal operating default locale is other than "C"). In the "C" locale, the &lt;ctype.h&gt; functions have the meaning they had prior to C89. When a locale other than "C" is selected, the set of characters qualifying for a particular character type test may be extended to include other implementation-defined characters. For example, implementations running in western Europe will likely include characters with diacritical marks, such as the umlaut, caret, and tilde. Therefore, whether ä, for example, tests true with isalpha is implementation-defined, based on the current locale.

Many implementations use a character representation that has more bits than are needed to represent the host character set; for example, 8-bit character systems supporting 7-bit ASCII. However, such implementations often support an extended character set using the otherwise unused bit(s). Also, a C programmer is at liberty to treat a char as a small integer, storing into it any bit pattern that will fit.

When a char contains a bit-pattern that represents something other than the machine's native character set, it should not be passed to a &lt;ctype.h&gt; function, unless permitted by your current locale. Even then, the results are implementation-defined. Also, you should determine whether a char is signed or not because an 8-bit char containing 0x80, for example, might be treated quite differently when it is signed versus unsigned.

Standard C requires that all &lt;ctype.h&gt; functions actually be implemented as functions. They may also be implemented as macros provided it is guaranteed they are safe macros. That is, their arguments are only evaluated once. Standard C allows #undef on any &lt;ctype.h&gt; name to get at the corresponding function version.

Recommendation: The actual value returned by the character testing functions when the argument tests true is implementation-defined. Therefore, you should use a logical, rather than an arithmetic, test on such values.

Standard C reserves all function names beginning with is or to followed by a lowercase letter (followed by any other identifier characters) for future additions to the run-time library.
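The sketch below applies both points: the argument is converted to unsigned char before the call, and the result is tested only for being nonzero. The function count_letters is a hypothetical helper, and the sketch assumes the default "C" locale.

```c
#include <ctype.h>

/* Counts the alphabetic characters in s. */
int count_letters(const char *s)
{
    int n = 0;
    for (; *s != '\0'; ++s) {
        /* The cast keeps a negative char value from reaching isalpha.
           The result is used as a truth value, never compared with 1. */
        if (isalpha((unsigned char)*s))
            ++n;
    }
    return n;
}
```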

The following functions have locale-specific behavior:,  ,  ,  ,  ,  ,  ,  ,  , and.

Prior to C89, the following functions were widely available via this header:,  ,  , and. None of these are supported by Standard C.

C++ Consideration: The equivalent Standard C++ header is &lt;cctype&gt;.

The isalpha Function
Use isalpha rather than something like the following:

because in some character sets (EBCDIC, for example) the upper- and lowercase letter groups do not occupy a contiguous range of internal values.

The isblank Function
C99 added this function.

The islower Function
See isalpha.

The isupper Function
See isalpha.

The tolower Function
In non-"C" locales, the mapping from upper- to lowercase may not be one-for-one. For example, an uppercase letter might be represented as two lowercase letters taken together, or, perhaps, it may not even have a lowercase equivalent. Likewise, for toupper.

= &lt;errno.h&gt; – Errors =

Historically,  was declared as an     variable; however, Standard C requires that   be a macro. (The macro could, however, expand to a call to a function of the same name.) Specifically,  is a macro that expands to a modifiable lvalue of type   . As such,  could be defined as something like , where the implementation-supplied function   returns a pointer to. ing  to try to get at the underlying object results in undefined behavior.

Various standard library functions are documented as setting  to a nonzero value when certain errors are detected. Standard C requires that this value be positive. It also states that no library routine is required to clear  (that is, giving it the value 0), and, certainly, you should never rely on a library routine doing so.
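
Because no library routine is obliged to clear the error indicator, the usual pattern is to set it to zero yourself immediately before the call whose outcome you want to inspect. Here is a minimal sketch using strtol, which Standard C documents as reporting out-of-range results with ERANGE; the helper parse_long and its return codes are assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert text to long (hypothetical helper).
   Returns 0 on success, -1 if no digits were found, and -2 if the
   value is out of range for long. */
int parse_long(const char *text, long *result)
{
    char *end;
    long value;

    errno = 0;                      /* no library routine clears it for us */
    value = strtol(text, &end, 10);

    if (end == text) {
        return -1;                  /* nothing was converted */
    }
    if (errno == ERANGE) {
        return -2;                  /* strtol reported overflow or underflow */
    }
    *result = value;
    return 0;
}
```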

Historically, macros that define valid values for  have been named starting with. And although a set of names evolved for various systems, there was wide divergence on the spelling and meaning of some of those names. As a result, C89 defined only two macros:  and. C99 added. Additional macro definitions, beginning with  and a digit or an uppercase letter may be specified by a standards-conforming implementation.

Here are some common  extension macros:

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;fenv.h&gt; – Floating-Point Environment =

C99 added this header.

Standard C reserves all macro names beginning with  followed by an uppercase letter for future additions to this header.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;float.h&gt; – Characteristics of floating types =

This header defines the floating-point characteristics of the target system via a series of macros whose values are largely implementation-defined.

As of C17, almost all the macros were defined in C89. The exceptions are  and , added in C99; and  ,  ,  ,  ,  ,  ,  ,  , and  , added in C11.

Although many systems use IEEE-754 format for floating-point types, when C89 was being developed, there were three other formats in common use, all of which were accommodated by C89.

Standard C defines the values  through   for the macro. All other values specify implementation-defined rounding behavior.

Standard C defines the values  through 2 for the macro. All other negative values for  characterize implementation-defined behavior. See Floating Constants regarding the possible impact of this macro’s value on floating constants.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;inttypes.h&gt; – Format Conversion of Integer Types =

C99 added this header.

Standard C reserves all macro names beginning with  or   followed by a lowercase letter or   for future additions to this header.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;iso646.h&gt; – Alternative Spellings =

C95 added this header.

C++ Consideration: The equivalent Standard C++ header is. The macros defined by Standard C in this header are keywords in Standard C++.

= &lt;limits.h&gt; – Numerical Limits =

This header defines the integer characteristics of the target system via a series of macros whose values are largely implementation-defined.
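
Since the limits differ from one target to the next, code that might approach them is better written in terms of the macros than of literal values. The following sketch, with the invented helper checked_add, uses INT_MAX and INT_MIN to reject an addition that would overflow on whatever implementation compiles it.

```c
#include <limits.h>

/* Add two ints without overflowing (hypothetical helper).
   Returns 1 and stores the sum on success; returns 0 if the sum
   would fall outside INT_MIN..INT_MAX on this implementation. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 0;
    }
    *sum = a + b;
    return 1;
}
```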

Almost all the macros were defined in C89. The exceptions are,  , and  , which were added in C99.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;locale.h&gt; – Localization =

Almost all the members of the type    were defined by C89. The exceptions are,  ,  ,  ,  , and  , which were added in C99. Implementations may add other members.

Standard C has reserved the space of names beginning with  followed by an uppercase letter for use by implementations, so they may add extra locale subcategory macros.

The locales defined by Standard C are "C" and "", the latter being the locale-specific native environment. The strings used to identify all other locales are implementation-defined.

C++ Consideration: The equivalent Standard C++ header is.

The Function
If you modify the contents of the string returned by, the behavior is undefined.

= &lt;math.h&gt; – Mathematics =

C99 added the types  and  ; the macros ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  , and  ; some function-like macros, and many functions. C99 also added the  pragma.

The macros  and   returned by some math function require.

In C89, the math function names created by adding a suffix of  or   were reserved for implementations of   and     versions, respectively. However, a conforming implementation was required to support only the  set. Starting with C99, all three versions must be provided.

In the case of the float set, these functions must be called in the presence of an appropriate prototype; otherwise, float arguments will be widened to double. (Note, though, that specifying float in a prototype does not necessarily force such widening to be disabled; this aspect of prototypes is implementation-defined. However, it is necessary when supporting the long double set.)

With the introduction of  in C99,   need not be set in certain circumstances.

A domain error occurs if an input argument is outside the domain over which the mathematical function is defined. In this case, an implementation-defined value is returned, and, prior to C99,  is set to the macro.

A range error occurs if the result of the function cannot be represented as a. If the result overflows, the function returns the value of, with the same sign as the correct value would have. Prior to C99,  is set to the macro. If the result underflows, the function returns 0 and  may or may not be set to , as the implementation defines.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;setjmp.h&gt; – Non-Local Jumps =

Standard C requires that  be an array of suitable size to store the “current program context,” whatever that may be. C99 added that this context, “does not include the state of the floating-point status flags, of open files, or of any other component of the abstract machine.”

C++ Consideration: The equivalent Standard C++ header is.

The Macro
Standard C states, “It is unspecified whether setjmp is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual function, or a program defines an external identifier with the name setjmp, the behavior is undefined.”

If  is invoked outside the contexts defined by Standard C, the behavior is undefined.

The Function
If  attempts to restore to a context that was never saved by , the result is undefined.

If  attempts to restore to a context and the parent function, which called   to save that context initially, has terminated, the results are undefined.

The behavior is undefined if  is invoked from a nested signal handler. Do not invoke  from an exit handler, such as those registered by the   function.
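
To make the safe pattern concrete, here is a minimal sketch in which the context is saved by a function that is still active when the jump occurs, and in which setjmp is the entire controlling expression of a switch, one of the contexts Standard C permits. The function names and the error code 7 are arbitrary choices for this example.

```c
#include <setjmp.h>

static jmp_buf recovery_point;      /* context saved by setjmp */

/* Stands in for code several calls deep that detects a problem. */
static void fail_deep_inside(int code)
{
    longjmp(recovery_point, code);  /* the caller below is still active */
}

/* Returns 0 if the work completed, or the code passed to longjmp. */
int run_with_recovery(int should_fail)
{
    switch (setjmp(recovery_point)) {
    case 0:                         /* direct return: context just saved */
        if (should_fail) {
            fail_deep_inside(7);
        }
        return 0;
    case 7:                         /* arrived here via longjmp */
        return 7;
    default:
        return -1;
    }
}
```

Note that the value of setjmp is never assigned to a variable; doing so would fall outside the permitted contexts.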

= &lt;signal.h&gt; – Signal Handling =

C89 added type.

Standard C reserves names of the form  and , where   represents the trailing part of an identifier that begins with an uppercase letter, for other kinds of signals. The complete set of signals available in a given implementation, their semantics, and their default handling is implementation-defined.

C++ Consideration: The equivalent Standard C++ header is.

The Function
returns a value equal to  if it cannot perform the requested operation. Prior to C89,  returned. Do not test explicitly for a  return value—use the macro   instead. Always test the return value from —do not assume it did exactly as you requested.

Usually, when a signal is detected and handed off to a handler, that signal will be handled in the “default” manner when next it occurs. That is, you must explicitly call signal to reset the signal mechanism from within the signal handler if you wish to continue to trap and handle the signal. (This is required by Standard C except in the case of SIGILL, where it is implementation-defined as to whether the signal is reset automatically.)
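
The two pieces of advice above, checking the value returned when a handler is installed and re-installing the handler from inside itself, can be sketched as follows. The handler, the counter, and the helper names are all invented for this example, and SIGINT is used only because every implementation defines it.

```c
#include <signal.h>

static volatile sig_atomic_t interrupt_count = 0;

/* Handler: re-install itself first, since the disposition may have
   been reset to the default when the signal was delivered. */
static void on_interrupt(int sig)
{
    signal(sig, on_interrupt);
    ++interrupt_count;
}

/* Returns 0 if the handler was installed, -1 otherwise. */
int install_interrupt_handler(void)
{
    if (signal(SIGINT, on_interrupt) == SIG_ERR) {
        return -1;                  /* the request was refused */
    }
    return 0;
}

/* Number of interrupts handled so far. */
int interrupts_seen(void)
{
    return (int)interrupt_count;
}
```

With this in place, two consecutive calls to raise(SIGINT) should both reach the handler, whether or not the implementation resets the disposition on delivery.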

If a call to  from within a handler returns , the value of   is indeterminate. In other circumstances,  is returned, and   contains a positive value whose possible values are implementation-defined.

During program startup, an implementation is at liberty to specify that selected signals be ignored or handled by default means as appropriate. That is, the initial state of signal handling is implementation-defined.

Standard C makes no statement about the behavior when a second signal for the same handler occurs before the first is processed.

= &lt;stdalign.h&gt; – Alignment =

C11 added this header.

C++ Consideration: The equivalent Standard C++ header is. Note that C++17 deprecated this header.

= &lt;stdarg.h&gt; – Variable Arguments =

This header was a C89 invention modeled closely on the UNIX  header. As Standard C uses a slightly different approach, the new header  was defined rather than retaining   with a changed meaning.

C++ Consideration: The equivalent Standard C++ header is.

The Macro
Standard C requires that  be a macro. If it is the subject of, and an actual function of the same name is used instead, the behavior is undefined. It is unspecified whether  is a macro or a function.

The Macro
C99 added this facility.

Standard C states, “It is unspecified whether va_copy is a macro or identifier declared with external linkage. If a macro definition is suppressed in order to access an actual function, or a program defines an external identifier with the same name, the behavior is undefined.”

The Macro
Standard C states, “It is unspecified whether va_end is a macro or identifier declared with external linkage. If a macro definition is suppressed in order to access an actual function, or a program defines an external identifier with the same name, the behavior is undefined.”

The Macro
Standard C requires that  be a macro. If it is the subject of, and an actual function of the same name is used instead, the behavior is undefined.

If  is used with the second argument of , or that argument has type function or array, the behavior is undefined.
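
For reference, a well-behaved variable-argument function looks something like the following sketch. The function sum_ints is hypothetical; what matters is that the last named parameter is an ordinary int (not declared register, and not of function or array type) and that every va_start is paired with a va_end.

```c
#include <stdarg.h>

/* Sum `count` int arguments (hypothetical helper). */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);          /* count is the last named parameter */
    for (i = 0; i < count; ++i) {
        total += va_arg(args, int); /* each variable argument is read as int */
    }
    va_end(args);
    return total;
}
```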

= &lt;stdatomic.h&gt; – Atomics =

C11 added this header.

Standard C reserves the following names for future addition to this header:

<ul> <li>

Macro names beginning with  followed by an uppercase letter

</li> <li>

Type names beginning with  or , followed by a lowercase letter

</li> <li>

For the  type, enumeration constants beginning with   followed by a lowercase letter

</li> <li>

Function names beginning with  followed by a lowercase letter

</li></ul>

C17 deprecated the use of the macro.

C++ Consideration: There is no equivalent header.

= &lt;stdbool.h&gt; – Boolean Type and Values =

C99 added the type specifier  and the corresponding header , which defines the type synonym   and the macros  ,  , and.

C++ Consideration: C++11 added, to emulate  ’s behavior. C++17 changed the header name to, as used by Standard C. However, note that C++17 deprecated this header.

How can we write code that uses a Boolean type and port it across multiple C compilers that do and don’t support this header, or to a C++ compiler? We never use the C99 type  and we don’t explicitly    ; we only ever use the names ,  , and. Here’s the relevant code to achieve this:
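
One possible arrangement is sketched below. It is an illustration only and may differ from the code originally shown here: application code would include a small wrapper like this instead of naming the standard header directly, and the fallback typedef for older compilers is an assumption of this sketch.

```c
/* Make the names bool, true, and false available everywhere.
   C++ has them built in; C99 and later get them from <stdbool.h>;
   older C compilers get a fallback based on int. */
#ifndef __cplusplus
#  if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#    include <stdbool.h>
#  else
typedef int bool;
#    define true  1
#    define false 0
#  endif
#endif

/* Example use: only the three portable names appear. */
bool is_even(int n)
{
    return (n % 2 == 0) ? true : false;
}
```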

C++ Consideration: The equivalent Standard C++ header is.

= &lt;stddef.h&gt; – Common Definitions =

C89 added this header as a repository for several miscellaneous macro definitions and types. The macros are NULL and offsetof, and the types are ptrdiff_t, size_t, and wchar_t. C11 added max_align_t. All but NULL were C89 inventions.

If the second argument to  is a bit-field, the behavior is undefined.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;stdint.h&gt; – Integer Types =

C99 added this header.

Standard C reserves the following names for future addition to this header:

<ul> <li>

Macro names beginning with  or , and ending with  ,  , or

</li> <li>

Type names beginning with  or , and ending with

</li></ul>

C++ Consideration: The equivalent Standard C++ header is.

= &lt;stdio.h&gt; – Input/Output =

Files and File Systems
Many aspects of file and directory systems are implementation-dependent. So much so that Standard C cannot even make a statement about the most basic thing, a filename. Just what filenames can and must an implementation support? And as for directory and device names, there is nothing close to a common approach. And while there are standard header names, they need not map directly to filenames of the same spelling.

Some implementations may permit filenames to contain wildcards. That is, the file specifier may refer to a group of files using a convention such as  to refer to all files with a type of. None of the standard I/O routines is required to support such a notion.

Numerous operating systems can limit the number of open files on a per-user basis. Note, too, that not all systems permit multiple versions of the same filename in the same directory, and this has consequences when you use fopen with "w" mode, for example.

Some file systems also place disk quotas on users such that an I/O operation may fail when a file grows too big—you may not know this until an output operation fails.

Back to the filename issue. After extensive investigation, the Standard C committee found that the format of a portable filename is up to six alphabetic characters followed by a period and none or one letter. And given that some file systems are case-sensitive, these alphabetic characters should all be the same case. However, rather than restrict yourself to filenames of the lowest common denominator, you can use conditional compilation directives to deal with platform-specific file systems.

The whole concept of filename redirection at the command-line level is also implementation-dependent. If possible, it means that  and , for example, may actually be dealing with devices other than the user's terminal. They could even be dealing with files. Note that  behaves slightly differently to   from , yet   could be reading from a file if   were redirected.

The details of file buffering, disk sector sizes, etc., are also implementation-dependent. However, Standard C requires an implementation to be able to handle text files with lines containing at least 254 characters, including the trailing new-line.

On some systems,,  , and   are special to the operating system and are maintained by it. On other systems, these may be established during program startup. Whether these files go against your maximum open file limit is implementation-dependent.

The macros,  ,  , and   expand to implementation-defined values.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

The Function
On many systems, the file is actually deleted. However, it may be that you are removing a synonym for a file's name, rather than deleting the file itself. In such cases, when the last synonym is being removed, the file is typically deleted.

If the file being removed is currently open, the behavior is implementation-defined. (In a shared file system, another program may be accessing the file you are removing.)

The Function
Standard C states that the old filename is removed (as if it had been the subject of a call to ). Presumably, this permits a filename synonym to be renamed as well. As  is removed, if   is currently open, the behavior is implementation-defined. (In a shared file system, another program may be accessing the file you are renaming.)

If a file with the new name already exists, the behavior is implementation-defined.

A file system with a hierarchical (or other) directory structure might not directly permit renaming of files across directories. In these cases, the rename might fail, or the file might actually be copied and the original removed. Standard C hints that if a file copy is needed,  could fail; however, it does not require it to.

The Function
If the program terminates abnormally, the temporary file might not be removed.

The location and attributes (directory name, file name, access permission, etc.) of the file created are implementation-defined.

The Function
If you call  more than   times, the behavior is implementation-defined.

has no way to communicate an error so if you give it a non- address that points to an area smaller than   characters, the behavior is undefined.

While the filename is guaranteed to be unique at the time  was called, a file by that name may have been created before you get a chance to use it. If this is likely to be a problem, use  instead. And then if you need to open the file in a mode other than, use   or   to change it.
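
When all you need is scratch storage, letting the library both name and open the file avoids the window between choosing a name and using it. The following sketch assumes the standard tmpfile function is acceptable for that purpose; the helper scratch_round_trip is invented for this example.

```c
#include <stdio.h>

/* Write `text` to an anonymous temporary file and read it back
   (hypothetical helper). Returns 0 on success, -1 on any failure. */
int scratch_round_trip(const char *text, char *out, size_t out_size)
{
    FILE *scratch = tmpfile();      /* created and opened in one step */
    int status = -1;

    if (scratch == NULL) {
        return -1;                  /* no temporary file could be created */
    }
    if (fputs(text, scratch) != EOF) {
        rewind(scratch);            /* reposition before switching to input */
        if (fgets(out, (int)out_size, scratch) != NULL) {
            status = 0;
        }
    }
    fclose(scratch);                /* the file is removed when closed */
    return status;
}
```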

The filename may include directory information. If it does, the name and attributes of the directory are implementation-defined.

The Function
If a program terminates abnormally, there is no guarantee that streams open for output will have their buffers flushed.

On some implementations it may not be possible to close an empty file successfully and have it retained by the file system—you might first have to write something to it.

The Function
If the stream was not open for output, or it was open for update with the immediately previous operation being other than output, the behavior is undefined. However, some implementations permit input streams to be flushed reliably.

If a program terminates abnormally, there is no guarantee that streams open for output will have their buffers flushed.

It is permissible to flush the “special” files  and. While Standard C states that flushing input files (including ) produces undefined behavior, some implementations permit it.

The Function
Some implementations may have difficulty seeking within text files; in which case, specifying mode  may also imply mode.

Some file systems permit only one version of a file by any given name; in which case, opening in  mode will cause that file to be overwritten. On other systems, a new version of the file may be created.

The set, and meaning, of mode characters following the sequences is implementation-defined. Other mode characters might be provided by your implementation to specify various file attributes.

C11 added exclusive mode .

Some file systems append trailing  characters to the end of a binary file when it is closed. Subsequently, when you open such files for append, you may be positioned beyond the end of the last character you wrote.

A file is opened with full buffering only if the implementation can determine that it is not an interactive device.

If  succeeds, it returns a   pointer to the opened stream. On failure, it returns. Note that an implementation may limit the number of currently open files— specifies the number permitted—in which case   will fail if you attempt to exceed this number. Standard C does not specify whether  is set.

The Function
If  succeeds, it returns the value of  ; otherwise, it returns. Standard C does not specify if  is set.

The Function
returns no value. The responsibility is on the programmer to make sure  points to an open file and that   is either   or a pointer to a sufficiently large buffer.

Standard C does not require an implementation to be able to implement each of these types of buffering. And so, an implementation is at liberty to treat one or more of these buffering types as equivalent. Therefore, there is no guarantee that setbuf will be able to honor your request, even though no error code can be returned.

The Function
mode may be one of the following: _IOFBF (fully buffered), _IOLBF (line buffered), or _IONBF (no buffering). Standard C requires that setvbuf accept these modes, although the underlying implementation need not be able to implement each of these types of buffering. And so, an implementation is at liberty to treat one or more of these buffering types as equivalent.

When the programmer supplies the buffer, its contents are indeterminate at any particular time. (Standard C does not actually require an implementation to use the programmer's buffer if one is supplied.) The user-supplied buffer must remain in existence as long as the stream is open so take care if you use a buffer of class.

returns zero on success and nonzero on failure. A failure could result from an invalid value for  or for some other reason. Standard C does not specify that  is set on error.

The size of buffers allocated by  is implementation-defined although some implementations of   use   to determine the size of the internal buffer used as well.
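
A minimal sketch of a buffering request follows. It uses a buffer with static storage duration so that the buffer cannot disappear while the stream is still open, and it passes the status back to the caller instead of assuming the request was honored. The helper name is invented, and the choice of line buffering and of BUFSIZ as the buffer size is arbitrary.

```c
#include <stdio.h>

static char stream_buffer[BUFSIZ];  /* static storage: outlives any one call */

/* Ask for line buffering on `stream` using our own buffer
   (hypothetical helper). Call it after the stream is opened and
   before any other operation on it. Returns 0 if the request was
   accepted, nonzero otherwise. */
int request_line_buffering(FILE *stream)
{
    return setvbuf(stream, stream_buffer, _IOLBF, sizeof stream_buffer);
}
```

Because only one buffer is defined here, the helper is suitable for a single stream at a time.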

The Function
Standard C defines the common output formatting behavior of the * family of functions under   with all other family member descriptions pointing there.

If there are insufficient arguments for the format, the behavior is undefined.

C89 added the conversion specifiers , , and . outputs the value of the  pointer using an implementation-defined format.

C99 added the conversion specifiers , , and  , and the length modifiers  ,  ,  ,  , and . It also added support for infinities and NaNs using an implementation-defined format.

If a conversion specification is invalid, the behavior is undefined. (Note that K&amp;R stated that any specification not recognized was treated as text and passed through to . For example,   produced  .) Standard C has reserved all unused lowercase conversion specifiers for its own use in future versions.

The behavior is undefined if any argument is, or points to, a union, a structure, or an array except for arrays with  and   pointers with.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
Standard C defines the common input formatting behavior of the * family of functions under   with all other family member descriptions pointing there.

If there are insufficient arguments for the format, the behavior is undefined.

C89 added the conversion specifiers , , and . expects an argument of type pointer to, in an implementation-defined format.

C99 added the length modifiers , ,  ,  , and . It also added support for infinities and NaNs.

If a conversion specification is invalid, the behavior is undefined. Standard C has reserved all unused lowercase conversion specifiers for its own use in future versions.

If an error occurs,  is returned. Standard C makes no mention of  being set.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
The output formatting issues mentioned in  also apply to this function.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

maintains an internal buffer into which it builds the formatted string, and this buffer has a finite length. Historically, this length has been implementation-defined and not always documented, and implementations have varied widely. Standard C requires that an implementation be able to handle any single conversion of at least 509 characters.
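
Where a C99 library is available, snprintf lets the caller state the size of the destination, so the formatted text can never run past the end of it. A minimal sketch, with the helper format_id and its "id=" prefix invented for this example:

```c
#include <stdio.h>

/* Format an identifier into `dest` without writing more than
   `dest_size` bytes (hypothetical helper). Returns the length the
   complete text would need, so a result >= dest_size signals
   truncation. */
int format_id(char *dest, size_t dest_size, long id)
{
    return snprintf(dest, dest_size, "id=%ld", id);
}
```

For example, formatting 1234567890 into an 8-byte array stores the truncated text "id=1234" and returns 13, the length of the untruncated result.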

The Function
The input formatting issues mentioned in  also apply to this function.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
C99 added this function.

The output formatting issues mentioned in  also apply to this function.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
The output formatting issues mentioned in  also apply to this function.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
The input formatting issues mentioned in  also apply to this function.

Calling this function without having the appropriate prototype in scope results in undefined behavior.

The Function
C99 added this function.

The output formatting issues mentioned in  also apply to this function.

The Function
C99 added this function.

The input formatting issues mentioned in  also apply to this function.

The Function
C99 added this function.

The output formatting issues mentioned in  also apply to this function.

The Function
The input formatting issues mentioned in  also apply to this function.

The Function
C99 added this function.

The output formatting issues mentioned in  also apply to this function.

The Function
C99 added this function.

The output formatting issues mentioned in  also apply to this function.

The Function
C99 added this function.

The input formatting issues mentioned in  also apply to this function.

The Function
C11 removed this function.

The Function
C99 deprecated the use of this function at the beginning of a binary file.

The Function
If an error occurs, the file position indicator's value is indeterminate.

If a partial field is read, its value is indeterminate.

Standard C makes no statement about the possible translation of CR/LF pairs to new-lines on input, although some implementations do so for text files.

The Function
If an error occurs, the file position indicator's value is indeterminate.

Standard C makes no statement about the possible translation of new-lines to CR/LF pairs on output, although some implementations do so for text files.

The Function
C89 added this function.

On failure, a nonzero value is returned, and  is set to an implementation-defined positive value.

The Function
C89 added this function.

On failure, a nonzero value is returned, and  is set to an implementation-defined positive value.

The Function
The contents and format of the message are implementation-defined.

= &lt;stdlib.h&gt; – General Utilities =

C89 defined this header.

C99 added the type.

The macros  and   are Standard C inventions and are used as the implementation-defined success and failure exit code values used with.

Standard C reserves all function names beginning with  followed by a lowercase letter for future addition to this header.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

Numeric Conversion Functions
Standard C does not require,   and   to set   if an error occurs. If an error does occur, the behavior is undefined.

The Function
C99 added this function.

The Function
The format of the floating-point number is locale-specific.

The Function
C99 added this function.

The format of the floating-point number is locale-specific.

The Function
The format of the integral value is locale-specific.

The Function
C99 added this function.

The format of the integral value is locale-specific.

The Function
C99 added this function.

The format of the floating-point number is locale-specific.

The Function
The format of the integral value is locale-specific.

The Function
C99 added this function.

The format of the integral value is locale-specific.

The Function
Standard C requires  to be at least 32767.

Memory Management Functions
is returned if the space requested cannot be allocated. NEVER, EVER assume an allocation request succeeds without checking for a  return value.

If a zero amount of space is requested, the behavior is implementation-defined, and either  or a unique pointer is returned.
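
Both points, always checking the result and not depending on what a zero-size request returns, can be handled in one place. The following sketch uses the invented helper copy_ints, which refuses zero-size requests outright so that a null result always means "no copy was made."

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate an array of ints (hypothetical helper).
   Returns NULL if count is zero or if the allocation fails, so the
   caller never has to guess what a zero-size request produced. */
int *copy_ints(const int *source, size_t count)
{
    int *copy;

    if (count == 0 || count > (size_t)-1 / sizeof *copy) {
        return NULL;                /* avoid zero-size and overflowing requests */
    }
    copy = malloc(count * sizeof *copy);
    if (copy == NULL) {
        return NULL;                /* allocation failed: report it */
    }
    memcpy(copy, source, count * sizeof *copy);
    return copy;
}
```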

The size of the heap available and the details of its management and manipulation are implementation-specific.

The Function
C11 added this function.

The Function
The space allocated is initialized to “all-bits-zero.” Note that this is not guaranteed to be the same representation as floating-point zero or a null pointer.
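
If the allocated objects contain floating-point or pointer members, the portable course is to assign real zero values after the allocation instead of relying on the bit pattern. A minimal sketch, in which the node type and the helper make_nodes are invented for illustration:

```c
#include <stdlib.h>

struct node {
    double weight;
    struct node *next;
};

/* Allocate `count` nodes with every member explicitly initialized
   (hypothetical helper). calloc supplies all-bits-zero, and the loop
   then assigns real zero values so the result does not depend on how
   0.0 and null pointers are represented. */
struct node *make_nodes(size_t count)
{
    struct node *nodes = calloc(count, sizeof *nodes);
    size_t i;

    if (nodes == NULL) {
        return NULL;
    }
    for (i = 0; i < count; ++i) {
        nodes[i].weight = 0.0;
        nodes[i].next = NULL;
    }
    return nodes;
}
```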

The Function
If  is ,   does nothing. Otherwise, if  is not a value previously returned by one of these three allocation functions, the behavior is undefined.

The value of a pointer that refers to space that has been freed is indeterminate, and such pointers should not be dereferenced.

Note that  has no way to communicate an error if one is detected.

The Function
The initial value of the space allocated is unspecified.

The Function
If ptr is NULL, realloc behaves like malloc. Otherwise, if ptr is not a value previously returned by calloc, malloc, or realloc, the behavior is undefined. The same is true if ptr points to space that has been freed.
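
A related pitfall is assigning the result of a resize straight back to the only pointer you hold; if the request fails, the original block is still allocated but no longer reachable. The sketch below, built around the invented helper resize_ints, keeps the old pointer until the new one is known to be good.

```c
#include <stdlib.h>

/* Resize an int array (hypothetical helper).
   On success, *array points to the resized block and 0 is returned.
   On failure, *array is left untouched and still valid, and -1 is
   returned. */
int resize_ints(int **array, size_t new_count)
{
    int *resized;

    if (new_count == 0 || new_count > (size_t)-1 / sizeof **array) {
        return -1;
    }
    resized = realloc(*array, new_count * sizeof **array);
    if (resized == NULL) {
        return -1;                  /* the original block is still allocated */
    }
    *array = resized;
    return 0;
}
```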

The Function
It is implementation-defined as to whether or not output streams are flushed, open streams are closed, or temporary files are removed.

The exit code of the program is some implementation-defined value that represents “failure.” It is generated by a call to  using the argument.

The Function
Standard C requires that at least 32 functions can be registered. However, to get around any limitations in this regard, you can always register just one function and have it call the others directly. This way, the other functions can also have argument lists and return values.
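
The single-dispatcher idea can be sketched as follows. Only cleanup_run_all is ever registered as an exit handler; the actions it calls take an argument, which a directly registered handler could not. All of the names and the table size of 64 are assumptions made for this example.

```c
#include <stdlib.h>

#define MAX_CLEANUPS 64             /* arbitrary limit for this sketch */

struct cleanup {
    void (*action)(void *);
    void *argument;
};

static struct cleanup cleanups[MAX_CLEANUPS];
static size_t cleanup_count = 0;

/* Run the recorded actions in reverse order, then forget them. */
void cleanup_run_all(void)
{
    while (cleanup_count > 0) {
        --cleanup_count;
        cleanups[cleanup_count].action(cleanups[cleanup_count].argument);
    }
}

/* Record an action; only the dispatcher is ever registered as an
   exit handler. Returns 0 on success, -1 if the table is full or the
   registration fails. */
int cleanup_add(void (*action)(void *), void *argument)
{
    static int dispatcher_registered = 0;

    if (cleanup_count == MAX_CLEANUPS) {
        return -1;
    }
    if (!dispatcher_registered) {
        if (atexit(cleanup_run_all) != 0) {
            return -1;
        }
        dispatcher_registered = 1;
    }
    cleanups[cleanup_count].action = action;
    cleanups[cleanup_count].argument = argument;
    ++cleanup_count;
    return 0;
}

/* Sample action: increment the int that `counter` points to. */
void cleanup_increment(void *counter)
{
    ++*(int *)counter;
}
```

Any object passed as an argument must still exist when the dispatcher runs, so objects with static storage duration are the safest choice for actions left to run at program exit.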

The Function
C11 added this function.

The Function
C99 added this function.

The Function
The environment list is maintained by the host environment, and the set of names available is implementation-specific.

The behavior is undefined if you attempt to modify the contents of the string pointed to by the return value.
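
If you need to alter or keep an environment value, copy it into storage you own first. A minimal sketch, assuming a caller-supplied buffer; the helper name and its return codes are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Copy the value of an environment variable into caller-owned storage
   (hypothetical helper). Returns 0 on success, -1 if the variable is
   not set or the value does not fit. */
int copy_environment_value(const char *name, char *dest, size_t dest_size)
{
    const char *value = getenv(name);   /* treat the result as read-only */

    if (value == NULL || strlen(value) >= dest_size) {
        return -1;
    }
    strcpy(dest, value);                /* safe: length checked above */
    return 0;
}
```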

Some implementations supply a third argument to, called. is an array of pointers to  (just like  ) with each pointer pointing to an environment string. Standard C does not include this.

The Function
C11 added this function.

The Function
Standard C does not require that a command-line processor (or equivalent) exist, in which case an implementation-defined value is returned. To ascertain whether such an environment exists, call  with a   argument; if a nonzero value is returned, a command-line processor is available.

The format of the string passed is implementation-defined.

The Function
If two members compare as equal, it is unspecified as to which member is matched.

The Function
If two members compare as equal, it is unspecified as to their order in the array.
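
If a reproducible order matters, one approach is to make sure no two members ever compare as equal, for example by recording each member's original position and using it as a tie-breaker. The following sketch assumes that approach; the entry type and function names are invented.

```c
#include <stdlib.h>

struct entry {
    int key;        /* the value being sorted on */
    int sequence;   /* original position, used only to break ties */
};

/* Compare by key, then by sequence. The comparisons avoid
   subtraction, which could overflow for widely separated values. */
static int compare_entries(const void *left, const void *right)
{
    const struct entry *a = left;
    const struct entry *b = right;

    if (a->key != b->key) {
        return (a->key > b->key) - (a->key < b->key);
    }
    return (a->sequence > b->sequence) - (a->sequence < b->sequence);
}

void sort_entries(struct entry *entries, size_t count)
{
    qsort(entries, count, sizeof *entries, compare_entries);
}
```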

The Function
The behavior is undefined if the result cannot be represented.

could be implemented as a macro.
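
For a function such as abs, the classic unrepresentable case is the most negative int on a two's-complement machine, whose magnitude exceeds INT_MAX. A guard along the following lines avoids it; checked_abs is a hypothetical helper, not part of the library.

```c
#include <limits.h>
#include <stdlib.h>

/* Absolute value with a representability check (hypothetical helper).
   Returns 1 and stores the result on success; returns 0 if the
   magnitude of `value` cannot be represented as an int. */
int checked_abs(int value, int *result)
{
    if (value < -INT_MAX) {
        return 0;                   /* -value would exceed INT_MAX */
    }
    *result = abs(value);
    return 1;
}
```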

The Function
If the result cannot be represented, the behavior is undefined.

C89 added this function.

The Function
The behavior is undefined if the result cannot be represented.

could be implemented as a macro.

The Function
If the result cannot be represented, the behavior is undefined.

C89 added this function.

The Function
C17 added this function.

The Function
If the result cannot be represented, the behavior is undefined.

C99 added this function.

Multibyte Character Functions
The behavior of these functions is subject to the current locale, in particular, to the  category.

Initial support for multibyte character processing was added by C89.

= &lt;stdnoreturn.h&gt; – _Noreturn =

C11 added this header.

C++ Consideration: There is no equivalent header.

= &lt;string.h&gt; – String Handling =

An implementation is at liberty to place certain alignment considerations on any of C's data types. Presumably, any copy you make in memory of such an aligned object should itself also be aligned appropriately. If this is not the case, it is possible that the created copy might not be accessible, or it may be misinterpreted. It is the programmer's responsibility to ensure the resultant object copy is in a format and memory location suitable for further and meaningful use.

Standard C reserves all function names beginning with,  , or   followed by a lowercase letter for future addition to this header.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

The Function
If the two strings overlap, the behavior is undefined.

The Function
C89 added this function.

The Function
If the two strings overlap, the behavior is undefined.

The Function
If the two strings overlap, the behavior is undefined.

The Function
If the two strings overlap, the behavior is undefined.

The Function
If the two strings overlap, the behavior is undefined.

Comparison Functions
Recommendation: All the comparison functions return an integer that is less than, equal to, or greater than zero. Do not assume that the negative or positive values indicating less than or greater than, respectively, have any predictable magnitude. Always compare the return value against zero, never against a specific nonzero value.
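
As an illustration of this recommendation, the following minimal sketch wraps `strcmp` so that only the sign of the result is ever examined; the helper names `same_string` and `sorts_before` are hypothetical.

```c
#include <string.h>

/* Nonzero if the two strings have identical contents. */
static int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Nonzero if a collates before b. The test is "< 0", not "== -1":
   an implementation may return any negative value. */
static int sorts_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```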

The Function
The comparison is locale-specific.

C89 added this function.

The Function
C89 added this function.

The Function
C89 added this function.

The Function
The text of the message returned is implementation-defined.

The programmer should not attempt to write to the location pointed to by the returned value.
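
If the message needs to be modified or kept for later, copy it into storage you own. The following is a minimal sketch, assuming the function described here is `strerror`; the helper name `copy_error_text` is hypothetical.

```c
#include <string.h>

/* Copy the message for errnum into buf (of size bytes), truncating
   if necessary and always null-terminating. Return buf. */
static char *copy_error_text(int errnum, char *buf, size_t size)
{
    const char *msg = strerror(errnum);   /* treat as read-only */

    if (size == 0)
        return buf;
    strncpy(buf, msg, size - 1);
    buf[size - 1] = '\0';
    return buf;
}
```

Since the wording of the message varies between implementations, programs should not parse or compare the copied text either.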

= &lt;tgmath.h&gt; – Type-Generic Math =

C99 added this header.

C++ Consideration: The equivalent Standard C++ header is. Note that C++17 deprecated this header.

= &lt;threads.h&gt; – Threads =

C11 added this header.

If a C implementation supports the keyword  (see the conditionally defined macro   mentioned in Conditionally Defined Standard Macros), it will also provide the header   As such, rather than using the keyword directly, do the following:

where  is a macro defined in that header as , and that matches the equivalent C++ keyword.
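
As a minimal sketch of this approach, assuming the keyword in question is `_Thread_local` and the macro is `thread_local` as defined by `<threads.h>`, a per-thread counter might be declared as follows; the names `call_count` and `count_call` are hypothetical.

```c
#include <threads.h>   /* before C23, defines thread_local as _Thread_local */

/* Each thread gets its own copy of this counter. */
static thread_local int call_count = 0;

/* Increment and return the calling thread's private counter. */
static int count_call(void)
{
    return ++call_count;
}
```

Source written this way uses the same spelling as C++.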

Standard C reserves function names, type names, and enumeration constants beginning with,  ,  , or  , followed by a lowercase letter, as possible additions to this header.

C++ Consideration: There is no equivalent header.

= &lt;time.h&gt; – Date and Time =

Standard C reserves all macro names beginning with  followed by an uppercase letter for future addition to this header.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

Components of Time
C99 replaced the macro  with.

C11 added the macro, and the type.

C11 added the members  and   to.

The Function
C89 added this function.

The Function
C89 added this function.

The Function
C11 added this function.
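
The following is a minimal sketch of calling this function, assuming it is `timespec_get` used with the time base `TIME_UTC`; the wrapper name `current_utc` is hypothetical.

```c
#include <time.h>

/* Fetch the current calendar time into *ts.
   Return 1 on success, 0 on failure. */
static int current_utc(struct timespec *ts)
{
    /* On success, timespec_get returns the time base it was given. */
    return timespec_get(ts, TIME_UTC) == TIME_UTC;
}
```

The resolution actually delivered in the nanoseconds member depends on the system clock, so code should not rely on any particular granularity.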

The Function
C89 added this function.

C99 added the following conversion specifiers:,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  ,  , and .
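
Code that uses any of these specifiers will not format correctly on a C89-only library. As a minimal sketch, assuming the function described here is `strftime`, the following uses two of the C99 additions, `%F` (equivalent to `%Y-%m-%d`) and `%T` (equivalent to `%H:%M:%S`); the helper name `format_iso` is hypothetical.

```c
#include <time.h>

/* Format *t as an ISO 8601 style date and time into buf.
   Return the number of characters written (excluding the null
   terminator), or 0 if buf is too small. */
static size_t format_iso(const struct tm *t, char *buf, size_t size)
{
    return strftime(buf, size, "%FT%T", t);
}
```

For maximum portability to older libraries, the same result can be produced with the C89 specifiers `"%Y-%m-%dT%H:%M:%S"`.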

= &lt;uchar.h&gt; – Unicode Utilities =

C11 added this header.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;wchar.h&gt; – Extended Multibyte and Wide Character Utilities =

C95 added this header.

Standard C reserves all function names beginning with  followed by a lowercase letter for future addition to this header.

See Optional Contents for requirements that an implementation may need to add to this header to support the annex called “Bounds-checking interfaces” added by C11.

C++ Consideration: The equivalent Standard C++ header is.

= &lt;wctype.h&gt; – Wide Character Classification and Mapping Utilities =

C95 added this header.

Standard C reserves all function names beginning with  or   followed by a lowercase letter for future addition to this header.

C++ Consideration: The equivalent Standard C++ header is.