C Programming/Variables

Like most programming languages, C uses and processes variables. In C, variables are human-readable names for the computer's memory addresses used by a running program. Variables make it easier to store, read and change the data within the computer's memory by allowing you to associate easy-to-remember labels with the memory addresses that store your program's data. The memory addresses associated with variables aren't determined until after the program is compiled and running on the computer.

At first, it's easiest to imagine variables as placeholders for values, much like in mathematics. You can think of a variable as being equivalent to its assigned value. So, if you have a variable i that is initialized (set equal) to 4, then it follows that i + 1 will equal 5. However, a skilled C programmer is more mindful of the invisible layer of abstraction going on just under the hood: that a variable is a stand-in for the memory address where the data can be found, not the data itself. You will gain more clarity on this point when you learn about pointers.

Since C is a relatively low-level programming language, before a C program can utilize memory to store a variable it must claim the memory needed to store the values for a variable. This is done by declaring variables. Declaring variables is the way in which a C program shows the number of variables it needs, what they are going to be named, and how much memory they will need.

Within the C programming language, when managing and working with variables, it is important to know the type of variables and the size of these types. A type’s size is the amount of computer memory required to store one value of this type. Since C is a fairly low-level programming language, the size of types can be specific to the hardware and compiler used – that is, how the language is made to work on one type of machine can be different from how it is made to work on another.

All variables in C are typed. That is, every variable must be declared with a specific type.

Declaring, Initializing, and Assigning Variables
Here is an example of declaring an integer, which we've called <tt>some_number</tt>: <tt>int some_number;</tt> (Note the semicolon at the end of the line; that is how your compiler separates one program statement from another.)

This statement tells the compiler to create a variable called <tt>some_number</tt> and associate it with a memory location on the computer. We also tell the compiler the type of data that will be stored at that address, in this case an integer. Note that in C we must specify the type of data that a variable will store. This lets the compiler know how much total memory to set aside for the data (on most modern machines an <tt>int</tt> is 4 bytes in length). We'll look at other data types in the next section.

Multiple variables can be declared with one statement, like this: <tt>int anumber, anothernumber, yetanothernumber;</tt>

In early versions of C, variables had to be declared at the beginning of a block. Since C99, declarations and statements may be mixed freely. Doing so is still uncommon in older code bases: it is rarely necessary, some older compilers do not support C99 (portability), and a style that is unfamiliar to fellow programmers can make code harder to maintain.

After declaring variables, you can assign a value to a variable later on using a statement like <tt>some_number = 3;</tt>. Giving a variable its first value is commonly called initialization (strictly speaking, C uses that term for a value supplied in the declaration itself). The above statement directs the compiler to insert an integer representation of the number "3" into the memory address associated with <tt>some_number</tt>. We can save a bit of typing by declaring a variable and giving it a value at the same time: <tt>int some_new_number = 4;</tt>

You can also assign one variable the value of another, like so: <tt>anumber = anothernumber;</tt> Or assign multiple variables the same value with one statement: <tt>anumber = anothernumber = yetanothernumber = 8;</tt> This works because the assignment x = y is itself an expression whose value is the value assigned to x. For example, some_number = 4 has the value 4. Thus x = y = z is really shorthand for x = (y = z).

Naming Variables
Variable names in C are made up of letters (upper and lower case) and digits. The underscore character ("_") is also permitted. Names must not begin with a digit. Unlike some languages (such as Perl and some BASIC dialects), C does not use any special prefix characters on variable names.

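Some examples of valid (but not very descriptive) C variable names: <tt>foo</tt>, <tt>Bar</tt>, <tt>BAZ</tt>, <tt>foo_bar</tt>, <tt>QuUx</tt>. Some examples of invalid C variable names: <tt>2foo</tt> (must not begin with a digit), <tt>my foo</tt> (spaces are not allowed), <tt>$foo</tt> ($ is not permitted in standard C), <tt>while</tt> (a language keyword). As the last example suggests, certain words are reserved as keywords in the language, and these cannot be used as variable names.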

You cannot use the same name for multiple variables in the same scope. When working with other developers, you should therefore take steps to avoid using the same name for global variables or function names. Some large projects adhere to naming guidelines to avoid duplicate names and for consistency.

In addition there are certain sets of names that, while not language keywords, are reserved for one reason or another. For example, a C compiler might use certain names "behind the scenes", and this might cause problems for a program that attempts to use them. Also, some names are reserved for possible future use in the C standard library. The rules for determining exactly what names are reserved (and in what contexts they are reserved) are too complicated to describe here, and as a beginner you don't need to worry about them much anyway. For now, just avoid using names that begin with an underscore character.

The naming rules for C variables also apply to naming other language constructs such as function names, struct tags, and macros, all of which will be covered later.

Literals
Any time you specify a value explicitly in a program, instead of referring to a variable or some other form of data, that value is referred to as a literal. In the initialization example above, 3 is a literal. Literals can either take a form defined by their type (more on that soon), or, for integer values, be written in hexadecimal (hex) notation. Hex numbers are always preceded with 0x. For now, though, you probably shouldn't be too concerned with hex.

The Four Basic Data Types
In Standard C there are four basic data types. They are <tt>int</tt>, <tt>char</tt>, <tt>float</tt>, and <tt>double</tt>.

The <tt>int</tt> type
The <tt>int</tt> type stores integers in the form of "whole numbers". An <tt>int</tt> is typically 32 bits (4 octets) on modern home PCs, although the C standard only guarantees that it is at least 16 bits. Examples of literals are whole numbers (integers) such as 1, 2, 3, 10, 100... When <tt>int</tt> is 32 bits (4 octets), it can store any whole number (integer) between -2147483648 and 2147483647. A 32 bit word (number) has the possibility of representing any one number out of 4294967296 possibilities (2 to the power of 32).

If you want to declare a new int variable, use the <tt>int</tt> keyword. For example, the declaration <tt>int numberOfStudents, i, j = 5;</tt> declares three variables: numberOfStudents, i and j. Only j is initialized, here with the literal 5.

The <tt>char</tt> type
The <tt>char</tt> type is capable of holding any member of the execution character set. It stores the same kind of data as an <tt>int</tt> (i.e. integers), but always has a size of exactly one byte. The size of a byte is specified by the macro <tt>CHAR_BIT</tt>, which gives the number of bits in a char (byte). In standard C it can never be less than 8 bits. A variable of type <tt>char</tt> is most often used to store character data, hence its name. Most implementations use the ASCII character set as the execution character set, but it's best not to know or care about that unless the actual values are important.

Examples of character literals are 'a', 'b', '1', etc., as well as some special characters such as '\0' (the null character) and '\n' (newline, recall "Hello, World"). Note that the <tt>char</tt> value must be enclosed within single quotations.

When we initialize a character variable, we can do it two ways. One is preferred, the other way is bad programming practice.

The first way is to write <tt>char letter1 = 'a';</tt>

This is good programming practice in that it allows a person reading your code to understand that letter1 is being initialized with the letter 'a' to start off with.

The second way, which should not be used when you are coding letter characters, is to write <tt>char letter2 = 97;</tt> (in ASCII, 97 is the code for 'a').

This is considered by some to be extremely bad practice when the variable stores a character rather than a small number, because most readers are forced to look up what character corresponds with the number 97 in the encoding scheme. In the end, <tt>letter1</tt> and <tt>letter2</tt> both store the same thing, the letter 'a', but the first method is clearer, easier to debug, and much more straightforward.

One important thing to mention is that characters for numerals are represented differently from their corresponding number, i.e. '1' is not equal to 1. In short, any single character enclosed within single quotes is a character literal.

There is one more kind of literal that needs to be explained in connection with chars: the string literal. A string is a series of characters, usually intended to be displayed. They are surrounded by double quotations (" ", not ' '). An example of a string literal is the "Hello, World!\n" in the "Hello, World" example.

A string literal can be used to initialize a character array; arrays are described later.

The <tt>float</tt> type
<tt>float</tt> is short for floating point. It stores inexact representations of real numbers, both integer and non-integer values. It can be used with numbers that are much greater than the greatest possible <tt>int</tt>. <tt>float</tt> literals must be suffixed with F or f. Examples are: 3.1415926f, 4.0f, 6.022e+23f.

It is important to note that floating-point numbers are inexact. Some numbers like 0.1f cannot be represented exactly as <tt>float</tt>s but will have a small error. Very large and very small numbers will have less precision and arithmetic operations are sometimes not associative or distributive because of a lack of precision. Nonetheless, floating-point numbers are most commonly used for approximating real numbers and operations on them are efficient on modern microprocessors. Floating-point arithmetic is explained in more detail on Wikipedia.

<tt>float</tt> variables can be declared using the <tt>float</tt> keyword. A <tt>float</tt> is typically 4 bytes in size. Therefore, it is used when less precision than a double provides is required.

The <tt>double</tt> type
The <tt>double</tt> and <tt>float</tt> types are very similar. The <tt>float</tt> type allows you to store single-precision floating point numbers, while the <tt>double</tt> keyword allows you to store double-precision floating point numbers, that is, approximations of real numbers with more significant digits. Its size is typically 8 bytes on most machines. Examples of <tt>double</tt> literals are 3.1415926535897932, 4.0, 6.022e+23 (scientific notation). If you use 4 instead of 4.0, the 4 will be interpreted as an <tt>int</tt>.

The distinction between floats and doubles was made because of the differing sizes of the two types. When C was first used, space was at a minimum and so the judicious use of a float instead of a double saved some memory. Nowadays, with memory more freely available, you rarely need to conserve memory like this – it may be better to use doubles consistently. Indeed, some C implementations use doubles instead of floats when you declare a float variable.

If you want to use a double variable, use the <tt>double</tt> keyword.

<tt>sizeof</tt>
If you have any doubts as to the amount of memory actually used by any variable (and this goes for types we'll discuss later, also), you can use the <tt>sizeof</tt> operator to find out for sure. (For completeness, it is important to mention that <tt>sizeof</tt> is a unary operator, not a function.) Its syntax is <tt>sizeof object</tt> or <tt>sizeof(type)</tt>.

Both forms of <tt>sizeof</tt>, applied to an object or to a type, return the size in bytes. The return type is <tt>size_t</tt> (defined in the header <tt>&lt;stddef.h&gt;</tt>) which is an unsigned value. For example, after the declarations <tt>int i;</tt> and <tt>size_t size = sizeof(i);</tt>, <tt>size</tt> will be set to 4, assuming <tt>CHAR_BIT</tt> is defined as 8, and an integer is 32 bits wide. The value of <tt>sizeof</tt>'s result is the number of bytes.

Note that when <tt>sizeof</tt> is applied to a <tt>char</tt>, the result is 1; that is, <tt>sizeof(char)</tt> always returns 1.

Data type modifiers
One can alter the data storage of any data type by preceding it with certain modifiers.

<tt>long</tt> and <tt>short</tt> are modifiers that make it possible for a data type to use either more or less memory. The <tt>int</tt> keyword need not follow the <tt>short</tt> and <tt>long</tt> keywords, and it is most commonly omitted. A <tt>short</tt> can be used where the values fall within a lesser range than that of an <tt>int</tt>, typically -32768 to 32767. A <tt>long</tt> can be used to contain an extended range of values. It is not guaranteed that a <tt>short</tt> uses less memory than an <tt>int</tt>, nor is it guaranteed that a <tt>long</tt> takes up more memory than an <tt>int</tt>. It is only guaranteed that sizeof(short) <= sizeof(int) <= sizeof(long). Typically a <tt>short</tt> is 2 bytes, an <tt>int</tt> is 4 bytes, and a <tt>long</tt> either 4 or 8 bytes. C99 and later also provide <tt>long long</tt>, which is at least 64 bits wide and typically an 8 byte integer.

In all of the types described above, one bit is used to indicate the sign (positive or negative) of a value. If you decide that a variable will never hold a negative value, you may use the <tt>unsigned</tt> modifier to use that one bit for storing other data, effectively doubling the range of values while mandating that those values be positive. The <tt>unsigned</tt> specifier also may be used without a trailing <tt>int</tt>, in which case the size defaults to that of an <tt>int</tt>. There is also a <tt>signed</tt> modifier which is the opposite, but it is not necessary, except for certain uses of <tt>char</tt>, and seldom used since all types (except <tt>char</tt>) are signed by default.

The <tt>long</tt> modifier can also be used with <tt>double</tt> to create a <tt>long double</tt> type. This floating-point type may (but is not required to) have greater precision than the <tt>double</tt> type.

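To use a modifier, just declare a variable with the data type and relevant modifiers, for example <tt>unsigned short int usi;</tt>, <tt>unsigned long uli;</tt> or <tt>long double pi;</tt>.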

<tt>const</tt> qualifier
When the <tt>const</tt> qualifier is used, the declared variable must be initialized at declaration. It is then not allowed to be changed.

While the idea of a variable that never changes may not seem useful, there are good reasons to use <tt>const</tt>. For one thing, many compilers can perform some small optimizations on data when they know that data will never change. For example, if you need the value of π in your calculations, you can declare a const variable of <tt>pi</tt>, so a program or another function written by someone else cannot change the value of <tt>pi</tt>.

Note that a Standard conforming compiler must issue a warning if an attempt is made to change a <tt>const</tt> variable - but after doing so the compiler is free to ignore the <tt>const</tt> qualifier.

Magic numbers
When you write C programs, you may be tempted to write code that will depend on certain numbers. For example, you may be writing a program for a grocery store. This complex program has thousands upon thousands of lines of code. The programmer decides to represent the cost of a can of corn, currently 99 cents, as a literal throughout the code. Now, assume the cost of a can of corn changes to 89 cents. The programmer must now go in and manually change each entry of 99 cents to 89. While this is not that big a problem, considering the "global find-replace" function of many text editors, consider another problem: the cost of a can of green beans is also initially 99 cents. To reliably change the price, you have to look at every occurrence of the number 99.

C offers two ways to avoid this. They are roughly equivalent, though each suits some circumstances better than the other.

Using the <tt>const</tt> keyword
The <tt>const</tt> keyword helps eradicate magic numbers. By declaring a variable <tt>const corn</tt> at the beginning of a block, a programmer can simply change that const and not have to worry about setting the value elsewhere.

There is also another method for avoiding magic numbers. It is much more flexible than <tt>const</tt>, and also much more problematic in many ways. It also involves the preprocessor, as opposed to the compiler. Behold...

<tt>#define</tt>
When you write programs, you can create what is known as a macro, so that when the preprocessor reads your code, it will replace all instances of a word with the specified expression.

Here's an example. If you write <tt>#define PRICE_OF_CORN 0.99</tt>, then when you want to, for example, print the price of corn, you use the word <tt>PRICE_OF_CORN</tt> instead of the number 0.99 – the preprocessor will replace all instances of <tt>PRICE_OF_CORN</tt> with 0.99, which the compiler will interpret as the literal <tt>double</tt> 0.99. The preprocessor performs substitution, that is, <tt>PRICE_OF_CORN</tt> is replaced by 0.99, so there is no need for a semicolon.

It is important to note that <tt>#define</tt> has basically the same functionality as the "find-and-replace" function in a lot of text editors/word processors.

For some purposes, <tt>#define</tt> can be harmful, and it is usually preferable to use <tt>const</tt> if <tt>#define</tt> is unnecessary. It is possible, for instance, to <tt>#define</tt>, say, a macro <tt>DOG</tt> as the number 3, but if you try to print the macro, thinking that <tt>DOG</tt> represents a string that you can show on the screen, the program will have an error. <tt>#define</tt> also has no regard for type. It disregards the structure of your program, replacing the text everywhere (in effect, disregarding scope), which could be advantageous in some circumstances, but can be the source of problematic bugs.

You will see further instances of the <tt>#define</tt> directive later in the text. It is good convention to write <tt>#define</tt>d words in all capitals, so a programmer will know that this is not a variable that you have declared but a <tt>#define</tt>d macro. It is not necessary to end a preprocessor directive such as <tt>#define</tt> with a semicolon; in fact, some compilers may warn you about unnecessary tokens in your code if you do.

Scope
In the Basic Concepts section, the concept of scope was introduced. It is important to revisit the distinction between local and global variables, and how to declare each. To declare a local variable, you place the declaration inside the block to which the variable is local, traditionally at its beginning (i.e. before any non-declarative statements). To declare a global variable, declare the variable outside of any block. If a variable is global, it can be read, and written, from anywhere in your program.

Global variables are not considered good programming practice, and should be avoided whenever possible. They inhibit code readability, create naming conflicts, waste memory, and can create difficult-to-trace bugs. Excessive usage of globals is usually a sign of laziness or poor design. However, if there is a situation where local variables would make the code more obscure and unreadable, there's no shame in using globals.

Other Modifiers
Included here, for completeness, are more of the modifiers that standard C provides. For the beginning programmer, static and extern may be useful. volatile is more of interest to advanced programmers. register and auto are largely deprecated and are generally not of interest to either beginning or advanced programmers.

static
<tt>static</tt> is sometimes a useful keyword. It is a common misconception that its only purpose is to make a variable stay in memory.

When you declare a function or global variable as static, you cannot access the function or variable through the extern (see below) keyword from other files in your project. This is called internal linkage (sometimes informally "static linkage").

When you declare a local variable as static, it is created just like any other variable. However, when the variable goes out of scope (i.e. the block it was local to is finished) the variable stays in memory, retaining its value. The variable stays in memory until the program ends. While this behaviour resembles that of global variables, static variables still obey scope rules and therefore cannot be accessed outside of their scope. This is called static storage duration.

Variables declared static are initialized to zero (or for pointers, NULL) by default. They can be initialized explicitly on declaration to any constant value. The initialization is made just once, at compile time.

You can use static in (at least) two different ways. Consider this code, and imagine it is in a file called jfile.c:

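The file-scope <tt>static</tt> variable is accessible by both up and down and retains its value. The <tt>static</tt> variables declared inside up and down also retain their values between calls, but they are two different variables, one in each function's scope. Static variables are a good way to implement encapsulation, a term from the object-oriented way of thinking that effectively means not allowing changes to be made to a variable except through function calls.

If up increments and down decrements both the shared variable and their own local counter, then calling up three times followed by down twice leaves the shared variable at 1, while the counter local to up reaches 3 and the one local to down reaches -2.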

Features of <tt>static</tt> variables:
 1. Keyword used - static
 2. Storage - Memory
 3. Default value - Zero
 4. Scope - Local to the block in which it is declared
 5. Lifetime - Value persists between different function calls
 6. Keyword optionality - Mandatory to use the keyword

extern
<tt>extern</tt> is used when a file needs to access a variable in another file that it may not have <tt>#include</tt>d directly. Therefore, extern does not allocate memory for the new variable, it just provides the compiler with sufficient information to access a variable declared in another file.

Features of <tt>extern</tt> variables:
 1. Keyword used - extern
 2. Storage - Memory
 3. Default value - Zero
 4. Scope - Global (all over the program)
 5. Lifetime - Value persists till the program's execution comes to an end
 6. Keyword optionality - Optional if declared outside all the functions

volatile
<tt>volatile</tt> is a special type of modifier which informs the compiler that the value of the variable may be changed by external entities other than the program itself. This is necessary for certain programs compiled with optimizations – if a variable were not defined <tt>volatile</tt> then the compiler may assume that certain operations involving the variable are safe to optimize away when in fact they aren't. <tt>volatile</tt> is particularly relevant when working with embedded systems (where a program may not have complete control of a variable). It is also often seen in multi-threaded applications, although <tt>volatile</tt> by itself does not make accesses atomic or synchronize threads.

auto
<tt>auto</tt> is a modifier which specifies an "automatic" variable that is automatically created when in scope and destroyed when out of scope. If you think this sounds like pretty much what you've been doing all along when you declare a variable, you're right: all declared items within a block are implicitly "automatic". For this reason, the auto keyword is more like the answer to a trivia question than a useful modifier, and there are lots of very competent programmers that are unaware of its existence.

Features of <tt>auto</tt> variables:
 1. Keyword used - auto
 2. Storage - Memory
 3. Default value - Garbage value (random value)
 4. Scope - Local to the block in which it is defined
 5. Lifetime - Value persists while the control remains within the block
 6. Keyword optionality - Optional

register
<tt>register</tt> is a hint to the compiler to attempt to optimize the storage of the given variable by storing it in a register of the computer's CPU when the program is run. Most optimizing compilers do this anyway, so use of this keyword is often unnecessary. In fact, ANSI C states that a compiler can ignore this keyword if it so desires – and many do. Microsoft Visual C++ is an example of an implementation that completely ignores the register keyword.

Features of <tt>register</tt> variables:
 1. Keyword used - register
 2. Storage - CPU registers (values can be retrieved faster than from memory)
 3. Default value - Garbage value
 4. Scope - Local to the block in which it is defined
 5. Lifetime - Value persists while the control remains within the block
 6. Keyword optionality - Mandatory to use the keyword
