Type
So far we have explained that data is stored internally as zeros and ones (bits), the form the hardware can read. That data is conceptually divided and labeled according to the number of bits in each set. The same bits can be interpreted in a variety of ways, according to established formats, to represent meaningful information. The programmer must therefore be able to tell the compiler which interpretation is needed, and this is done by using the different types.

A variable can refer to a simple value, such as an integer, whose type is called a primitive type, or to a set of values, called a composite type, made up of primitive types and other composite types. A type consists of a set of valid values and a set of valid operations that can be performed on those values. A variable's type must be declared before the variable can be used, both to enforce value and operation safety and to determine how much space is needed to store a value.

Major functions that type systems provide are:
 * Safety - types make it impossible to code some operations which cannot be valid in a certain context. This mechanism catches many common programming mistakes. For example, the expression "Hello, Wikipedia"/1 is invalid because a string literal cannot be divided by an integer in the usual sense. As discussed below, strong typing offers more safety, but it does not necessarily guarantee complete safety (see type-safety for more information).
 * Optimization - static type checking might provide useful information to a compiler. For example, if a type says a value is aligned at a multiple of 4, the memory access can be optimized.
 * Documentation - using types in languages also improves documentation of code. For example, the declaration of a variable as being of a specific type documents how the variable is used. In fact, many languages allow programmers to define semantic types derived from primitive types; either composed of elements of one or more primitive types, or simply as aliases for names of primitive types.
 * Abstraction - types allow programmers to think about programs at a higher level, without bothering with low-level implementation details. For example, programmers can think of strings as values instead of as mere arrays of bytes.
 * Modularity - types allow programmers to express the interface between two subsystems. This localizes the definitions required for interoperability of the subsystems and prevents inconsistencies when those subsystems communicate.

Standard types
There are five basic primitive types called standard types, specified by particular keywords, that store a single value. These types stand apart from the complexities of class type variables: even if the syntax for using them is at times the same, standard types do not share class properties (for example, they do not have constructors).

The type of a variable determines what kind of values it can store:


 * bool - a boolean value: true; false
 * int - an integer: -5; 10; 100
 * char - a character in some encoding, often something like ASCII, ISO-8859-1 ("Latin 1") or ISO-8859-15: 'a'; '='; 'G'; '2'
 * float - a floating-point number: 1.25; -2.35*10^23
 * double - a double-precision floating-point number: like float, but with more precision

The float and double primitive data types are called 'floating point' types and are used to represent real numbers (numbers with decimal places, like 1.435324 and 853.562). Floating point numbers and floating point arithmetic can be very tricky, due to the nature of how a computer calculates floating point numbers.

Definition vs. declaration
An important concept is the distinction between the declaration of a variable and its definition, two separate steps involved in the use of variables. The declaration announces the properties (the type, size, etc.); the definition causes storage to be allocated in accordance with the declaration.

Variables, like functions, classes and other constructs that require declarations, may be declared many times, but each may be defined only once.

This concept will be explained further, with some particulars noted, as we introduce other components. Examples of it necessarily include some concepts not yet introduced, but they give a broader view.

Declaration
C++ is a statically typed language. Hence, no variable can be used without specifying its type. This is why the type figures in the declaration. This way the compiler can protect you from trying to store a value of an incompatible type into a variable, e.g. storing a string in an integer variable. Declaring variables before use also allows spelling errors to be easily detected. Consider a variable used in many statements, but misspelled in one of them. Without declarations, the compiler would silently assume that the misspelled variable actually refers to some other variable. With declarations, an "Undeclared Variable" error would be flagged. Another reason for specifying the type of the variable is so that the compiler knows how much space in memory must be allocated for this variable.

The simplest variable declarations look like this (the parts in []s are optional):

[specifier(s)] type variable_name [ = initial_value];

To create an integer variable, for example, the syntax is <tt>int sum;</tt>

where sum is the name you made up for the variable. This kind of statement is called a declaration. It declares <tt>sum</tt> as a variable of type <tt>int</tt>, so that <tt>sum</tt> can store an integer value. Every variable has to be declared before use, and it is common practice to declare variables as close as possible to the moment where they are needed. This is unlike older versions of C (before C99), where all declarations in a block had to precede all other statements and expressions.

In general, you will want to make up variable names that indicate what you plan to do with the variable. For example, if you saw the declaration <tt>int hour, minute;</tt> you could probably make a good guess at what values would be stored in the variables. This example also demonstrates the syntax for declaring multiple variables with the same type in the same statement: hour and minute are both integers (int type). Notice how a comma separates the variable names.

Lines such as <tt>int a = 123;</tt> and <tt>int b(456);</tt> also declare variables, but this time the variables are initialized to some value. What this means is that not only is space allocated for the variables but the space is also filled with the given value. The two lines illustrate two different but equivalent ways to initialize a variable. The '=' in a declaration is subtly different from the assignment operator in that it gives the variable its initial value instead of assigning a new value to an existing one. The distinction becomes important especially when the values we are dealing with are not of simple types like integers but more complex objects like the input and output streams provided by the <tt>iostream</tt> library.

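The expression used to initialize a variable need not be constant. So, given previously declared integer variables a and b, the lines <tt>int sum;</tt> and <tt>sum = a + b;</tt> can be combined as <tt>int sum = a + b;</tt> or <tt>int sum(a + b);</tt>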

Declare a floating point variable 'f' with an initial value of 1.5: <tt>float f = 1.5;</tt>

Floating point constants should always have a '.' (decimal point) somewhere in them. Any number that does not have a decimal point is interpreted as an integer, which then must be converted to a floating point value before it is used.

For example, <tt>double a = 5 / 2;</tt> will not set a to 2.5 because 5 and 2 are integers and integer arithmetic will apply for the division, cutting off the fractional part. A correct way to do this would be <tt>double a = 5.0 / 2.0;</tt>

You can also declare floating point values using scientific notation. The constant .05 in scientific notation would be $$5 \times 10^{-2}$$. The syntax for this is the significand (the number being scaled), followed by an e, followed by the exponent. For example, .05 can be written as the scientific notation constant <tt>5e-2</tt>.

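Putting these pieces together, a program can store two values in integer variables, add them and display the result. The variables can be declared first and given values in separate statements or, if you like to save some space, each variable can be declared and initialized in a single statement.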

Modifiers
There are several modifiers that can be applied to data types to change the range of numbers they can represent or the way a variable may be accessed and modified.

const
A variable declared with this specifier cannot be changed (it is read-only). Either local or class-level variables may be declared <tt>const</tt>, indicating that you don't intend to change their value after they're initialized. You declare a variable as being constant using the <tt>const</tt> keyword. Global <tt>const</tt> variables have internal linkage by default. If you need to use a global constant across multiple files, the best option is to use a special header file that can be included across the project.

For example, <tt>const unsigned int DAYS_IN_WEEK = 7;</tt> declares a positive integer constant, called <tt>DAYS_IN_WEEK</tt>, with the value 7. Because this value cannot be changed, you must give it a value when you declare it. If you later try to assign another value to a constant variable, the compiler will print an error.

The full meaning of <tt>const</tt> is more complicated than this; when working through pointers or references, <tt>const</tt> can be applied to mean that the object pointed (or referred) to will not be changed via that pointer or reference. There may be other names for the object, and it may still be changed using one of those names so long as it was not originally defined as being truly <tt>const</tt>.

Using <tt>const</tt> has an advantage for programmers over the <tt>#define</tt> preprocessor directive because a <tt>const</tt> variable is understood by the compiler, not just substituted into the program text by the preprocessor, so any error messages can be much more helpful.

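With pointers it can get messy. For a type T, <tt>T const *p;</tt> declares p as a pointer to a const T, <tt>T *const p;</tt> declares p as a const pointer to T, and <tt>T const *const p;</tt> declares p as a const pointer to a const T.

If the pointer is a local variable, making the pointer itself <tt>const</tt> is of limited use. The order of T and const can be reversed: <tt>const T *p;</tt> is the same as <tt>T const *p;</tt>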

volatile
A hint to the compiler that a variable's value can be changed externally; therefore the compiler must avoid aggressive optimization on any code that uses the variable.

Unlike in Java, C++'s <tt>volatile</tt> specifier does not have any meaning in relation to multi-threading. Before C++11, standard C++ did not include support for multi-threading (though it was a common extension); since C++11 the standard library provides synchronization mechanisms such as <tt>std::mutex</tt> and <tt>std::atomic</tt>, and variables that need to be synchronized between threads must use such mechanisms. Keep in mind that <tt>volatile</tt> implies only safety in the presence of implicit or unpredictable actions by the same thread (or by a signal handler in the case of a <tt>volatile sig_atomic_t</tt> object). Some compilers treat accesses to <tt>volatile</tt> variables and fields as synchronization operations, so that ordinary memory operations are not reordered with respect to a volatile access. This behavior is not part of the standard and should not be relied upon.

mutable
This specifier may only be applied to non-static, non-const member variables. It allows the variable to be modified within const member functions.

mutable is usually used when an object might be logically constant, i.e., no outside observable behavior changes, but not bitwise const, i.e. some internal member might change state.

The canonical example is the proxy pattern. Suppose you have created an image catalog application that shows all images in a long, scrolling list. This list could be modeled as a collection of <tt>image</tt> objects, where each <tt>image</tt> loads its data when it is constructed, keeps it in a private member <tt>m_data</tt>, and exposes it through a public const member function <tt>data</tt>.

Note that for the image class, bitwise const and logically const are the same: if m_data changes, the public function data returns different output.

At a given time, most of those images will not be shown, and might never be needed. To avoid having the user wait for a lot of data being loaded which might never be needed, the proxy pattern might be invoked: an <tt>image_proxy</tt> stands in for each image and creates the real image, kept in a member <tt>m_image</tt>, only when <tt>data</tt> is first called.

Note that the image_proxy does not change observable state when <tt>data</tt> is invoked: it is logically constant. However, it is not bitwise constant since <tt>m_image</tt> changes the first time <tt>data</tt> is invoked. This is made possible by declaring <tt>m_image</tt> mutable. If it had not been declared mutable, <tt>image_proxy::data</tt> would not compile, since <tt>m_image</tt> is assigned to within a const member function.

short
The <tt>short</tt> specifier can be applied to the <tt>int</tt> data type. It can decrease the number of bytes used by the variable, which decreases the range of numbers that the variable can represent. Typically, a <tt>short int</tt> is half the size of a regular <tt>int</tt>, but this will be different depending on the compiler and the system that you use. When you use the <tt>short</tt> specifier, the <tt>int</tt> type is implicit. For example, <tt>short a;</tt> is equivalent to <tt>short int a;</tt>

long
The <tt>long</tt> specifier can be applied to the <tt>int</tt> and <tt>double</tt> data types. It can increase the number of bytes used by the variable, which increases the range of numbers that the variable can represent. A <tt>long int</tt> is at least as large as an <tt>int</tt> (on some platforms twice the size), and a <tt>long double</tt> can typically represent larger numbers more precisely. When you use <tt>long</tt> by itself, the <tt>int</tt> type is implied. For example, <tt>long a;</tt> is equivalent to <tt>long int a;</tt>

The shorter form, with the <tt>int</tt> implied rather than stated, is more idiomatic (i.e., seems more natural to experienced C++ programmers).

Use the <tt>long</tt> specifier when you need to store larger numbers in your variables. Be aware, however, that on some compilers and systems the long specifier may not increase the size of a variable. Indeed, most common 32-bit platforms (and one 64-bit platform) use 32 bits for <tt>int</tt> and also 32 bits for <tt>long int</tt>.

signed
The <tt>signed</tt> specifier makes a variable represent both positive and negative numbers. It can be applied only to the <tt>char</tt>, <tt>int</tt> and <tt>long</tt> data types. The <tt>signed</tt> specifier is applied by default for <tt>int</tt> and <tt>long</tt>, so you typically will never use it in your code. For <tt>char</tt>, whether the plain type is signed depends on the implementation.

Enumerated data type
In programming it is often necessary to deal with data types that describe a fixed set of alternatives. For example, when designing a program to play a card game it is necessary to keep track of the suit of an individual card.

One method for doing this may be to create unique constants to keep track of the suit. For example, one could define integer constants such as <tt>const int Clubs = 0;</tt>, <tt>const int Diamonds = 1;</tt>, <tt>const int Hearts = 2;</tt> and <tt>const int Spades = 3;</tt>, and store a card's suit in an <tt>int</tt> variable.

Unfortunately there are several problems with this method. The most minor problem is that this can be a bit cumbersome to write. A more serious problem is that this data is indistinguishable from integers. It becomes very easy to start using the associated numbers instead of the suits themselves, for instance assigning 3 to the variable instead of <tt>Spades</tt>...

...and worse, to make mistakes that may be very difficult to catch, such as a typo that assigns 7 instead of 3...

...which produces a valid expression in C++, but would be meaningless in representing the card's suit.

One way around these difficulties is to create a new data type specifically designed to keep track of the suit of the card, one that restricts you to using only valid possibilities. We can accomplish this with an enumerated data type, using the C++ <tt>enum</tt> keyword.

Type conversion
Type conversion or typecasting refers to changing an entity of one data type into another.

Implicit type conversion
Implicit type conversion, also known as coercion, is an automatic and temporary type conversion by the compiler. In a mixed-type expression, data of one or more subtypes can be converted to a supertype as needed at runtime so that the program will run correctly.

For example, when the operands of a comparison or an assignment belong to different data types, the compiler automatically and temporarily converts the values to matching data types each time the comparison or assignment is executed.

Explicit type conversion
Explicit type conversion manually converts one type into another, and is used in cases where automatic type casting doesn't occur.

For example, suppose d is a <tt>double</tt> that is passed to the printf function with a format specifier that expects an int. Without a cast, d would be passed as a double, which would result in unexpected (formally undefined) behavior, since printf would try to look for an int. A typecast such as <tt>(int)d</tt> corrects this, and passes an integer to printf as expected.