C++ Programming/Programming Languages/C++/Code/Statements/Variables/Operators

Operators
Now that we have covered the variables and data types it becomes possible to introduce operators. Operators are special symbols that are used to represent and direct simple computations. They have significant importance in programming since they serve to define actions and simple interactions with data in a direct, non-abstract way.

Since computers are mathematical devices, compilers and interpreters require a full syntactic theory of all operations in order to correctly parse formulas involving combinations of symbols. In particular, they depend on operator precedence rules just as mathematical writing depends on order of operations. Conventionally, the computing usage of operator also goes beyond the mathematical usage (for functions).

C++, like all programming languages, uses a set of operators. They are subdivided into several groups:
 * arithmetic operators (like addition and multiplication).
 * boolean operators.
 * string operators (used to manipulate strings of text).
 * pointer operators.
 * named operators (operators such as, new, and delete defined by alphanumeric names rather than a punctuation character).

Most of the operators in C++ do exactly what you would expect them to do because most are common mathematical symbols. For example, the operator for adding two integers is +. C++ does allows the re-definition of some operators (operator overloading) on more complex types. This will be covered later on.

Expressions can contain both variables names and integer values. In each case the name of the variable is replaced with its value before the computation is performed.

Order of operations
When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. A complete explanation of precedence can get complicated, but just to get you started:

Multiplication and division happen before addition and subtraction. So 2*3-1 yields 5, not 4, and 2/3-1 yields -1, not 1 (remember that in integer division 2/3 is 0). If the operators have the same precedence they are evaluated from left to right. So in the expression minute*100/60, the multiplication happens first, yielding 5900/60, which in turn yields 98. If the operations had gone from right to left, the result would be 59*1 which is 59, which is wrong.

Any time you want to override the rules of precedence (or you are not sure what they are) you can use parentheses. Expressions in parentheses are evaluated first, so 2 * (3-1) is 4. You can also use parentheses to make an expression easier to read, as in (minute * 100) / 60, even though it doesn't change the result.

Precedence (Composition)
At this point we have looked at some of the elements of a programming language like variables, expressions, and statements in isolation, without talking about how to combine them.

One of the most useful features of programming languages is their ability to take small building blocks and compose them (solving big problems by taking small steps at a time). For example, we know how to multiply integers and we know how to output values. It turns out we can do both at the same time:

Actually, we shouldn't say "at the same time," since in reality the multiplication has to happen before the output. The point is that any expression involving numbers, characters, and variables can be used inside an output statement. We've already seen one example:

You can also put arbitrary expressions on the right-hand side of an assignment statement:

This ability may not seem so impressive now, but we will see other examples where composition makes it possible to express complex computations neatly and concisely.

The following is illegal:

The exact rule for what can go on the left-hand side of an assignment expression is not so simple as it was in C; as operator overloading and reference types can complicate the picture.

Assignment
The most basic assignment operator is the "=" operator. It assigns one variable to have the value of another. For instance, the statement x = 3 assigns x the value of 3, and y = x assigns whatever was in x to be in y. When the "=" operator is used to assign a class or struct it acts as if the "=" operator was applied on every single element. For instance:

Assignments can also be chained since the assignment operator returns the value it assigns. This time the chaining is from right to left. For example, to assign the value of z to <tt>y</tt> and assign the same value (which is returned by the <tt>=</tt> operator) to <tt>x</tt> you use:

When the "=" operator is used in a declaration it has special meaning. It tells the compiler to directly initialize the variable from whatever is on the right-hand side of the operator. This is called defining a variable, in the same way that you define a class or a function. With classes, this can make a difference, especially when assigning to a function call:

In the first case, <tt>a</tt> is constructed, then is changed by the "=" operator. In the second statement, <tt>a2</tt> is constructed directly from the return value of <tt>foo</tt>. In many cases, the compiler can save a lot of time by constructing <tt>foo</tt>'s return value directly into <tt>a2</tt>'s memory, which makes the program run faster.

Whether or not you define can also matter in a few cases where a definition can result in different linkage, making the variable more or less available to other source files.

Arithmetic operators
Arithmetic operations that can be performed on integers (also common in many other languages) include:


 * Addition, using the  operator
 * Subtraction, using the  operator
 * Multiplication, using the  operator
 * Division, using the  operator
 * Remainder, using the  operator

Consider the next example. It will perform an addition and show the result:

The line relevant for the operation is where the <tt>+</tt> operator adds the values stored in the locations <tt>a</tt> and <tt>b</tt>. <tt>a</tt> and <tt>b</tt> are said to be the operands of <tt>+</tt>. The combination <tt>a + b</tt> is called an expression, specifically an arithmetic expression since <tt>+</tt> is an arithmetic operator.

Addition, subtraction, and multiplication all do what you expect, but you might be surprised by division. For example, the following program:

would generate the following output:

<tt>Number of minutes since midnight: 719</tt> <tt>Fraction of the hour that has passed: 0</tt>

The first line is what we expected, but the second line is odd. The value of the variable minute is 59, and 59 divided by 60 is 0.98333, not 0. The reason for the discrepancy is that C++ is performing integer division.

When both of the operands are integers (operands are the things operators operate on) the result must also be an integer, and by definition integer division always rounds down even in cases like this where the next integer is so close.

A possible alternative in this case is to calculate a percentage rather than a fraction:

The result is:

<tt>Percentage of the hour that has passed: 98</tt>

Again the result is rounded down, but at least now the answer is approximately correct. In order to get an even more accurate answer we could use a different type of variable, called floating-point, that is capable of storing fractional values.

This next example:

will return: Quotient = 6 Remainder = 3

The multiplicative operators <tt>*</tt>, <tt>/</tt> and <tt>%</tt> are always evaluated before the additive operators <tt>+</tt> and <tt>-</tt>. Among operators of the same class, evaluation proceeds from left to right. This order can be overridden using grouping by parentheses, <tt>(</tt> and <tt>)</tt>; the expression contained within parentheses is evaluated before any other neighboring operator is evaluated. But note that some compilers may not strictly follow these rules when they try to optimize the code being generated, unless violating the rules would give a different answer.

For example the following statements convert a temperature expressed in degrees Celsius to degrees Fahrenheit and vice versa:

Compound assignment
One of the most common patterns in software with regards to operators is to update a value:

Since this pattern is used many times, there is a shorthand for it called compound assignment operators. They are a combination of an existing arithmetic operator and assignment operator:



Thus the example given in the beginning of the section could be rewritten as

Pre and Post Increment
Another common pattern is to increase or decrease a value by just 1. This is often used to keep a count of how many times the code has run:


 * a++
 * a--
 * ++a
 * --a
 * count++

We can again use the previous example and rewrite it as: However, while "a++" and "++a" look similar, they can result in different values.

Post Increment: Pre Increment:  In both examples above, a will be incremented by 1. This may not seem like a big difference, however when used in practice, it may not always be equivalent. For example: In the above post increment example, a is assigned to the value of x first, then x is incremented. Since we know that x is 2, a is then assigned the value of 2, afterwards, x is incremented to 3. In the above pre increment example, x is incremented and a is assigned to the incremented value. Since we know that x is 2, and it is being incremented by 1, a is assigned the value of 3, and x has been incremented to 3.

Character operators
Interestingly, the same mathematical operations that work on integers also work on characters.

For the above example, outputs the letter b (on most systems -- note that C++ doesn't assume use of ASCII, EBCDIC, Unicode etc. but rather allows for all of these and other charsets). Although it is syntactically legal to multiply characters, it is almost never useful to do it.

Earlier I said that you can only assign integer values to integer variables and character values to character variables, but that is not completely true. In some cases, C++ converts automatically between types. For example, the following is legal.

On most mainstream desktop computers the result is 97, which is the number that is used internally by C++ on that system to represent the letter 'a'. However, it is generally a good idea to treat characters as characters, and integers as integers, and only convert from one to the other if there is a good reason. Unlike some other languages, C++ does not make strong assumptions about how the underlying platform represents characters; ASCII, EBCDIC and others are possible, and portable code will not make assumptions (except that '0', '1', ..., '9' are sequential, so that e.g. '9'-'0' == 9).

Automatic type conversion is an example of a common problem in designing a programming language, which is that there is a conflict between formalism, which is the requirement that formal languages should have simple rules with few exceptions, and convenience, which is the requirement that programming languages be easy to use in practice.

More often than not, convenience wins, which is usually good for expert programmers, who are spared from rigorous but unwieldy formalism, but bad for beginning programmers, who are often baffled by the complexity of the rules and the number of exceptions. In this book I have tried to simplify things by emphasizing the rules and omitting many of the exceptions.

Bitwise operators
These operators deal with a bitwise operations. Bit operations needs the understanding of binary numeration since it will deal with on one or two bit patterns or binary numerals at the level of their individual bits. On most microprocessors, bitwise operations are sometimes slightly faster than addition and subtraction operations and usually significantly faster than multiplication and division operations.

Bitwise operations especially important for much low-level programming from optimizations to writing device drivers, low-level graphics, communications protocol packet assembly and decoding.

Although machines often have efficient built-in instructions for performing arithmetic and logical operations, in fact all these operations can be performed just by combining the bitwise operators and zero-testing in various ways.

The bitwise operators work bit by bit on the operands. The operands must be of integral type (one of the types used for integers).

For this section, recall that a number starting with 0x is hexadecimal (hexa, or hex for short or referred also as base-16). Unlike the normal decimal system using powers of 10 and the digits 0123456789, hex uses powers of 16 and the symbols 0123456789abcdef. In the examples remember that Oxc equals 1100 in binary and 12 in decimal. C++ does not directly support binary notation, which would hamper readability of the code.


 * NOT
 * ~a   :  bitwise complement of a.
 * ~0xc produces the value -1-0xc (in binary, ~1100 produces ...11110011 where "..." may be many more 1 bits)

The negation operator is a unary operator which precedes the operand, This operator must not be confused with the "logical not" operator, " " (exclamation point), which treats the entire value as a single Boolean—changing a true value to false, and vice versa. The "logical not" is not a bitwise operation.

These others are binary operators which lie between the two operands. The precedence of these operators is lower than that of the relational and equivalence operators; it is often required to parenthesize expressions involving bitwise operators.


 * AND
 * a & b : bitwise boolean and of a and b
 * 0xc & 0xa produces the value 0x8 (in binary, 1100 & 1010 produces 1000)

The truth table of a AND b:


 * OR
 * a | b : bitwise boolean or of a and b
 * 0xc | 0xa produces the value 0xe (in binary, 1100 | 1010 produces 1110)

The truth table of a OR b is:


 * XOR
 * a ^ b : bitwise xor of a and b
 * 0xc ^ 0xa produces the value 0x6 (in binary, 1100 ^ 1010 produces 0110)

The truth table of a XOR b:


 * Bit shifts
 * a << b : shift a left by b (multiply a by $$2^b$$)
 * 0xc << 1 produces the value 0x18 (in binary, 1100 << 1 produces the value 11000)


 * a >> b : shift a right by b (divide a by $$2^b$$)
 * 0xc >> 1 produces the value 0x6 (in binary, 1100 >> 1 produces the value 110)

Derived types operators
There are three data types known as pointers, references, and arrays, that have their own operators for dealing with them. Those are <tt>*</tt>, <tt>&</tt>, <tt>[]</tt>, <tt>-></tt>, <tt>.*</tt>, and <tt>->*</tt>.

Pointers, references, and arrays are fundamental data types that deal with accessing other variables. Pointers are used to pass around a variables address (where it is in memory), which can be used to have multiple ways to access a single variable. References are aliases to other objects, and are similar in use to pointers, but still very different. Arrays are large blocks of contiguous memory that can be used to store multiple objects of the same type, like a sequence of characters to make a string.

Subscript operator [ ]
This operator is used to access an object of an array. It is also used when declaring array types, allocating them, or deallocating them.

address-of operator &
To get the address of a variable so that you can assign a pointer, you use the "address of" operator, which is denoted by the ampersand <tt>&</tt> symbol. The "address of" operator does exactly what it says, it returns the "address of" a variable, a symbolic constant, or a element in an array, in the form of a pointer of the corresponding type. To use the "address of" operator, you tack it on in front of the variable that you wish to have the address of returned. It is also used when declaring reference types.

Now, do not confuse the "address of" operator with the declaration of a reference. Because use of operators is restricted to expression, the compiler knows that <tt>&sometype</tt> is the "address of" operator being used to denote the return of the address of <tt>sometype</tt> as a pointer.

Dynamic memory allocation
Dynamic memory allocation is the allocation of memory storage for use in a computer program during the runtime of that program. It is a way of distributing ownership of limited memory resources among many pieces of data and code. Importantly, the amount of memory allocated is determined by the program at the time of allocation and need not be known in advance. A dynamic allocation exists until it is explicitly released, either by the programmer or by a garbage collector implementation; this is notably different from automatic and static memory allocation, which require advance knowledge of the required amount of memory and have a fixed duration. It is said that an object so allocated has dynamic lifetime.

The task of fulfilling an allocation request, which involves finding a block of unused memory of sufficient size, is complicated by the need to avoid both internal and external fragmentation while keeping both allocation and deallocation efficient. Also, the allocator's metadata can inflate the size of (individually) small allocations; chunking attempts to reduce this effect.

Usually, memory is allocated from a large pool of unused memory area called the heap (also called the free store). Since the precise location of the allocation is not known in advance, the memory is accessed indirectly, usually via a reference. The precise algorithm used to organize the memory area and allocate and deallocate chunks is hidden behind an abstract interface and may use any of the methods described below.

You have probably wondered how programmers allocate memory efficiently without knowing, prior to running the program, how much memory will be necessary. Here is when the fun starts with dynamic memory allocation.

new and delete
For dynamic memory allocation we use the new and delete keywords, the old malloc from C functions can now be avoided but are still accessible for compatibility and low level control reasons.

As covered before, we assign values to pointers using the "address of" operator because it returns the address in memory of the variable or constant in the form of a pointer. Now, the "address of" operator is NOT the only operator that you can use to assign a pointer. You have yet another operator that returns a pointer, which is the new operator. The new operator allows the programmer to allocate memory for a specific data type, struct, class, etc., and gives the programmer the address of that allocated sect of memory in the form of a pointer. The new operator is used as an rvalue, similar to the "address of" operator. Take a look at the code below to see how the new operator works.

By assigning the pointers to an allocated sector of memory, rather than having to use a variable declaration, you basically override the "middleman" (the variable declaration). Now, you can allocate memory dynamically without having to know the number of variables you should declare.

If you looked at the above piece of code, you can use the new operator to allocate memory for arrays too, which comes quite in handy when we need to manipulate the sizes of large arrays and or classes efficiently. The memory that your pointer points to because of the new operator can also be "deallocated," not destroyed but rather, freed up from your pointer. The delete operator is used in front of a pointer and frees up the address in memory to which the pointer is pointing.

The memory pointed to by  and   have been freed up, which is a very good thing because when you're manipulating multiple large arrays, you try to avoid losing the memory someplace by leaking it. Any allocation of memory needs to be properly deallocated or a leak will occur and your program won't run efficiently. Essentially, every time you use the new operator on something, you should use the delete operator to free that memory before exiting. The delete operator, however, not only can be used to delete a pointer allocated with the new operator, but can also be used to "delete" a null pointer, which prevents attempts to delete non-allocated memory (this action compiles and does nothing).

You must keep in mind that <tt>new T</tt> and <tt>new T</tt> are not equivalent. This will be more understandable after you are introduced to more complex types like classes, but keep in mind that when using  it will initialize the <tt>T</tt> memory location ("zero out") before calling the constructor (if you have non-initialized members variables, they will be initialized by default).

The new and delete operators do not have to be used in conjunction with each other within the same function or block of code. It is proper and often advised to write functions that allocate memory and other functions that deallocate memory. Indeed, the currently favored style is to release resources in object's destructors, using the so-called resource acquisition is initialization (RAII) idiom.

As we will see when we get to the Classes, a class destructor is the ideal location for its deallocator, it is often advisable to leave memory allocators out of classes' constructors. Specifically, using new to create an array of objects, each of which also uses new to allocate memory during its construction, often results in runtime errors. If a class or structure contains members which must be pointed at dynamically-created objects, it is best to sequentially initialize arrays of the parent object, rather than leaving the task to their constructors.

The ideal way is to not use arrays at all, but rather the STL's vector type (a container similar to an array). To achieve the above functionality, you should do:

Vectors allow for easy insertions even when "full." If, for example, you filled up, you could easily make room for a 6th element like so:

You can similarly dynamically allocate a rectangular multidimensional array (be careful about the type syntax for the pointers):

You can also emulate a ragged multidimensional array (sub-arrays not the same size) by allocating an array of pointers, and then allocating an array for each of the pointers. This involves a loop.