
Probability
In mathematics, probability tells us how likely something is to happen.

If something is absolutely certain to happen, we say its probability is one. If something cannot possibly happen, we say its probability is zero. A probability between zero and one means we don't know for sure what will happen, but the higher the number the more likely something is to happen. If something has a probability of 0.25, most people would say "it probably won't happen".

Unfortunately, children have a hard time understanding what a probability of 0.25 suggests. Children are much more adept at understanding probability in terms of percentages (25%) and fractions (e.g., "It will happen about ¼ of the time"), so it is very important that probability be taught alongside the development of those skills and understandings.

Models
It can be difficult, but probability is best taught with models. Two of the most effective and commonly used models for teaching probability are area diagrams and tree diagrams.

Area diagrams give students an idea of how likely something is by virtue of the amount of space reserved for that probability. Consider the toss of a coin:



Students can imagine that every time they toss a coin, they need to place that coin into the appropriate side of the rectangle. As the number of throws increases, the amount of area needed to enclose the coins will be the same on both sides. The same probability can be shown using a tree diagram:



In this simple tree diagram, each line (or event path) will be followed ½ of the time. Note that both diagrams show all of the possibilities, not just the probability of one event.

Both of these models can become more complex. Consider all of the possibilities of throwing two coins in series:



In the area rectangle above, the instances where heads has been tossed twice in a row are represented in the upper left-hand corner. Note that the probability of tossing both heads and tails (in any order) is 50% or 1/2, whereas the probability of tossing two heads in a row is 25% or 1/4.



While the tree diagram gives us the same information, the student needs to understand that all of the events at the bottom have an equal chance of occurring (this is more visually evident in an area diagram). However, the tree diagram has the advantage of better showing the chronology of events (in this case the order of events is represented by downward motion).
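The two-coin case can also be checked by listing every outcome and counting. The following is only a minimal sketch in Python; the names `outcomes` and `probability` are illustrative choices, not part of the lesson itself.

```python
from fractions import Fraction
from itertools import product

# Every ordered result of tossing a fair coin twice: HH, HT, TH, TT.
# These are the four equal regions of the area diagram, or the four
# endpoints at the bottom of the tree diagram.
outcomes = ["".join(pair) for pair in product("HT", repeat=2)]

def probability(event):
    """Share of the equally likely outcomes for which event(outcome) is true."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

two_heads = probability(lambda o: o == "HH")
one_of_each = probability(lambda o: set(o) == {"H", "T"})

print(outcomes)       # ['HH', 'HT', 'TH', 'TT']
print(two_heads)      # 1/4
print(one_of_each)    # 1/2
```

Counting works here only because all four outcomes are equally likely, which is exactly what the area diagram makes visible.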

While it may appear that both of these models have strengths and weaknesses, that is not the point. By becoming familiar with these and other models, students gain a more robust and diverse understanding of the nature of probability.

Consider the following problem:

If at any time a dog is just as likely to give birth to a male puppy as she is to give birth to a female puppy, and she has a litter of 5 puppies, what is the probability that all of the puppies will be of the same sex?

While this problem can be easily answered with a standard formula, $$P = 2 \left ( \frac{1}{2} \right ) ^n$$, where n is the number of puppies born, this nomenclature is not only unintelligible to students, but teaching it to students gives them no understanding of the underlying concepts. On the other hand, if they have explored problems similar to this with a tree diagram, they should fairly easily be able to see patterns emerging from which they can conjecture the "formula" that will find the correct answer. Consider that a student has drawn a tree diagram that shows the permutations with three puppies in the litter:



If a teacher were to have this student share their work, the class, in looking at this model, would be able to see that there are 8 possible outcomes. They should also notice that the only instances where litters of all males or all females can be found are on the far ends of the tree.

Students are always encouraged to look for patterns in many of their mathematical explorations. Here, students start to notice that the number of possibilities can be determined by multiplying together one 2 for each new outcome. In this case there are three outcomes (puppies born) in a row, so 2 × 2 × 2 = $$2^3$$, or 8 possible outcomes. Once the student sees a connection to this pattern, they may conjecture that if there are 5 puppies in a litter, the possible number of outcomes is $$2^5$$, or 32, of which only 2 (represented by the combinations MMMMM and FFFFF on the extreme ends) will result in a litter of puppies all of the same sex.

Teachers are always looking for students to make connections with other skills they are learning. In this case the teacher might ask the students to state the answer as a reduced fraction (1/16) and a percentage (6.25%).
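For teachers who want to check the puppy problem by brute force, here is a small Python sketch that builds every branch of the tree and counts the single-sex litters. The function name `same_sex_probability` is a hypothetical helper introduced for illustration, and the code assumes, as the problem does, that male and female puppies are equally likely.

```python
from fractions import Fraction
from itertools import product

def same_sex_probability(n):
    """Probability that a litter of n puppies is all male or all female."""
    litters = list(product("MF", repeat=n))                    # every path through the tree
    same = [litter for litter in litters if len(set(litter)) == 1]  # MM...M and FF...F
    return Fraction(len(same), len(litters))

for n in (3, 5):
    p = same_sex_probability(n)
    print(n, "puppies:", 2 ** n, "outcomes, probability", p, "=", f"{float(p):.2%}")
# 3 puppies: 8 outcomes, probability 1/4 = 25.00%
# 5 puppies: 32 outcomes, probability 1/16 = 6.25%
```

The counted result agrees with the formula quoted earlier, 2 × (1/2)ⁿ, for each litter size tried here.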

Since we're using math to help us understand what will happen, it may help us to be able to write an equation like this:

Pr(X) = y

If this equation is true, we would say "the probability of X happening is y". For example, we can say:

X = A coin toss coming up heads

Pr(X) = 0.5

Since the letter X is standing for an event, not a number, and we don't want to get them confused, we'll use capital letters for events, and lower-case letters for numbers. The symbol Pr represents a function that has some neat properties, but for now we'll just think of it as a short way of writing "the probability of X".

Since probabilities are always numbers between 0 and 1, we will often represent them as percentages or fractions. Percentages seem natural when someone says something like "I am 100% sure that my team will win." However, fractions make more sense for really computing probabilities, and it is very useful to really be able to compute probabilities to try to make good decisions, so we will use fractions here.

The basic rule for computing the probability of an event is simple if the event is the kind where we are making a definite choice from a known set of choices based on randomness that is fair. Not everything is like that, but things like flipping a coin, rolling a single die, drawing a card, or drawing a ticket from a bag with your eyes closed are random and fair, since each of the possibilities is equally likely. But the probability of things a little more complicated, like the sum of two dice, is not so easy, and that makes it fun!

Suppose we have a single die. It has six faces, and as far as we know each is equally likely to be face up if we roll the die. We can now use our basic rule for computing probability to compute the probability of each possible outcome. The basic rule is to take the number of outcomes that represent the event (called X) that you're trying to compute the probability of, and divide it by the total number of possible outcomes. So, for rolling a die:

Pr("getting a one") = 1 face that has one dot / 6 faces = 1/6

Pr("getting a two") = 1 face that has two dots / 6 faces = 1/6

Pr("getting a three") = 1 face that has three dots / 6 faces = 1/6

Pr("getting a four") = 1 face that has four dots / 6 faces = 1/6

Pr("getting a five") = 1 face that has five dots / 6 faces = 1/6

Pr("getting a six") = 1 face that has six dots / 6 faces = 1/6

So we have 6 possibilities, and each has a probability of 1/6. We know that 6 * (1/ 6) = 1. In fact, that is an example of a fundamental rule of probabilities:

The sum of probabilities of all possibilities is equal to one.
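The basic rule (favourable outcomes divided by all possible outcomes) is short enough to write as a few lines of Python. This is just a sketch: the function name `pr` mirrors the Pr notation used here, and representing an event as a set of faces is an assumption made for the example.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]   # the six equally likely faces of one die

def pr(event):
    """Basic rule: number of outcomes in the event / total number of outcomes."""
    favourable = [face for face in faces if face in event]
    return Fraction(len(favourable), len(faces))

# The probability of each single face, then the sum over all possibilities.
each = {face: pr({face}) for face in faces}
total = sum(each.values())

print(each[1])   # 1/6
print(total)     # 1
```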

But you probably already knew that each face of a die is equally likely to land face-up, so we haven't done anything useful yet. So let's ask the question: What is the probability of throwing two dice and having their sum equal seven? Knowing that can help us win at a lot of games (such as Monopoly, a trademark of Parker Brothers), so it's pretty valuable to know the answer.

To find out the answer, we apply our basic rule, but now it is not so easy to know how many possible outcomes of TWO dice thrown together will equal seven. However, it is easy to know how many total possible outcomes there are: 6 times 6 = 36. One way to find out how many possible dice throws add up to seven is just to make a table of all possible dice throws that fills in their sum and count the number of sevens. Here is one:

***| 1 | 2 | 3 | 4 | 5 | 6
===========================
 1 | 2 | 3 | 4 | 5 | 6 | 7
 2 | 3 | 4 | 5 | 6 | 7 | 8
 3 | 4 | 5 | 6 | 7 | 8 | 9
 4 | 5 | 6 | 7 | 8 | 9 | 10
 5 | 6 | 7 | 8 | 9 | 10| 11
 6 | 7 | 8 | 9 | 10| 11| 12

The numbers along the top represent the value on the first die, and the numbers down the left side represent the value on the second die when you throw them together. The values inside the table represent the sum of the first and the second dice. (Note that the sums range from 2 to 12; there's no way to throw two dice and get a number less than two!)

So if we look carefully, we can see that there are exactly 6 sevens in this table. So the probability of getting a seven if you throw two dice is 6 out of 36, or:

Pr(getting a 7) = 6 / 36 = 1 / 6.

(1 / 6 is a simple fraction equal in value to 6 / 36.)
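The same table can be built and searched by a short program instead of by eye. The Python below is a minimal sketch; the variable names are illustrative only.

```python
from fractions import Fraction

# One entry per possible throw: (first die, second die) -> their sum.
table = {(a, b): a + b for a in range(1, 7) for b in range(1, 7)}

# Pick out the throws whose sum is seven.
sevens = [throw for throw, total in table.items() if total == 7]
pr_seven = Fraction(len(sevens), len(table))

print(len(table))   # 36
print(sevens)       # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(pr_seven)     # 1/6
```

Note that (1, 6) and (6, 1) are counted separately, just as they occupy two different cells in the table.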

Maybe that doesn't surprise you, but let's look at all the probabilities:

Pr(getting a 2) = 1 / 36

Pr(getting a 3) = 2 / 36

Pr(getting a 4) = 3 / 36

Pr(getting a 5) = 4 / 36

Pr(getting a 6) = 5 / 36

Pr(getting a 7) = 6 / 36

Pr(getting an 8) = 5 / 36

Pr(getting a 9) = 4 / 36

Pr(getting a 10)= 3 / 36

Pr(getting an 11)= 2 / 36

Pr(getting a 12)= 1 / 36

What do you think we would get if we added all of those probabilities together? They should equal 36 / 36, right? Check that they do.
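One way to do that check is to let a program count the sums and add up the fractions. The following Python sketch assumes two ordinary six-sided dice; the names `counts` and `distribution` are chosen just for this example.

```python
from collections import Counter
from fractions import Fraction

# How many of the 36 throws give each sum.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Turn each count into a probability out of 36.
distribution = {s: Fraction(counts[s], 36) for s in sorted(counts)}

for s in distribution:
    print(f"Pr(getting a {s}) = {counts[s]} / 36")

print(sum(distribution.values()))   # 1
```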

Did you know that the probability of getting a 5 is four times that of getting a 2? Did you know that the probability of getting a 5, 6, 7 or 8 is higher than the probability of the other 7 sums all together? We can tell things like that by adding the probabilities together, as long as the events we're talking about are mutually exclusive (no two of them can happen on the same throw), as they are in this case. So we can even write:

Pr(getting a 5, 6, 7 or 8) = Pr(getting a 5) + Pr(getting a 6) + Pr(getting a 7) + Pr(getting an 8)

But we can substitute these fractions and add them up easily to get:

Pr(getting a 5, 6, 7, or 8) = (4+5+6+5)/36 = 20/36.

Since we know the probabilities of all possible outcomes must add up to one, we can use that to compute the probability of getting any of the other sums:

Pr(getting a 2,3,4,9,10,11 or 12) = 1 - (20/36) = (36/36 - 20/36) = 16/36.

This way we didn't have to add up all those individual probabilities, we just subtracted our fraction from 1.

Mathematically we could say: Pr(some events) = 1 - Pr(all the other possible events)
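Both routes to the answer, adding the individual probabilities and subtracting from one, can be compared directly. This is a sketch in Python under the same two-dice assumptions; `pr_sum_in` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction

counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

def pr_sum_in(sums):
    """Add the probabilities of the listed sums (they cannot happen together)."""
    return Fraction(sum(counts[s] for s in sums), 36)

middle = pr_sum_in({5, 6, 7, 8})
others = pr_sum_in({2, 3, 4, 9, 10, 11, 12})

# Fraction reduces automatically, so 20/36 prints as 5/9 and 16/36 as 4/9.
print(middle)        # 5/9
print(others)        # 4/9
print(1 - middle)    # 4/9, the same value without adding seven fractions
```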

You might want to take the time now to get two dice and roll them 3*36 = 108 times, recording the sum of them each time. The number you get for each sum won't be exactly 3 times the probabilities that we have computed, but it should be pretty close! You should get around 18 sevens. This is fun to do with friends; you can have each person make 36 rolls and then add together all your results for each sum that is possible.

The fact that, if you make a lot of rolls, you are likely to get results close to the probability of an outcome times the number of rolls is called the Law of Large Numbers. It's what ties the math of probability to reality, and what lets you do math to make a good decision about what roll you might get in a game, or anything else for which you can compute or estimate a probability.
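If no dice are at hand, the experiment can be imitated with a computer's random number generator. The sketch below uses Python's standard `random` module; the function name `roll_two_dice` and the fixed `seed` (used only so the run can be repeated) are assumptions for illustration. The exact counts depend on the seed, so none are shown.

```python
import random
from collections import Counter

def roll_two_dice(n_rolls, seed=None):
    """Simulate n_rolls throws of two fair dice and count each sum."""
    rng = random.Random(seed)
    return Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))

small = roll_two_dice(108, seed=1)        # the classroom-sized experiment
large = roll_two_dice(100_000, seed=1)    # many more rolls than anyone would do by hand

print(small[7], "sevens in 108 rolls (about 18 expected)")
print(large[7] / 100_000, "share of sevens in 100,000 rolls (about 1/6 expected)")
```

With only 108 rolls the count of sevens can easily be a few away from 18, while with many more rolls the share of sevens tends to settle near 1/6.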

Here's a homework exercise for each of you: compute the probabilities of every possible sum of throwing two dice, but two different dice: one that comes up 1, 2, 3 or 4 only, and the other that comes up 1, 2, 3, 4, 5, 6, 7 or 8. Note that with these dice (which you can get at a gaming store, if you want) the lowest and highest possible rolls are the same as before, but the number of possible rolls is only 4 * 8 = 32.

***| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
=====================================
 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11
 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12

Pr(getting a 2) = 1 / 32

Pr(getting a 3) = 2 / 32

Pr(getting a 4) = 3 / 32

Pr(getting a 5) = 4 / 32

Pr(getting a 6) = 4 / 32

Pr(getting a 7) = 4 / 32

Pr(getting an 8) = 4 / 32

Pr(getting a 9) = 4 / 32

Pr(getting a 10)= 3 / 32

Pr(getting an 11)= 2 / 32

Pr(getting a 12)= 1 / 32

(You should check our work by making sure that all these probabilities sum to 32/32 = 1).

So this is interesting: Although the possible scores from throwing the two dice are the same, the probabilities are a little different. Since 1/32 is a little more than 1/36, you are actually more likely to get a two or a twelve this way. And since 4/32 = 1/8 is less than 6/36 = 1/6, you are less likely to get a seven with these dice. You might have been able to guess that, but by doing the math you know for sure, and you even know by how much, if you know how to subtract fractions well.
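The comparison between the two pairs of dice, including the "by how much", can be worked out with exact fractions. This Python sketch uses a hypothetical helper, `sum_distribution`, that takes the number of sides on each die and assumes every face is equally likely.

```python
from collections import Counter
from fractions import Fraction

def sum_distribution(sides_a, sides_b):
    """Probability of each possible sum for two fair dice with the given sides."""
    counts = Counter(a + b
                     for a in range(1, sides_a + 1)
                     for b in range(1, sides_b + 1))
    total = sides_a * sides_b
    return {s: Fraction(counts[s], total) for s in sorted(counts)}

d4_d8 = sum_distribution(4, 8)   # the four-sided and eight-sided dice
d6_d6 = sum_distribution(6, 6)   # two ordinary dice

print(sum(d4_d8.values()))       # 1
print(d4_d8[2] - d6_d6[2])       # 1/288  (a two is slightly more likely)
print(d6_d6[7] - d4_d8[7])       # 1/24   (a seven is less likely)
```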

So let's review what we've learned:


 * We know what the word probability means.
 * We know probabilities are always numbers between 0 and 1.
 * We know a basic approach to computing some common kinds of probabilities.
 * We know that the probabilities of all possible outcomes should add up to exactly 1.
 * We know the probability of a result is equal to one minus the probability of all other results.