Probability and Statistics Practice Problem Set/20220801

The following is an archive of the daily practice problems for probability and statistics. The "20220801" in the title of this page is the numeric notation for "August 1, 2022" or "2022-08-01." Use these problems to sharpen your problem-solving skills for probability and statistics.

Example 1
Find the probability that four randomly selected English letters, A-Z, with repeats allowed, form a name that is among the top 100 names for baby boys born in the United States in 2021.

Solution
The data on boy and girl names and their popularity comes from the Social Security Administration (SSA), which publishes each year's data during the following year. The 2021 data is now available, and you can find it on websites such as Behindthename. This dataset covers all U.S. births that occurred between January 1, 2021, and December 31, 2021, inclusive.

For boys, the most popular name in 2021 was Liam, followed by Noah. Each of these two names accounted for about 1% of boys born that year. One percent may not seem like much, but keep in mind that there were roughly two million boys born in the United States in 2021. The exact numbers are 20,272 boys named "Liam" and 18,739 boys named "Noah", in case you are curious. In descending order by popularity, the four-letter names in the top 100 boy names are Liam, Noah, Jack, Levi, Owen, John, Luke, Ezra, Luca, Ryan, Axel, Jose, and Beau, for a total of 13 names.

To calculate the number of letter combinations with four letters, it helps to realize that there are four positions, and for each position, you can freely choose any of the 26 letters from A to Z. There are $$26^4$$ = 26 × 26 × 26 × 26 = 456,976 letter combinations that can be formed with four letters. Since each of them is equally likely to be drawn, and since 13 of those combinations form one of those names, the probability is therefore $$\frac{13}{456976}$$, which can be simplified to $$\frac{1}{35152}$$, or approximately 0.000028448.
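As a quick check of the arithmetic above, the following is a minimal Python sketch. It assumes the list of 13 names quoted in this solution is complete; the variable names are chosen for illustration only.

```python
from fractions import Fraction

# The 13 four-letter names quoted in the solution (assumed complete).
FOUR_LETTER_NAMES = [
    "Liam", "Noah", "Jack", "Levi", "Owen", "John", "Luke",
    "Ezra", "Luca", "Ryan", "Axel", "Jose", "Beau",
]

ALPHABET_SIZE = 26  # letters A-Z, repeats allowed

# Four independent positions, 26 choices each.
total_strings = ALPHABET_SIZE ** 4

# Fraction reduces to lowest terms automatically.
probability = Fraction(len(FOUR_LETTER_NAMES), total_strings)

print(total_strings)       # 456976
print(probability)         # 1/35152
print(float(probability))  # decimal approximation
```

Using `Fraction` rather than floating-point division keeps the result exact, so the simplification to lowest terms can be read off directly.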

Example 2
If the letters of the word "medical" are arranged in a random order, using each letter exactly once, what is the probability that it is an English word?

Solution
In addition to giving words' definitions, pronunciations, etymologies, synonyms, and antonyms like most other online dictionaries, another cool thing about Wiktionary is that it gives anagrams, if any, for any given word. An anagram is a permutation of the letters of a word. If a word has no repeated letters, such as the word "train", each letter must be used exactly once. For a word that has one or more repeated letters, each letter must be used exactly as many times as it appears within the word. For example, any permutation of "trait" must use "T" twice and each of "R", "A", and "I" exactly once.

For example, "post", "stop", and "spot" are all anagrams of one another.

Go to Wiktionary and look up the word "medical." The page lists five anagrams: "camelid," "claimed," "decimal," "declaim," and "maliced." Together with "medical," there are six words formed from these seven letters. There are 7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5,040 ways to rearrange these seven letters, so the probability is thus $$\frac{6}{5040}$$ = $$\frac{1}{840}$$, or approximately 0.001190476.
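The count can also be checked by brute force. The sketch below enumerates every arrangement of "medical" and counts those found in a small word set. The set `KNOWN_WORDS` is hard-coded from the six words named above and stands in for a dictionary lookup; it is an assumption, not a call to Wiktionary.

```python
from fractions import Fraction
from itertools import permutations

# The six words named in the solution (stand-in for a real dictionary).
KNOWN_WORDS = {"medical", "camelid", "claimed", "decimal", "declaim", "maliced"}

# All 7 letters are distinct, so every permutation is a different string.
arrangements = ["".join(p) for p in permutations("medical")]

# Count the arrangements that are in the word set.
word_count = sum(1 for arrangement in arrangements if arrangement in KNOWN_WORDS)

probability = Fraction(word_count, len(arrangements))

print(len(arrangements))  # 5040
print(word_count)         # 6
print(probability)        # 1/840
```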

Example 3
In how many distinct ways can the letters of the word "trainers" be rearranged?

Solution
Be careful! Since "trainers" has 8 letters, the naïve approach would be to simply calculate 8! = 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 40,320, but this is not the correct answer. The word "trainers" has a repeated letter, using the letter "R" twice, and you must account for this. Remember, whenever a word contains multiple copies of the same letter, all copies are treated as indistinguishable. Swapping the two R's leaves an arrangement unchanged, so 8! counts every distinct arrangement 2! = 2 times.

The answer is thus $$\frac{8!}{2!}$$ = $$\frac{40320}{2}$$ = 20,160.
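The same idea works for any word: divide the factorial of the word's length by the factorial of each letter's multiplicity. Here is a minimal Python sketch, where `distinct_arrangements` is a helper name introduced only for illustration, with a brute-force comparison as a sanity check.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_arrangements(word):
    """Count distinct rearrangements of word's letters (n! / product of m_i!)."""
    count = factorial(len(word))
    # Divide out the orderings of each group of identical letters.
    for multiplicity in Counter(word).values():
        count //= factorial(multiplicity)
    return count


formula_count = distinct_arrangements("trainers")

# Brute force: generate all 8! orderings and keep only the distinct ones.
brute_force_count = len(set(permutations("trainers")))

print(formula_count)      # 20160
print(brute_force_count)  # 20160
```

For a word with no repeated letters, every multiplicity is 1 and the formula reduces to the plain factorial, as in Example 2.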

Example 4
It is a well-known fact that sugar rushes (also known as sugar highs), the oft-imagined effect of hyperactivity in kids who eat sugar, are a myth. This misconception has been debunked by science time and time again. Note that "sugar rush" and "sugar high" are interchangeable. If a random four-letter combination is drawn, what is the probability that the string, when appended after the word "sugar", forms a name for this nonexistent, oft-debunked phenomenon?

Solution
"Sugar rush" and "sugar high" are equivalent terms for the supposed effect of getting hyper after eating too much sugar. Thanks to science, you can rest assured that the old wives' tale about sugar causing hyperactivity is just that: a myth. Both "rush" and "high" are four-letter words, and no other phrase of the form "sugar" followed by a four-letter word means the same thing, so these are the only two letter combinations that result in a name for this mythical concept. Other sugar idioms include "sugar baby", which is a term of endearment, "sugar tit", which is a fancy term for a baby's pacifier, and "sugar beet", a type of beet whose root contains a high concentration of sucrose. Most four-letter combinations are not words at all and are instead nonsensical gibberish. Even so, they all count as possible outcomes, because the problem places no restrictions on the choice of the four letters.

There are $$26^4$$ = 456,976 possible four-letter combinations. There are two four-letter words for which "sugar ____" is a term for hyperactivity after eating sugar, so the probability is thus $$\frac{2}{456976}$$ = $$\frac{1}{228488}$$, or approximately 0.000004377. In scientific notation, you can also write this as $$4.377 \times 10^{-6}$$.
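To confirm the reduced fraction and the scientific-notation form, a short Python sketch follows. It takes "rush" and "high" as the only favorable strings, as argued above; the variable names are illustrative.

```python
from fractions import Fraction

# The two favorable four-letter strings identified in the solution.
FAVORABLE = {"rush", "high"}

total_strings = 26 ** 4
probability = Fraction(len(FAVORABLE), total_strings)

# Format the decimal value with three digits after the point, in e-notation.
scientific = f"{float(probability):.3e}"

print(probability)  # 1/228488
print(scientific)   # 4.377e-06
```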

Example 5
In Generation VI, the sixth generation of Pokémon, there were 721 different species of Pokémon. For the month of September 2016, there were 16 Pokémon species with a weighted usage of 10% or greater, according to Smogon's monthly usage statistics for that month. If a Pokémon species is chosen randomly among the 721 Pokémon, what is the probability that its weighted usage was greater than or equal to 10% for the month of September 2016?

Solution
Since there are 721 different kinds of Pokémon, and 16 of them had a weighted usage of at least 10% for the month of September 2016, the probability is then $$\frac{16}{721}$$, or approximately 2.219%.

Fun Fact
And in case you are curious what those 16 were, Landorus-T came in first place with a weighted usage of 36.99105%. Latios came in second, with 22.34431%. Third was Excadrill, with 19.54580%.