Introduction to Game Theory/Deal Or No Deal

This article concerns itself with the game theory and optimal strategy for playing the popular television game show, Deal or No Deal.

One basic strategy is for a contestant to maximize the expected value of his prize. Whenever the banker makes an offer, the contestant maximizes his expected value by accepting the offer if it is greater than the average value of the unopened cases and declining it if it is less. However, the banker's offers almost never exceed the average value of the unopened cases. Thus, if contestants always followed this strategy, the game would be boring: they would decline every offer and keep opening cases until the end, or until an offer exceeded the expected value.
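The expected-value rule described above can be sketched in a few lines of Python. This is only an illustration: the function name expected_value_decision and the four-case position are hypothetical and do not come from any actual broadcast.

```python
from statistics import mean

def expected_value_decision(unopened_values, offer):
    """Return 'deal' if the offer beats the mean of the unopened cases, else 'no deal'."""
    fair_value = mean(unopened_values)
    return "deal" if offer > fair_value else "no deal"

# Hypothetical late-game position: four unopened cases (values in dollars).
remaining = [5, 1_000, 50_000, 400_000]

print(mean(remaining))                              # average of the unopened cases: 112751.25
print(expected_value_decision(remaining, 90_000))   # offer below the average: no deal
print(expected_value_decision(remaining, 120_000))  # offer above the average: deal
```

A contestant following this rule ignores risk entirely; only the comparison between the offer and the average matters.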

However, the game becomes interesting when contestants adopt other strategies that optimize for parameters beyond expected value. A contestant who declines the banker's offer accepts the risk that he may win less than that offer, and different people have different degrees of risk tolerance. For instance, if offered the choice between a certain $400,000 and an even chance of winning either $1 or $1,000,000, many contestants would prefer the $400,000, despite the fact that a contestant who simply maximized expected value should reject the offer. Many contestants would accept the $400,000 because people maximize the utility of their money, not its expected value directly. The utility of money diminishes as a person gains more of it. For example, the first $400,000 a contestant receives may be worth much more to him than the next $600,000.
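The $400,000 example can be made concrete with a certainty-equivalent calculation. The sketch below assumes a square-root utility function, which is only one of many possible concave utilities and is not taken from the text; the helper name certainty_equivalent is likewise illustrative.

```python
import math

def certainty_equivalent(outcomes, probabilities, utility, inverse_utility):
    """Sure amount whose utility equals the expected utility of the gamble."""
    expected_utility = sum(p * utility(x) for x, p in zip(outcomes, probabilities))
    return inverse_utility(expected_utility)

outcomes = [1, 1_000_000]      # the two equally likely prizes
probabilities = [0.5, 0.5]

expected_value = sum(p * x for x, p in zip(outcomes, probabilities))

# Assumed utility: square root of the prize (concave, hence risk averse).
ce_sqrt = certainty_equivalent(outcomes, probabilities, math.sqrt, lambda u: u ** 2)

print(expected_value)  # 500000.5
print(ce_sqrt)         # 250500.25
```

Under this assumed utility the gamble is worth about $250,500 to the contestant, so a sure $400,000 is preferred even though it is below the expected value of $500,000.50. A different utility function would give a different threshold.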

The game's deceptively simple format has attracted attention from mathematicians, statisticians, and economists as a study of decision making under risk, and it is an instructive example of the application of utility theory. In 2004, a team of economists played a scaled-down version of the game with 84 participants and compared the results with the expected utility hypothesis. The study received a great deal of media attention, appearing on the front page of The Wall Street Journal on January 12, 2006, and being featured on National Public Radio in the United States on March 3, 2006.

Modeling strategy
At the start of a standard game of the U.S. version of Deal or No Deal, the expected value of the game is $131,477.54, the average of the 26 cases. However, only 6 of the 26 cases have values greater than the expected value, and the median value of the 26 cases is only $875. Before any deal is offered, the contestant must select 6 cases to be eliminated. Thus the expected value of the cases at the time of the first offer could range from a low of $13,420.80 to a high of $170,916.25 and the median value from a low of $350 to a high of $17,500, leading to considerable variability in playing conditions by the time players must first make decisions in the game.
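The figures above can be checked with a short script. The list US_BOARD below is the 26-value board of the U.S. version as commonly listed; it is supplied here as an assumption, since the text does not enumerate the values.

```python
from statistics import mean, median

# Assumed prize board of the U.S. version (26 cases, values in dollars).
US_BOARD = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
            1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
            200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000]

board = sorted(US_BOARD)
start_mean = mean(board)
above_mean = sum(1 for v in board if v > start_mean)

# After six cases are opened, 20 remain unopened.
worst_case = board[:20]   # the six largest prizes were opened
best_case = board[6:]     # the six smallest prizes were opened

print(round(start_mean, 2), above_mean, median(board))  # 131477.54 6 875.0
print(round(mean(worst_case), 2), median(worst_case))   # 13420.8 350.0
print(round(mean(best_case), 2), median(best_case))     # 170916.25 17500.0
```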

The banker's perspective
One should not consider only the perspective of the contestant who wins the prize, as this is a two-player game, with the banker being the other player. Because the banker plays a large number of games, which greatly reduces his risk, he can be extremely tolerant of the remaining risk and adopt a strategy that seeks to minimize the expected value of the prizes that contestants win. However, the expected value he seeks to minimize is not measured per game, as it is for the contestant, but per hour of play. A strategy that would cause an average contestant to play for an hour and win $100,000 is more advantageous to the banker than a strategy that would cause an average contestant to play for only a quarter of an hour and win $50,000, since the second pays out at twice the hourly rate. This helps to explain why the banker's first few offers are only a small fraction of the expected value of the remaining cases. An optimal banker will only make offers that, from his perspective, improve upon the expected value per hour of play. From the contestant's perspective, this becomes a premium he must pay to end the game early.
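The per-hour comparison is simple arithmetic and can be written out as follows. The function name payout_per_hour is illustrative only.

```python
def payout_per_hour(average_prize, hours_played):
    """Average amount the banker pays out per hour of play."""
    return average_prize / hours_played

long_game = payout_per_hour(100_000, 1.0)    # one hour of play, $100,000 won
short_game = payout_per_hour(50_000, 0.25)   # a quarter hour of play, $50,000 won

print(long_game)   # 100000.0
print(short_game)  # 200000.0
```

Although the short game pays a smaller prize, it costs the banker twice as much per hour of play.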

The contestant's perspective
Typically a contestant enters the game with an idea of what he considers a minimum satisfactory prize. So long as he can continue the game without a significant risk of knocking out all of the prizes above that minimum, he will do so. What the contestant considers satisfactory may change during the game as the remaining prizes change. For example, suppose the second most valuable remaining prize is $75,000. The contestant will be more comfortable with the risk that it becomes the most valuable remaining prize if the current top prize is $100,000 than if it is $1,000,000. Thus a player is likely to keep playing while the risk is low. By the time of the fourth offer, eight cases remain. At this point in the game there is only a 4.8% chance that the contestant will have eliminated all of the prizes worth $100,000 or more, and a 72.6% chance that at least two such prizes are still in the game. Hence a contestant is also unlikely to accept an early offer.
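Probabilities of this kind follow from the hypergeometric distribution. The sketch below assumes the U.S. board has seven prizes of $100,000 or more among its 26 cases and that the eight remaining cases include the contestant's own; the function name is illustrative. Under these assumptions the first figure comes out at about 4.8% and the second at about 72.6%.

```python
from math import comb

def prob_big_prizes_left(k, total_cases=26, big_prizes=7, cases_left=8):
    """P(exactly k big prizes are among the unopened cases), hypergeometric."""
    small_prizes = total_cases - big_prizes
    ways = comb(big_prizes, k) * comb(small_prizes, cases_left - k)
    return ways / comb(total_cases, cases_left)

p_none = prob_big_prizes_left(0)
p_at_least_two = 1 - p_none - prob_big_prizes_left(1)

print(f"{p_none:.1%}")          # 4.8%
print(f"{p_at_least_two:.1%}")  # 72.6%
```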

The start of the game
At the start of the game, the contestant is allowed to choose one case. The probability of actually selecting the $1,000,000 case is 1 in 26, or about 3.85%. Many contestants nevertheless believe that they have chosen the $1,000,000 case, even though the probability of not having selected it is 96.15%. Many of these contestants make decisions based on the belief that they hold the million dollar case (or one of the other five higher-value cases). As the game goes on, this generally causes the contestant to take unnecessary risks and to reject the banker's offer, even when the offer is higher than the average value of the remaining cases. This behavioral bias, in effect a delusion, works against the contestant almost every time.

End game
The desirability of continuing the game diminishes for both players as the number of remaining prizes that the contestant considers satisfactory diminishes. With fewer satisfactory prizes remaining, the risk to the contestant increases. This increased risk lowers the percentage of the expected value of the remaining cases that an offer must reach to be attractive to the contestant. Conversely, the banker, since he is also trying to optimize the entertainment value of the game, may need to end a game quickly if the values of the remaining cases are all small. To end the game he needs to offer a higher percentage of the expected value of the remaining cases, occasionally even more than the expected value.

Comparison with the Monty Hall problem
When only three cases remain, Deal or No Deal might seem like a version of the Monty Hall problem. Consider a Deal or No Deal game with three cases (similar to the three doors in the Monty Hall problem). The contestant has one case. Then, one of the two other cases is opened. Finally, the contestant is given the option to trade his or her case for the one unopened case remaining.

The Monty Hall problem gives the contestant a 2/3 chance of winning with a switch and a 1/3 chance of winning by keeping his or her case. However, there is a critical difference between Let's Make a Deal and Deal or No Deal. In the Monty Hall problem, the host uses his secret knowledge of what lies behind each of the three doors to ensure that a bad choice is always revealed. This non-random selection of a bad choice is what causes the difference in odds of winning between switching and not switching on Let's Make a Deal. In Deal or No Deal, the contestant chooses which case to open without knowing its contents, so the top prize is as likely to be eliminated as any other. Deal or No Deal therefore behaves like the Ignorant Monty Hall problem: given that a lesser prize happened to be revealed, switching and keeping each win with probability 1/2.
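The difference between the two games can be verified by exact enumeration. The sketch below is a simplified three-case model, not a description of either show's actual rules: the contestant holds case 0, one of cases 1 and 2 is opened, and the function (whose name is illustrative) returns the probability that switching wins the top prize, given that the opened case did not contain it.

```python
from fractions import Fraction

def switch_win_probability(host_knows):
    """Exact P(switching wins the top prize | a lesser prize was revealed)."""
    wins = Fraction(0)
    valid = Fraction(0)
    for top in range(3):          # location of the top prize, each with probability 1/3
        for opened in (1, 2):     # the contestant holds case 0
            if opened == top:
                # Informed host: never happens. Random opening: excluded by conditioning.
                continue
            if host_knows:
                # The host's choice is forced unless the contestant holds the top prize.
                weight = Fraction(1, 3) * (Fraction(1, 2) if top == 0 else 1)
            else:
                # A case is opened at random, each with probability 1/2.
                weight = Fraction(1, 3) * Fraction(1, 2)
            valid += weight
            other = 3 - opened    # the one case left unopened and not held
            if other == top:
                wins += weight
    return wins / valid

print(switch_win_probability(host_knows=True))   # 2/3 (Monty Hall)
print(switch_win_probability(host_knows=False))  # 1/2 (random opening, as in Deal or No Deal)
```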

Analyzing decision making under risk
A team of economists (Post, Van den Assem, Baltussen and Thaler) has analyzed the decisions of people appearing in Deal or No Deal and found, among other things, that contestants are less risk averse when they have seen their expected winnings tumble. "Losers" tend to continue playing the game even if this means rejecting bank offers in excess of the average of the remaining prizes. A separate experimental study with student subjects playing the game for scaled-down prizes reveals a similar pattern. The findings provide support for behavioral economists, who claim that classical expected utility theory falls short in explaining human behavior because it does not account for the context of decisions. The study of the four economists is unique because the underlying "experiment", Deal or No Deal, is characterized by high stakes, a transparent probability distribution, and only simple stop-go decisions that require minimal skill or strategy.

This particular study attracted some media attention in the United States, including coverage on the front page of the Wall Street Journal, January 12, 2006, and National Public Radio, March 3, 2006.