(Swans - January 30, 2012) We are often caught in dilemmas, uncertain about choosing between two courses of action, and sometimes suspicious that "the game is rigged" so that whatever choice we make will benefit a behind-the-scenes controller. One interesting way of exploring this question is to formulate simple idealized situations, which can be taken as analogies to some of the real-world complexities in our lives, and analyze them with the Bayesian model of deliberation.
Thomas Bayes (1701-1761) was an English mathematician and Presbyterian minister who combined logic with the notion of probability as a partial belief instead of a frequency of occurrence, to formulate the topics in mathematics and epistemology now known as Bayesian probability and Bayesian deliberation, respectively.
The Prisoner's Dilemma
We begin by considering a classic example of Bayesian deliberation, the Prisoner's Dilemma. The statement of this problem is taken from the book The Logic of Decision by Professor Richard Jeffrey:
Two men are arrested for armed robbery. The police, convinced that both are guilty, but lacking sufficient evidence to convict either, put the following proposition to the men, and then separated them. If one man confesses but the other does not, the first will go free while the other will receive the maximum sentence of 10 years; if both confess, they will both receive light sentences of 5 years; and if neither confesses, they will both be imprisoned for jaywalking, vagrancy, and resisting arrest, with a total sentence of 1 year each. Why are the police convinced that both will confess, even though it would be better for both if neither confessed? (1)
First, we lay out a matrix showing the desirabilities of the four possible outcomes. Here, the desirabilities are quantified in years, with a negative sign to indicate imprisonment.
Prisoner's Dilemma | He confesses | He stays mum |
I confess | -5 years | 0, freedom! |
I stay mum | -10 years | -1 year |
Row 2 lists the two possible consequences that could follow after "I confess." Row 3 lists the two possible consequences that could follow should "I stay mum." Each prisoner must make this choice between the mutually exclusive courses of action of confessing or staying mum, while under a cloud of uncertainty as to what his partner in crime will do. Their fates are mutually contingent.
Let the letter P symbolize the probability that "he confesses." P is some number between 0 and 1. Since there is absolute certainty that the other guy will make one of the two choices, we can see that the probability that "he stays mum" is the expression (1-P). (2)
To quantify the "utility to the subject" (i.e., the benefit or cost to "me") of the occurrence of each of the four potential outcomes, we form a utility matrix. The utility of each outcome is the product of its desirability and its probability of occurrence.
Prisoner's Dilemma | He confesses | He stays mum |
I confess | -5P | 0(1-P) = 0 |
I stay mum | -10P | -1(1-P) |
If it is absolutely certain that he will confess (P=1), then the consequences I face are only those of the second column ("He confesses"); and the utility of my confession is a 5-year sentence, while the utility of my silence is a 10-year sentence.
If it is absolutely certain that he will remain silent (P=0), then the consequences I face are only those of the third column ("He stays mum"); and the utility of my confession is freedom, while the utility of my silence is a 1-year sentence.
Clearly, if I am certain my partner will confess then I have no choice but to do likewise and spend 5 years in prison. If I am certain my partner will not betray me then I can gain my freedom immediately by betraying him.
But, how do I judge if I believe the chances of my partner betraying me are uncertain; that probability P is somewhere between 0 and 1? I form expectations.
The expectation of a benefit or cost to me for choosing the course of action "confession" is a quantity that is contingent on my guess or estimate of the probability that my partner in crime will betray me. The expectation of benefit or cost for my staying mum is another similarly contingent quantity. The expectation expression for a course of action is formed by adding the utilities for that course of action:
E(I confess) = -5P
E(I stay mum) = -10P + -1(1-P) = -9P-1
Given a specific value of P, we can find specific expectation values for each of the pair of actions. The more favored course of action will have the higher expectation value (in the sense of the larger number in the positive direction).
Since P is an unknown number (a suspicion) between 0 and 1, we can write the following mathematical statements about the ranges of possible expectation values:
1 > P > 0, (3)
-5 < E(I confess) < 0, (4)
-10 < E(I stay mum) < -1, (5)
For example, at P=0.5, E(I confess)=-2.5, and E(I stay mum)=-5.5. As you can see, a prisoner who estimates his odds of being betrayed at 50% will find it better to confess.
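The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the dictionary DESIRABILITY and the function expectation are names chosen here for illustration, and the numbers are simply the desirabilities from the matrix above.

```python
# Desirabilities from the Prisoner's Dilemma matrix, in years (negative = prison).
# Each tuple is (he confesses, he stays mum). The names are illustrative only.
DESIRABILITY = {
    "I confess": (-5.0, 0.0),
    "I stay mum": (-10.0, -1.0),
}

def expectation(action, p):
    """Sum of the utilities for one action, where p = probability that he confesses."""
    if_confesses, if_mum = DESIRABILITY[action]
    return if_confesses * p + if_mum * (1.0 - p)

for action in DESIRABILITY:
    print(action, expectation(action, 0.5))
# I confess -2.5
# I stay mum -5.5
```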
Any mathematically-inclined prisoner who arrives at this point in his deliberations will then realize that perhaps one of the courses of action is more favored when the probability of being betrayed is low, and vice versa. Is there a crossover of expectations at some mid-range value of P? In mathematical terms, do the straight lines, which represent expectation value expressions (-5P and -9P-1, respectively) plotted against P from 0 to 1, cross? If so, at what value of P, and which are the favored choices on either side of that value?
Let us assume that there is a value of P where E(I confess) and E(I stay mum) have equal expectation value (where the two E vs. P lines cross);
assuming E(I confess) is equal to E(I stay mum), then:
-5P =? -9P-1,
(we put a question mark next to the equal sign to remind us that we are testing the validity of the statement,)
and simple algebra produces P =? -1/4.
Since a probability cannot be negative, the two lines never cross for P between 0 and 1, so one course of action always has the higher expectation value regardless of the probability. In fact, that course of action is "I confess."
If we plot the expectation values against P from 0 to 1, we find that E(I confess) plunges from 0 (at P=0) to -5 (at P=1), while E(I stay mum) plunges from -1 to -10. The descending line representing E(I stay mum) vs. P is always below the descending line representing E(I confess) vs. P.
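One way to confirm that the two lines never cross in the allowed range is to solve for the crossing point exactly. The sketch below assumes each expectation is written as slope × P + intercept; the helper name crossover is hypothetical, chosen for this illustration.

```python
from fractions import Fraction

def crossover(slope_a, intercept_a, slope_b, intercept_b):
    """P at which two lines slope*P + intercept are equal, or None if parallel."""
    if slope_a == slope_b:
        return None
    return Fraction(intercept_b - intercept_a, slope_a - slope_b)

# E(I confess) = -5P + 0 and E(I stay mum) = -9P - 1
p_cross = crossover(-5, 0, -9, -1)
print(p_cross)             # -1/4
print(0 <= p_cross <= 1)   # False: no crossing for any valid probability
```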
Though the most equitably desirable outcome is obtained by both partners keeping mum, the non-zero possibility that either might betray the other makes confessing the logical choice for both. (6)
Perhaps because the situations, or games that frame our real-life choices, can be too complex compared to the Prisoner's Dilemma to see clearly, with many contingencies and numerous possible courses of action not all mutually exclusive, this simple example of Bayesian deliberation is valuable for showing us that, yes, games can be constructed to funnel our "free will" to the benefit of outside controllers, and that escaping the intellectual prison of the game may require transcending the level of thinking or self-interest that the game designer presumes we will restrict ourselves to.
Captives can escape the Prisoner's Dilemma by displaying unshakeable solidarity. The police who designed this game presumed that such highly honorable behavior is unlikely among the suspected criminals they arrest. Such escape through transcendence has occurred often in history, for example when the suspects were political criminals like the men and women of the Resistance to the Nazi Occupations in Europe between 1940 and 1945. In such cases "escape" is sometimes at best metaphorical, as men and women have been known to suffer death by torture in order to not betray their comrades to the political police of an oppressive regime.
We must realize that both the desirabilities and probabilities we use in a Bayesian deliberation are usually subjective: we choose based on our estimate of the odds for the occurrence of the contingencies, weighted by our preferences.
Now, let us consider three idealized Bayesian deliberations that suggest parallels to present day American society: Black Friday, Debt Affluence, and Lesser Evil Voting.
Black Friday
You are convinced that a number of items being offered for sale on the Friday after Thanksgiving, "Black Friday," the first day of the Christmas shopping season, are absolutely essential to possess. Do you camp out Thanksgiving night by the store's entrance, and bring an aerosol can of pepper spray to be able to fight your way through the competition to get ahead, or do you wait, even for a day, assuming supplies will last?
You draw up the following desirability matrix, using values of +10 for complete satisfaction, 0 for a neutral attitude, and -10 for absolute disappointment.
Black Friday | 1st day a mob scene | 1st day is calm |
I rush | +1, "Get them while | +10, "I got to pick first" |
I wait | -10, "I missed out" | -1, "Fewer choices of |
Setting P as the probability for a first day mob scene, we can form the following utility matrix.
Black Friday | 1st day a mob scene | 1st day is calm |
I rush | +1P | +10(1-P) |
I wait | -10P | -1(1-P) |
The expectations are:
E(I rush) = -9P+10
E(I wait) = -9P-1
For P from 0 (no mob scene) to 1 (mob scene assured): E(I rush) varies from +10 to +1, and E(I wait) varies from -1 to -10. Clearly, E(I rush) is greater than E(I wait) for any value of P.
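A quick numerical check of this dominance, as a sketch only: the function name rush_minus_wait is made up for this illustration, and the grid of P values is an arbitrary choice.

```python
from fractions import Fraction

def rush_minus_wait(p):
    """E(I rush) - E(I wait) for the Black Friday matrix; p = probability of a mob scene."""
    e_rush = 1 * p + 10 * (1 - p)
    e_wait = -10 * p + -1 * (1 - p)
    return e_rush - e_wait

# Exact fractions avoid floating-point rounding on the grid P = 0, 1/10, ..., 1.
gaps = [rush_minus_wait(Fraction(i, 10)) for i in range(11)]
print(all(gap == 11 for gap in gaps))  # True
```

Both expectation lines have the same slope (-9), so the gap between them stays at 11 for every P.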
If you, the shopper, are absolutely convinced you must have this thing, and exactly in the form or color you want it in, then you must camp out and fight your way through the first day mob. If the suppliers were to have a sufficiently large stock of the desirable item, then there would be little penalty to waiting till later in the shopping season (which would be reflected in the desirability matrix by increasing the -10 value to something perhaps between -2 and +2 to reflect the possible reduction of selection, for example for color, by shopping at a later time).
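To see how a larger stock would soften the penalty for waiting, one can recompute the gap between the two expectations with the -10 entry replaced by milder values. This is a sketch under assumed numbers: d_missed stands for the revised desirability of "I wait" on a mob-scene day, and the trial values are picked from the range mentioned above purely for illustration.

```python
def wait_penalty(d_missed, p):
    """E(I rush) - E(I wait) when the 'I wait / mob scene' cell has desirability d_missed."""
    e_rush = 1 * p + 10 * (1 - p)
    e_wait = d_missed * p + -1 * (1 - p)
    return e_rush - e_wait

# p = 1 is the case where a first-day mob scene is certain.
for d_missed in (-10, -2, 0, 2):
    print(d_missed, wait_penalty(d_missed, 1))
# -10 11
# -2 3
# 0 1
# 2 -1
```

With these assumed values, the advantage of rushing shrinks from 11 to 3 or less when a mob is certain, and at +2 it turns slightly negative.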
The supplier could make similar calculations for his own purposes: limiting the supplies available on Black Friday to ensure a "rush" psychology on the part of buyers (indoctrinated to desire the item by previous advertising), but being careful to set that quantity at the optimum between the competing requirements of maximizing sales (bigger supply) and ensuring the rush (smaller supply).
Maintaining your dignity while Christmas shopping will often mean waiting; and transcending this game entirely is simply not allowing yourself to be indoctrinated to want stuff.
Debt Affluence
The economy is growing at a fever pitch, and people are making fortunes by speculating on the stock market and in real estate. However, to make significant profits one has to invest large amounts of money so the expenses, fees, and taxes attendant to trading do not eat up the meager profits that come to small accounts. Savvy traders are borrowing against the equity in their homes to accumulate large bundles of cash they can invest in high yield and high risk stocks, and real estate speculation. This borrowing also maintains affluent lifestyles: vacation homes, elite college educations for their children, fine German automobiles with leather upholstery, and many accessories and occasions for entertainment. Do you jump into this game, or do you miss out on a good life today and a fortune tomorrow by being too cautious about losing any of your family's modest savings?
Let us measure desirability in units of years of income at the subject's present rate of earning (we presume at a regular middle-class job), with a plus sign for outcomes that boost gains, and a minus sign for outcomes that diminish family wealth from what it could have been (not from what it is now) given a stable economy. Consider the following debt affluence desirability matrix.
Debt Affluence | economy just grows | economy crashes |
I am frugal | -5, "I missed out on the boom" | -1, "I didn't lose any principal" |
I borrow to gamble | 0, "I'm keeping up" | -10, "I'm bankrupt" |
Designating P as the probability of an economic crash, we arrive at the following utility matrix and expectation value expressions.
Debt Affluence | economy just grows | economy crashes |
I am frugal | -5(1-P) | -1P |
I borrow to gamble | 0(1-P) = 0 | -10P |
E(I am frugal) = 4P-5
E(I borrow to gamble) = -10P
For P ranging from 0 (endless growth) to 1 (certainty of a crash), E(I am frugal) varies from -5 to -1, and E(I borrow to gamble) varies from 0 to -10. The two expectation expressions have the same value for P = 5/14, or a 36% probability of a crash.
For the person reflected by the desirability matrix of this example, if the probability of a crash is less than 36% (5/14) then it is more advantageous to gamble with borrowed money; if the probability of a crash is over 36% then it is wiser to be frugal and live within present means.
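The break-even probability can be recovered directly from the desirability matrix. The following is a minimal sketch: break_even is a hypothetical helper, and each row is given as (desirability if the economy crashes, desirability if it just grows).

```python
from fractions import Fraction

def break_even(row_a, row_b):
    """P at which two actions have equal expectation.

    Each row is (desirability if the event with probability P occurs,
    desirability otherwise). Returns None if the two lines are parallel.
    """
    x_a, y_a = row_a
    x_b, y_b = row_b
    slope_gap = (x_a - y_a) - (x_b - y_b)
    if slope_gap == 0:
        return None
    return Fraction(y_b - y_a, slope_gap)

frugal = (-1, -5)   # (economy crashes, economy just grows)
borrow = (-10, 0)
p_star = break_even(frugal, borrow)
print(p_star)  # 5/14
```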
A wise investor would be a person who did not assume that his or her estimate of P, the probability of an economic crash, was extremely precise. Rather than choosing between the two courses of action on the basis of refining an estimate of P down from, say, between 30% and 40%, it would be wiser to make the selection based on sound economic reasoning that showed P to be either close to 0 or close to 1. Certainly, people with greater expertise may have the ability to arrive at more precise estimates of P, but all human beings can be swept away by their fantasies, and those who imagine themselves experts are most easily toppled by their hubris.
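The fragility of a mid-range estimate can be illustrated by asking which action is preferred at a few trial values of P. This is a sketch only: the function name preferred and the trial probabilities are assumptions made for this example.

```python
def preferred(p_crash):
    """Action with the higher expectation for the debt affluence matrix."""
    e_frugal = -5 * (1 - p_crash) + -1 * p_crash
    e_borrow = 0 * (1 - p_crash) + -10 * p_crash
    # An exact tie is resolved in favor of borrowing, an arbitrary choice.
    return "I am frugal" if e_frugal > e_borrow else "I borrow to gamble"

for p_crash in (0.05, 0.30, 0.40, 0.95):
    print(p_crash, preferred(p_crash))
# 0.05 I borrow to gamble
# 0.3 I borrow to gamble
# 0.4 I am frugal
# 0.95 I am frugal
```

The preferred action flips between 30% and 40%, so an estimate known only to lie in that band does not settle the choice, while estimates near 0 or 1 do.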
Notice that the most desirable outcome (desirability = 0) requires a large dose of wishful thinking (the economy only grows) and performing the continuous work of being a debtor-speculator surfing this supposedly never-ending economic wave.
People who wanted to insulate themselves as much as possible from the damage of an economic crash would have chosen to be frugal in the years before the crash. They might lose a little from what the economy would have been had a crash not occurred, for example by a reduction in home values, or lowered interest rates on savings accounts, or a loss of customers to a business. Frugality can be thought of as a business expense for self-insurance against complete lifestyle collapse.
As people become less attached to the idea of acquiring a fortune, and more allergic to the prospect of bankruptcy, they become increasingly happy with the choice of frugality. They liberate themselves from the debt affluence game by living debt-free, and transcending the desire for affluence.
Voting for the Lesser Evil
Now, we consider a model of determining how to vote in a hypothetical American presidential election, from the perspective of a very liberal voter.
Our voter finds that the Green Party platform fully captures his/her beliefs about how national affairs should be managed. This person has always voted for Democratic Party candidates, identifying with the feeble left wing of that party, but has become disenchanted with the rightward drift of the party since the Clinton Administration on through these first three years of the Obama Administration. Additionally, this voter is appalled at the damage Republican politicians have caused the country since the Reagan Administration, and wants to ensure his/her vote is most effectively deployed to keep them out of office. Voting for the Green Party would help it grow so it might achieve 5% popular representation nationally and thus qualify for matching funds from the Federal Election Commission four years later. Then, Green Party candidates would have a real chance of winning higher offices. However, this voter wants to be assured that any support given to the Green Party will not weaken the Democratic Party to the point of throwing the national election to the Republicans.
Our mathematically-inclined Green-Democrat might quantify political desirabilities in units of years; each desirability being made up of three components: Republican, Democratic, and Green, assigned as follows.
One year of a Republican administration equals two years of national damage. One year of Republicans out of office or in the minority equals two years of national recovery from the damage they caused when last in power.
Today's Democratic Party counters much of the damage the Republicans try to cause, and maintains many of the remaining programs from the glory days of social democracy, but it is no longer an enthusiastic engine of national rejuvenation. So, the Democratic Party is seen as having a nearly neutral effect when in power, and is capable of maintaining its strength when marginally out of power. But, it would suffer drastic decay (on a yearly basis) if simultaneously undermined by the growth of third parties, and overwhelmed by Republican electoral victories.
A clear increase in the number of Green Party voters since the last presidential election is taken to equal one year of improvement for the political attitude of the nation; a clear decrease over the four-year span is counted as a one-year loss of political power for the Green-Democratic point of view.
A desirability matrix reflecting this Green-Democratic voter's perspective for the given hypothetical presidential election would look as follows.
Desirability Matrix | Romney wins | Obama wins |
Vote Obama | Republican = -8 (total = -9) | Republican = +8 (total = +7) |
Vote Green | Republican = -8 (total = -11) | Republican = +8 (total = +9) |
Designating P as the probability that Romney wins, the utility matrix for this voter is the following.
Lesser Evil Voting | Romney wins | Obama wins |
Vote Obama | -9P | +7(1-P) |
Vote Green | -11P | +9(1-P) |
The expectations are:
E(Vote Obama) = -16P+7
E(Vote Green) = -20P+9
These expectations have the same value when P= 1/2 (50%).
If this voter believes that Romney's chances of winning are less than 50% then he/she will be more satisfied voting for the Green Party. Conversely, if this voter believes that Romney has a greater chance of winning than Obama, then he/she would feel far less guilty afterward by staying loyal to the Democratic Party.
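The same rule can be written as a small decision function. This is a sketch, not a model of any real election: DESIRABILITY holds the total desirabilities implied by the utility matrix, and best_vote is a name invented for this illustration.

```python
# Total desirabilities per outcome, as (Romney wins, Obama wins).
DESIRABILITY = {
    "Vote Obama": (-9, 7),
    "Vote Green": (-11, 9),
}

def best_vote(p_romney):
    """Action with the higher expectation, given the believed chance that Romney wins."""
    def expectation(action):
        if_romney, if_obama = DESIRABILITY[action]
        return if_romney * p_romney + if_obama * (1 - p_romney)
    return max(DESIRABILITY, key=expectation)

print(best_vote(0.4))  # Vote Green
print(best_vote(0.6))  # Vote Obama
```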
Recalling the discussion of the supplier's strategy in the Black Friday example, it is interesting to consider similar calculations on the part of Democratic Party strategists. If they thought they had a large population of voters with an outlook similar to our Green-Democratic person, they might try to advance the perception that Romney had close to a 50% chance of winning (but without overstating it and shaking Democrats' confidence), even if their scientific polling showed Romney lagged badly, because such a perception would keep their left-leaning Democrats safely under the Party leadership's control.
The best way to transcend the game of voting for the lesser evil is to look beyond voting as the only way to solve the nation's problems.
Revising Your Preferences
When using Bayesian logic to help in your deliberations, you quickly find that revising your preferences, or desirabilities, is the surest way of eliminating uncertainty. Simply put, a person clear about their commitments and willing to accept the costs of maintaining them will always see the right choice to make.
In the case of our conflicted Green-Democrat voter, the uncertainty about voting Democrat or voting Green arises because their prime motivation is a negation, to prevent something they dislike, rather than an affirmation, to promote something they believe in.
A committed Green Party voter affirms her desire for the growth of her party, and accepts the unavoidable and necessary costs of that desire: that Republicans and Democrats will continue their duopoly of misrule for years to come. Such Green Party voters are committed to ending the period of "duoligarchic" misrule, by contributing to the growth of the Green Party so it can eventually displace our current ancien régime.
Similarly, a committed left-wing Democratic Party voter accepts the belief that it is always preferable to have Democrats than Republicans in office, and he accepts the cost for maintaining this belief, which is the abandonment of his ideological convictions by not voting Green nor causing dissension in the Democratic Party from its left.
If American voters were to truly affiliate and vote affirmatively, rather than from habit and fear, the monolithic Democrat-Republican oligarchic duopoly would crumble from both the right and the left, reducing American politics to a parliamentary rubble. This would be a welcome development.
In mulling over your own Bayesian problems, you will more than likely clarify your preferences about the potential outcomes you face, and once you come to accept a particular commitment regarding them, you will feel quite satisfied at making what in retrospect was an easy choice.
This material is copyrighted, © Manuel García, Jr. 2012. All rights reserved.
About the Author
Manuel García, Jr. is a native of the upper upper west side barrio of the 1950s near Riverside Park in Manhattan, New York City, and a graduate engineering physicist who specialized in the physics of fluids and electricity. He retired from a 29-year career as an experimental physicist with the Lawrence Livermore National Laboratory, the first fifteen years of which were spent in underground nuclear testing. An avid reader with a taste for classics, and interested in the physics of nature and how natural phenomena can impact human activity, he has long been interested in non-fiction writing with a problem-solving purpose. García loves music and studies it, and his non-technical thinking is heavily influenced by Buddhist and Jungian ideas. A father of both grown children and a school-age daughter, today García occupies himself primarily with managing his household and his young daughter's many educational activities. García's political writings are left wing and, along with his essays on science-and-society, they have appeared in a number of smaller Internet magazines since 2003, including Swans. Please visit his personal blog at manuelgarciajr.wordpress.com.
Notes
1. Richard C. Jeffrey, The Logic of Decision, 1965, McGraw-Hill Book Company.
2. The probability that a choice will be made is the sum of the probabilities that either alternative will be selected, so: P + (1-P) = 1
3. The symbolic expression A>B states "A is greater than B." The symbolic expression A<B states "A is less than B." The "greater than" symbol, >, and the "less than" symbol, <, are defined from these descriptions of their use.
4. E(I confess) = 0 at P=0, and E(I confess) = -5 at P=1.
5. E(I stay mum) = -1 at P=0, and E(I stay mum) = -10 at P=1.
6. One year in prison for both men is seen as most equitably desirable, since the freedom of one purchased with the betrayal of the other, which would seem most desirable for one of them, may actually carry significant costs later, perhaps as a ruined reputation ending the ability to recruit confederates for new capers, and liability to vengeance.