Question text is in black, solutions in blue.
Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 30 points Q6: 30 points Total: 100 points
TRUE. For each of the first five positions, each card has an equal chance to be in that position after the shuffling, and so there is a 26/52 = 1/2 chance that the card in that position is red. It is true that the conditional probability of the second card being red is 25/51 if the first card is red and 26/51 if it is black. But the total probability of the second card being red, for example, is Pr(C1 = red)*Pr(C2 = red | C1 = red) + Pr(C1 = black)*Pr(C2 = red | C1 = black) = (1/2)(25/51) + (1/2)(26/51) = 1/2.
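The total-probability calculation above can be checked with exact arithmetic. This is only a minimal sketch; the variable names are illustrative, and it assumes a standard 52-card deck with 26 red and 26 black cards.

```python
from fractions import Fraction

# Law of total probability for the colour of the second card,
# assuming a well-shuffled deck of 26 red and 26 black cards.
p_first_red = Fraction(26, 52)
p_first_black = Fraction(26, 52)
p_second_red_given_red = Fraction(25, 51)    # one red card already gone
p_second_red_given_black = Fraction(26, 51)  # all red cards still present

p_second_red = (p_first_red * p_second_red_given_red
                + p_first_black * p_second_red_given_black)
print(p_second_red)  # 1/2
```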
TRUE. Here the probability can be determined as a product of conditional probabilities, Pr(0) * Pr(00 | 0) * Pr(000 | 00) * Pr(0000 | 000) * Pr(00000 | 0000) = (26/52)*(25/51)*(24/50)*(23/49)*(22/48). Since the first factor in this product is equal to 1/2 and the other four are each less than 1/2, the product is strictly less than (1/2)^5 = 1/32.
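As a quick check of the product above, the following sketch multiplies the five conditional probabilities exactly and compares the result with 1/32. The helper name `prob_first_k_same_colour` is hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def prob_first_k_same_colour(k, same=26, total=52):
    """Probability that the first k cards all have one given colour."""
    prob = Fraction(1)
    for i in range(k):
        # i cards of that colour have already been dealt
        prob *= Fraction(same - i, total - i)
    return prob

p5 = prob_first_k_same_colour(5)
print(p5)                    # 253/9996
print(p5 < Fraction(1, 32))  # True
```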
FALSE. The low-order digit of the square depends only on the low-order digit of the original number, which is equally likely to be any of the digits from 0 through 9. The low-order digit of the square is thus equally likely to be the low-order digit of 0^2, 1^2, 2^2, ..., 9^2, and thus has a 20% chance of being each of 1, 4, 6, or 9, a 10% chance of being each of 0 or 5, and a 0% chance of being 2, 3, 7, or 8.
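The distribution of the last digit can be tabulated directly. This sketch assumes, as the solution does, that the low-order digit of the original number is uniform over 0 through 9.

```python
from collections import Counter
from fractions import Fraction

# Count how often each digit appears as the last digit of d*d, d = 0..9.
counts = Counter((d * d) % 10 for d in range(10))

# Each original digit has probability 1/10; Counter returns 0 for absent digits.
dist = {digit: Fraction(counts[digit], 10) for digit in range(10)}

for digit in range(10):
    print(digit, dist[digit])
```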
FALSE. There are several ways to see this. The simplest is to note that the
same city name, such as St. Petersburg (FL) or Moscow (ID), might be on both
lists of training data. The NBC would give the same answer in both cases,
and one of these answers would be wrong. (The same phenomenon would occur
if some American city and some Russian city contained the same set of letters.)
More generally, the NBC only looks at the training instances in the
aggregate, not individually. If a particular letter occurs in exactly one
American city and all ten Russian cities, the NBC will take the evidence of
that letter as a strong indication for that American city being Russian, and
this could easily overwhelm any other evidence and lead to a wrong answer.
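The first argument can be illustrated with a toy classifier. This is a rough sketch, not the classifier from the question: it assumes the features are "letter present in the name", uses add-one smoothing and equal priors, and the tiny training lists are hypothetical. Because the classifier sees only the features, the two training instances named Moscow must receive the same label, so at least one is misclassified.

```python
import math
import string

def letter_set(name):
    """Feature vector (assumed): the set of letters appearing in the name."""
    return {c for c in name.lower() if c in string.ascii_lowercase}

def train(cities_by_class):
    """Estimate Pr(letter present | class) with add-one smoothing."""
    model = {}
    for label, cities in cities_by_class.items():
        n = len(cities)
        model[label] = {
            letter: (sum(letter in letter_set(c) for c in cities) + 1) / (n + 2)
            for letter in string.ascii_lowercase
        }
    return model

def classify(model, name):
    """Return the class with the larger log-likelihood (equal priors assumed)."""
    present = letter_set(name)
    scores = {
        label: sum(math.log(probs[l] if l in present else 1 - probs[l])
                   for l in string.ascii_lowercase)
        for label, probs in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical training data; "Moscow" appears on both lists.
training = {
    "American": ["Moscow", "Boston", "Denver"],
    "Russian": ["Moscow", "Kazan", "Omsk"],
}
model = train(training)

predicted = classify(model, "Moscow")  # same answer for both instances
true_labels = ["American", "Russian"]
errors = sum(predicted != t for t in true_labels)
print(predicted, errors)
```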
Let S be the event that the sample print comes from the suspect. We have that for each of these features Fi, Pr(Fi | S) = 0.7 and Pr(Fi | ¬S) = 0.1, so L(Fi | S) = 0.7/0.1 = 7. We take the original estimate Pr(S) = 0.01, compute O(S) = 1/99, and multiply these odds by the six likelihood ratios to get O(S|e) = 7^6/99 = 117649/99. Since this is greater than 99, the posterior probability is greater than 0.99 and the analyst can make the requested conclusion.
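The odds update can be reproduced with exact fractions. This sketch takes 0.7 as the chance that a feature appears in a print from the suspect and 0.1 as the chance that it appears in a random person's print; the helper names `to_odds` and `to_prob` are illustrative only.

```python
from fractions import Fraction

def to_odds(p):
    """Convert a probability to odds in favour."""
    return p / (1 - p)

def to_prob(odds):
    """Convert odds in favour back to a probability."""
    return odds / (1 + odds)

prior = Fraction(1, 100)                              # Pr(S)
likelihood_ratio = Fraction(7, 10) / Fraction(1, 10)  # L(Fi | S) = 7

# Naive Bayes assumption: the six features are independent given S or not-S,
# so the six likelihood ratios simply multiply.
posterior_odds = to_odds(prior) * likelihood_ratio ** 6
posterior = to_prob(posterior_odds)

print(posterior_odds)                     # 117649/99
print(posterior > Fraction(99, 100))      # True
```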
We multiply the prior odds by 7 for each feature that occurs, and by
L(¬Fi|S) = Pr(¬Fi|S)/Pr(¬Fi|¬S)
= 0.3/0.9 = 1/3 for each of the six features that does not occur.
If none of the features occur, we multiply by (1/3)^6 = 1/729 < 1.
If exactly one feature occurs, we multiply by 7*(1/3)^5 = 7/243
< 1.
If exactly two features occur, we multiply by
7^2*(1/3)^4 = 49/81 < 1.
If exactly three features occur, we multiply by
7^3*(1/3)^3 = 343/27 > 1.
So three or more of the six features are sufficient to increase the odds of
S.
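The case analysis above can be tabulated for every possible count of observed features. This is a small sketch under the same numbers (a factor of 7 per feature present, 1/3 per feature absent); the variable names are illustrative.

```python
from fractions import Fraction

LR_PRESENT = Fraction(7)     # Pr(Fi | S) / Pr(Fi | not S) = 0.7 / 0.1
LR_ABSENT = Fraction(1, 3)   # Pr(not Fi | S) / Pr(not Fi | not S) = 0.3 / 0.9
NUM_FEATURES = 6

# Overall odds multiplier when exactly k of the six features occur.
multipliers = {
    k: LR_PRESENT ** k * LR_ABSENT ** (NUM_FEATURES - k)
    for k in range(NUM_FEATURES + 1)
}

# Smallest number of observed features that raises the odds of S.
min_features = min(k for k, m in multipliers.items() if m > 1)

for k, m in multipliers.items():
    print(k, m, "raises odds" if m > 1 else "lowers odds")
print(min_features)  # 3
```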
She would need to multiply the odds computed by her first NBC by the likelihood ratios for the other 44 features: by L(Fi|S) for every Fi that occurs in the sample print, and by L(¬Fi|S) for every Fi that does not occur. To compute these likelihood ratios, we need, for each new feature Fi, an estimate of the probability Pr(Fi|S) that a print from the suspect has this feature, and of the probability Pr(Fi|¬S) that a random person's print has this feature.
The event tree for the punt has a root node labeled "Colts score?", with
a left child labeled "yes, Colts win" with probability p and a right child labeled "no, Patriots win"
with probability 1-p. The probability that the Patriots win is the probability
that we reach the leaf in which they win, which is 1-p.
The event tree for going for the first down has a root labeled
"First down?". Its left child is a leaf, labeled "yes, Patriots win", with probability q.
The right child of the root
is labeled "no, Colts score?" and has two children of its own. The probability
of the right child of the root is 1-q. The left child of the right child is
a leaf labeled "yes, Colts win" and has probability r if the right child of
the root is reached, or total probability (1-q)r. The right child of the right
child is labeled "no, Patriots win" and has probability 1-r if the right child
of the root is reached, or total probability (1-q)(1-r). The total probability
that the Patriots win is the sum of the total probabilities for the leaves where
they do so, or q + (1-q)(1-r).
Yes, with those assumptions going for the first down was correct. The probability of winning given a punt was 1-p = 0.7, while the probability given the first down attempt was q + (1-q)(1-r) = 0.6 + (0.4)(0.5) = 0.8.
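The two event trees reduce to two short formulas, which the sketch below evaluates with the values used above (p = 0.3, q = 0.6, r = 0.5). The function names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def win_prob_punt(p):
    """Patriots win after a punt unless the Colts score (probability p)."""
    return 1 - p

def win_prob_go(q, r):
    """Patriots win by making the first down (q), or by failing (1-q)
    and then stopping the Colts from scoring (1-r)."""
    return q + (1 - q) * (1 - r)

p = Fraction(3, 10)
q = Fraction(6, 10)
r = Fraction(5, 10)

punt = win_prob_punt(p)
go = win_prob_go(q, r)
print(punt, go)  # 7/10 4/5
```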
With q = 0.6, the two winning probabilities are equal if 1 - p = 0.6 + 0.4(1-r) = 0.6 + 0.4 - 0.4r, that is, if p = 0.4r. If p is less than 0.4r, then punting is the better choice, and if p is greater than 0.4r, going for the first down is the better choice. Note that if Peyton Manning is so much better than an ordinary quarterback that p increases from 0.3 to 0.4, going for the first down is at least as good as punting even if the Colts were certain to score from Patriots territory (if r = 1), since in that case the two choices are exactly tied.
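The break-even condition can be explored numerically. This sketch fixes q = 0.6 as above and compares the two winning probabilities for a given p and r; `break_even_p` and `better_choice` are illustrative names, not part of the original solution.

```python
from fractions import Fraction

Q = Fraction(6, 10)  # probability of making the first down, as assumed above

def break_even_p(r, q=Q):
    """Value of p at which punting and going for it are equally good."""
    return (1 - q) * r

def better_choice(p, r, q=Q):
    """Compare 1 - p (punt) against q + (1 - q)(1 - r) (go for it)."""
    punt = 1 - p
    go = q + (1 - q) * (1 - r)
    if punt > go:
        return "punt"
    if go > punt:
        return "go for it"
    return "tie"

print(break_even_p(Fraction(1, 2)))                    # 1/5
print(better_choice(Fraction(3, 10), Fraction(1, 2)))  # go for it
print(better_choice(Fraction(4, 10), Fraction(1)))     # tie
```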
Last modified 21 November 2009