Final Exam

Directions:

• Answer the problems on the exam pages.
• There are eight problems for 120 total points plus 5 extra credit. Actual scale is A = 105, C = 70.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.
• The first six questions are true/false, with five points for the correct boolean answer and up to five for a correct justification.
• When the answer to a question is a number, you may give your answer in the form of an expression using arithmetic operations, powers, falling powers, or the factorial function. Probabilities may be given as either fractions or decimals.
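
For reference, the falling-power notation mentioned in the last direction is usually defined as below. This is the standard definition, stated here on the assumption that it matches the course's usage; the symbols n and k are illustrative.

```latex
n^{\underline{k}} \;=\; n(n-1)(n-2)\cdots(n-k+1) \;=\; \frac{n!}{(n-k)!},
\qquad\text{for example } 5^{\underline{3}} = 5\cdot 4\cdot 3 = 60 .
```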

```
Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 10 points
Q6: 10 points
Q7: 40+5 points
Q8: 20 points
Total: 120+5 points
```

• Question 1 (10): True or false with justification: Zane's Noodle Bowl has adopted a new policy for customers to choose from the thirteen kinds of vegetables they may put in their soup. A customer gets exactly five helpings of vegetables, and more than one helping may be of the same kind. (So, for example, one choice would be "two helpings of bean sprouts, two helpings of pea pods, and one helping of carrots".) Then of all the ways to choose five helpings of vegetables, over 25% have five different kinds of vegetables.

• Question 2 (10): True or false with justification: Before their game on 17 May 2009, the Toronto Blue Jays had won 23 games and lost 13, and the New York Yankees had won 16 games and lost 17. If we assume that the Blue Jays win each of their games independently with probability b (and lose with probability 1 - b), and that the Yankees win each of their games independently with probability y (and lose with probability 1 - y), then we may conclude from these results with confidence of 95% or more that b > 1/2, but we may not conclude with 95% confidence that y < 1/2.

• Question 3 (10): True or false with justification: Consider any two-player simultaneous-move game where Players A and B each have a choice of two options and there is a 2 × 2 matrix giving the payoff for A in each of the four possible situations. Then the optimal strategy for each player is a mixed strategy, where each option is taken with some positive probability.

• Question 4 (10): True or false with justification: Suppose I shuffle a standard 52-card deck and deal one card to you and one card to me. I offer you a bet where I will pay you \$16 if the two cards have the same rank, and you will pay me \$1 if they do not. Then this bet is "actuarially fair", meaning that its expected value for each of us is 0.

• Question 5 (10): True or false with justification: Suppose I throw a single fair six-sided die three times. Let A be the event that the first two throws produce two different numbers, and let D be the event that all three throws produce different numbers. Then Pr(D | A) is exactly twice as large as Pr(¬D | A).

• Question 6 (10): True or false with justification: In the example of Question 5, also let B be the event that the first and third throws give two different numbers, and let C be the event that the second and third throws give two different numbers. Then Pr(D) = Pr(A) + Pr(B) + Pr(C) - Pr(A ∩ B) - Pr(A ∩ C) - Pr(B ∩ C).

• Question 7 (40+5): Donna, a trombonist with the UMass Marching Band, has been assigned extra marching practice. She begins the drill facing north. Her instructor gives her a series of commands, each either L for "left face" or R for "right face". On L, she turns 90 degrees to the left 90% of the time and turns 90 degrees to the right 10% of the time. On R, she turns right 90% of the time and turns left 10% of the time. We are concerned only with the direction she is facing after such a series of commands.
• (a,5) What further assumptions do we need to model Donna's movements as a Markov Decision Process?
• (b,5) Draw a diagram of this MDP, with a state for each direction Donna might be facing.
• (c,10) For each state, compute the probability that Donna is in that state after the command sequence LLLL.
• (d,10) Estimate the probability that Donna is in each state after 100 commands, 50 of them L and 50 of them R. Justify your answer. Would the distribution be different for a different sequence of 100 commands?
• (e,10) Assume now that after a day of this practice, Donna's performance has improved so that the next day she turns correctly for an L command 95% of the time and for an R command 98% of the time. (She still turns in the opposite direction whenever she does not turn correctly.) For this new MDP, define a reward function for the instructor, so that he receives one point if Donna is facing in the direction she should be facing if she had responded correctly to all the commands she has been given, and no points if she is facing in some other direction. What policy should the instructor use to maximize his long-term expected reward? Justify your answer, but note that you need not (yet) calculate his expected reward for this policy.
• (f,5XC) Calculate the average reward per turn to the instructor in the steady state of the Markov chain arising from the optimal policy you found in part (e).
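
The setup of Question 7 can be explored numerically. The following is a minimal sketch, not a model answer. It assumes that the outcome of each turn depends only on the current direction and the command given, and that the 90%/10% figures from the problem statement apply to both commands. The names `DIRECTIONS`, `step`, and `distribution_after` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

# Compass directions listed clockwise, so a right turn moves one index forward.
DIRECTIONS = ["N", "E", "S", "W"]


def step(dist, command, p_correct=Fraction(9, 10)):
    """Advance a probability distribution over DIRECTIONS by one command.

    Assumption: each turn is independent of all earlier turns, and Donna
    turns the commanded way with probability p_correct, the other way otherwise.
    """
    if command not in ("L", "R"):
        raise ValueError("command must be 'L' or 'R'")
    p_right = p_correct if command == "R" else 1 - p_correct
    p_left = 1 - p_right
    new = [Fraction(0)] * 4
    for i, p in enumerate(dist):
        new[(i + 1) % 4] += p * p_right  # clockwise quarter turn
        new[(i - 1) % 4] += p * p_left   # counterclockwise quarter turn
    return new


def distribution_after(commands, start="N"):
    """Return {direction: probability} after a string of L/R commands."""
    dist = [Fraction(int(d == start)) for d in DIRECTIONS]
    for c in commands:
        dist = step(dist, c)
    return dict(zip(DIRECTIONS, dist))
```

Calling `distribution_after("LLLL")` gives exact fractions that can be compared with a hand calculation. Passing a different `p_correct` to `step` would be one way to experiment with the improved accuracies in part (e), though the sketch as written uses a single accuracy for both commands.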

• Question 8 (20): The police believe that a wanted suspect may be in a particular building, and they station detectives in four different locations to watch for him. They estimate that if the suspect is there (event S), each detective has a 40% chance of seeing movement inside the building. If the suspect is not there (event ¬S), each has a 10% chance of seeing movement. We first assume that the reports of the four detectives are conditionally independent with respect to S.
• (a,5) If R is the event that a particular single detective reports movement, calculate the likelihood ratios L(R | S) and L(¬R | S).
• (b,5) If one detective reports movement and the other three do not, does this make it more or less likely that the suspect is there?
• (c,5) Suppose that the initial estimate of the probability of the suspect's presence is 1%, and that the police only want to enter the building if it is more likely than not that he is there. How many of the four reports must be positive before this is true?
• (d,5) Explain what it means for the reports of the four detectives to be conditionally independent. Give an example of a situation where the probabilities given are correct, but the conditional independence assumption is not valid.
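
The odds-updating reasoning behind Question 8 can be sketched in code. This is an illustrative helper under stated assumptions, not an official solution. It takes the likelihood ratio of an observation to be Pr(observation | S) / Pr(observation | ¬S), which may differ from the course's exact notation, and it relies on the conditional independence assumption from the problem statement. The function name `posterior` and its parameters are hypothetical.

```python
from fractions import Fraction


def posterior(prior, positives, total=4,
              p_given_s=Fraction(4, 10), p_given_not_s=Fraction(1, 10)):
    """Posterior probability of S after `positives` of `total` reports are positive.

    Assumes the reports are conditionally independent given S and given not-S,
    so their likelihood ratios multiply.
    """
    prior = Fraction(prior)
    odds = prior / (1 - prior)                    # prior odds of S
    lr_pos = p_given_s / p_given_not_s            # ratio for a positive report
    lr_neg = (1 - p_given_s) / (1 - p_given_not_s)  # ratio for a negative report
    odds *= lr_pos ** positives * lr_neg ** (total - positives)
    return odds / (1 + odds)                      # convert odds back to probability
```

For example, `posterior(Fraction(1, 100), k)` for k from 0 to 4 shows how the estimate moves as more detectives report movement.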