CMPSCI 240: Reasoning About Uncertainty

David Mix Barrington

Fall, 2009

Questions and Answers on Homework Assignment #8

HW#8 is due on paper either in class or to the CMPSCI main office by 4:00 p.m., Friday 11 December 2009.

Question text is in black, my answers in blue.

Question 8.1, posted 9 December: Even with the correction, I still don't think 12.5.5 matches what happens on Who Wants to Be a Millionaire?, and I'm wondering if you've still got it wrong. On the show, if the contestant misses one of the first five questions, they go home with no money at all. But you have them getting "2ⁱ thousand or 32 thousand, whichever is smaller". I buy that for i ≥ 5, because once they get to 32 thousand they get to keep it. But shouldn't it be just 0 if they miss one of the first five? In your version they have no incentive to quit in the first five questions.
You are right. What I meant was that if they quit, they get 2ⁱ thousand dollars, and if they miss a question, they get 0 if i < 5 and 32 thousand dollars if i ≥ 5. (Here "i" is the number of questions they have answered correctly.) I'm very sorry for the delay in fixing this (you can blame the flu), and we will accept answers for either the version described in the correction to the HW#7 assignment or the (more interesting) one described here.
Note that if p = 0, she should take her $1000 from the start and not risk any questions, and if p = 1 she should always try the next question, since she will get all ten right and win $1,024,000. So as p varies, the correct strategy varies. Start with p = 0.5 and see what she should do, then see how this changes for other values of p.
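One way to explore how the strategy changes with p is to work backward from the last question. The sketch below is only an illustration, not the required solution method. It assumes that p is the same for every question, that amounts are in thousands of dollars, and that the payoffs are the ones described in this answer. The function name best_strategy and its parameters are made up for this example.

```python
def best_strategy(p, n_questions=10, safe_level=5):
    """Backward induction over i, the number of questions answered so far.

    Assumptions (for illustration): each question is answered correctly
    with the same probability p, quitting with i correct pays 2**i
    thousand, and a miss pays 0 if i < safe_level, else 2**safe_level
    thousand.

    Returns (value, attempt), where value[i] is the best expected payoff
    in thousands with i correct answers, and attempt[i] is True when
    trying the next question has strictly higher expected payoff than
    quitting.
    """
    value = [0.0] * (n_questions + 1)
    attempt = [False] * n_questions
    # With every question answered there is nothing left to decide.
    value[n_questions] = float(2 ** n_questions)
    for i in range(n_questions - 1, -1, -1):
        quit_value = float(2 ** i)
        miss_value = 0.0 if i < safe_level else float(2 ** safe_level)
        try_value = p * value[i + 1] + (1 - p) * miss_value
        attempt[i] = try_value > quit_value  # ties are resolved as "quit"
        value[i] = max(quit_value, try_value)
    return value, attempt


if __name__ == "__main__":
    for p in (0.0, 0.3, 0.5, 0.7, 1.0):
        value, attempt = best_strategy(p)
        print(p, value[0], attempt)
```

Running it for several values of p shows which decisions flip from "quit" to "try" as p grows. The two extreme cases should agree with the remarks above about p = 0 and p = 1.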
Note also that we are maximizing the expected value of the player's winnings, in dollars, not the player's utility. In effect we are assuming that the player is already very wealthy, so that for example $1,024,000 is worth exactly twice as much to her as is $512,000. When Millionaire was first designed, the problem was to give enough incentives for a player to be willing to risk a half-million to get a million -- hence the phone-a-friend, the 50-50, the poll-the-audience, and the chance to see the question before deciding whether to try. In addition, on the real show the questions get harder as you move on.
This guy was actually one of my roommates in my first year of grad school at MIT, though we have not been in touch since well before he became famous.
Question 8.2, posted 9 December: This is actually about the project rather than the homework. You ask us to calculate the steady-state probability by taking the vector v = [1,0,0,...,0] and multiplying it by M^t for some large t, large enough that we are close to the steady state. I thought of a different idea that should be faster -- take v and keep assigning vM to v until it stops changing. If n is the number of states, I'm only doing O(n²) operations each time instead of O(n³), and I'm still getting the right answer, right? Is it ok if I do this?
Sure, the idea was to compute the steady state by both Monte Carlo simulation and by matrix multiplication, and you are certainly doing the latter. Whether your method is faster depends on the relationship between t and n. You have to do t matrix-vector multiplications, but I can get away with only O(log(t)) matrix-matrix multiplications if I use repeated squaring. So it is really a matter of O(n²t) versus O(n³log(t)), either of which might be better in some circumstances.
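To make the comparison concrete, here is a minimal sketch of both approaches in plain Python. It is not the project's required code. The function names, the tolerance, and the two-state example matrix are all made up for illustration, and M is assumed to be a row-stochastic matrix stored as a list of lists.

```python
def vec_mat(v, M):
    """Row vector times matrix: O(n^2) operations."""
    n = len(v)
    return [sum(v[i] * M[i][j] for i in range(n)) for j in range(len(M[0]))]


def mat_mat(A, B):
    """Matrix times matrix: O(n^3) operations."""
    return [vec_mat(row, B) for row in A]


def steady_state_iterate(M, tol=1e-12, max_steps=100000):
    """Repeatedly replace v by vM until no entry changes by tol or more.

    Returns the vector and the number of multiplications performed.
    """
    v = [1.0] + [0.0] * (len(M) - 1)
    for step in range(1, max_steps + 1):
        nxt = vec_mat(v, M)
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt, step
        v = nxt
    return v, max_steps


def mat_power(M, t):
    """Compute M^t by repeated squaring: O(log t) matrix products."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = M
    while t > 0:
        if t & 1:
            result = mat_mat(result, base)
        base = mat_mat(base, base)
        t >>= 1
    return result


def steady_state_squaring(M, t):
    """v = [1,0,...,0] times M^t is just row 0 of M^t."""
    return mat_power(M, t)[0]


if __name__ == "__main__":
    # Hypothetical two-state chain, used only as an example.
    M = [[0.9, 0.1],
         [0.5, 0.5]]
    print(steady_state_iterate(M))
    print(steady_state_squaring(M, 64))
```

For a small chain like this one, both functions should agree to many decimal places. The step count returned by steady_state_iterate gives a rough idea of how large t needs to be, which is the quantity that decides between the O(n²t) and O(n³log(t)) costs.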

Last modified 9 December 2009