Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 10 points Q7: 40 points Q8: 20 points Q9: 30 points Total: 150 points
TRUE. The probability is the number of no-repeat sequences divided by the total number of sequences, which is 104 divided by 104, or 10*9*8*7/10000 = 0.504, greater than 1/2.
TRUE. We know that the Huffman algorithm, run on the input distribution given by the frequency of letters in the input string, will produce a tree that has the best possible expected output length for that distribution. But the expected output length on this distribution is exactly the output length of that tree on this input. So the Huffman tree is the one we want, and the algorithm (since it is well-specified and deterministic) can in principle be implemented on a Turing machine.
TRUE. The equivocation H(Y|X) is the entropy of Y when X is known. But if we know X, Y is necessarily 6 - X and has the constant distribution and an entropy of zero.
FALSE. If i and j are two different natural numbers, the strings u = ai and v = aj are Q-distinguishable, because if we let w = ci, then uw is in Q but vw is not. Thus there are infinitely many Myhill-Nerode classes for Q, and we can conclude that Q has no DFA and thus (by Kleene) no regular expression.
TRUE. We could build a Turing machine to count the a's, count the c's, compare the numbers, print Y if they are equal, and print N otherwise. If we actually had to build the Turing machine, it would be easier to have it repeatedly scan the input string, looking for the first a and the first c and changing both to b's if it finds them. It would print Y if the string ever became all b's, and print N if it ever finds an a but no c, or a c but no a, on a single scan.
FALSE. If M is any Turing machine at all, the language L(M) = {w: M halts on input w} is a Turing recognizable language. Thus R is just the set of strings that are valid Turing machine descriptions. When we specify a system for encoding Turing machines as strings, we do so in a way that there is an algorithm (and thus a Turing machine) to decide whether a string is a valid description.
We get the same state set, the three former letter-moves, and three new letter-moves: (1,a,2), (1,b,2), and (1,b,3). (These result because each of the three old letter-moves start at 2 and thus may start at 1 or 2 in the new NFA.) The final state set does not change because there is no λ-path in N from the start state to a final state.
The start state is {1} which is non-final. On a, {1} goes to {2}, which is non-final. On b, {1} goes to {2,3} which is final. On a, {2} goes to itself. On b, {2} goes to {2,3}. On a, {2,3} goes to {2}. On b, {2,3} goes to itself. We have a completed DFA with three states.
We start with classes N and F -- since F has only one state we are done with it. N has two states {1} and {2}. Each goes to N on a and to F on b. So this two-class partition is the final partition and we have a minimal DFA with two states N and F, start state N, only final state F, and transition function δ(N,a) = N, δ(N,b) = F, δ(F,a) = N, and δ(F,b) = F.
We add a start state I and a final state Z to D', make state F non-final, and add λ-moves from I to N and from F to Z. We first eliminate F, which has one edge into it and two out of it. The two new edges are (N, bb*a, N) (which merges with the existing N-loop to make (N, a + bb*/sup>a, N)) and (N, bb*, Z). Finally we eliminate N to get an r.e.-NFA with one transition, labeled "(λ)(a + bb*a)*bb*". The expression (a + bb*a)* is equivalent to Σ*a, the langauge of all strings ending in a, along with λ. Thus this entire language is all strings that end in an a or are λ, followed by one or more b's. This is just the set of all strings ending in b, or Σ*b -- this is also easy to see from the DFA D'.
The first bit is equally likely to be 0 or 1, so its entropy is 1. The later bits have 1/4 probability of one value and 3/4 of the other, so their entropy is (1/4) (-log 1/4) + (3/4)(-log 3/4) = (1/4)(2) + (3/4)(2 - log 3) = 2 - 3(log 3)/4 = (about) 2 - (3/4)(1.6) = 0.8. The joint entropy of the first n bits is the sum of the entropies of the independent variables, which is 1 + (0.8)(n-1) or 0.8n + 0.2.
The first of the two bits is equally likely to be 0 or 1, and the second is equal to the first with 3/4 probability. So 00 and 11 have probability 3/8 each, and 01 and 10 have probability 1/8. The entropy is (3/8)(-log 3/8) + (3/8)(-log 3/8) + (1/8)(-log 1/8) + (1/8)(-log 1/8) or (3/4)(3 - log 3) + (1/4)(3) = 3 - 3(log 3)/4 or about 1.8.
The n two-bit letters will require an expected 1.8 bits each, for 1.8n total bits. This contrasts with the 2n bits needed to send them literally. The joint entropy of the 2n bits is only about 1.6n by (a), so there ought to be a code to send them more efficiently.
Last modified 15 May 2007