Question text is in black, solutions are in blue.
Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 30 points Q6: 30 points Total: 100 points
FALSE. Consider X = Y = {0,1} and let the channel always reproduce the input as the output. If X always sends 0, then H(X) = H(Y) = H(X,Y) = 0 because there is only one possible input, occurring with probability 1, and so -log Pr(X=a) is always 0. Thus I(X,Y) = 0 + 0 - 0 = 0. But if X sends 0 or 1 each with probability 1/2, then H(X) = H(Y) = H(X,Y) = 1 (since -log Pr(X=a), etc., are always 1), and thus I(X,Y) = 1 + 1 - 1 = 1. The mutual information is different for the two distributions on X.
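Both cases can be checked numerically. A short sketch (the helper name mutual_information is our own; it computes I(X,Y) = H(X) + H(Y) - H(X,Y) from a joint distribution):

```python
import math

def mutual_information(joint):
    """I(X,Y) = H(X) + H(Y) - H(X,Y), for joint given as {(x, y): prob}."""
    def H(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p   # marginal distribution of X
        py[y] = py.get(y, 0) + p   # marginal distribution of Y
    return H(px) + H(py) - H(joint)

# Noiseless channel, input always 0: all three entropies are 0.
print(mutual_information({(0, 0): 1.0}))               # 0.0
# Noiseless channel, uniform input: H(X) = H(Y) = H(X,Y) = 1.
print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))  # 1.0
```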
TRUE. Proof by contrapositive: if the code were not able to detect up to 2t errors per block, there would have to be two code words within Hamming distance 2t of each other, so that one code word could be transformed into another by at most 2t bit errors. But in that case, with x and y code words at most 2t apart, there must be a single word z that is at most distance t from both x and y (form z from x by changing t of the bits in which x and y differ). If a recipient receives z and knows only that there are at most t errors, they still cannot tell whether x or y was the code word sent.
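The construction of z can be sketched directly. The words x and y below are a hypothetical example at distance exactly 2t, chosen for illustration:

```python
def hamming(x, y):
    """Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

def midpoint(x, y, t):
    """Form z from x by flipping t of the bits in which x and y differ."""
    z = list(x)
    flipped = 0
    for i, (a, b) in enumerate(zip(x, y)):
        if a != b and flipped < t:
            z[i] = b
            flipped += 1
    return ''.join(z)

x, y, t = '0000000', '1111000', 2   # hypothetical code words at distance 4 = 2t
z = midpoint(x, y, t)
print(hamming(z, x), hamming(z, y))  # 2 2 -- z is within t of both words
```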
TRUE. H(X,Y) must be at least as great as H(X), because describing Y
along with X cannot decrease the total uncertainty. (More formally, the
expected value of log (1/Pr(x,y)) must be at least as great as the expected
value of log (1/Pr(x)) because Pr(x,y) can never be greater than Pr(x).)
Similarly, H(X,Y) must be greater than or equal to H(Y). Adding these two
inequalities, we get that 2H(X,Y) ≥ H(X) + H(Y), and subtracting H(X,Y)
from both sides gives us H(X,Y) ≥ H(X) + H(Y) - H(X,Y) = I(X,Y).
Alternatively, after noting that H(X,Y) ≥ H(X) we can recall that we
proved H(X) ≥ I(X,Y) using Jensen's Inequality, and our desired inequality
follows by transitivity.
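The inequality H(X,Y) ≥ I(X,Y) can also be sanity-checked numerically over random joint distributions. A sketch (the helper names H and check are our own, and the 2×2 case is an arbitrary choice):

```python
import math
import random

def H(probs):
    """Entropy in bits of a probability list."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def check(trials=1000):
    """Verify H(X,Y) >= I(X,Y) on random 2x2 joint distributions."""
    random.seed(1)
    for _ in range(trials):
        w = [random.random() for _ in range(4)]
        joint = [p / sum(w) for p in w]          # normalize to a distribution
        px = [joint[0] + joint[1], joint[2] + joint[3]]
        py = [joint[0] + joint[2], joint[1] + joint[3]]
        mutual = H(px) + H(py) - H(joint)
        if H(joint) < mutual - 1e-9:             # allow for rounding error
            return False
    return True

print(check())  # True
```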
FALSE. By sending digits in blocks we can save bits over sending each digit separately. For example, we could use two-digit blocks and send each block with seven bits because 100 < 128 = 2^7. This allows us to send n digits in 7n/2 < 4n bits. Similarly, we could send blocks of three digits using ten bits each (since 1000 < 1024 = 2^10), taking 10n/3 bits to send n digits. By using larger and larger blocks, we could get arbitrarily close to (log 10)n ≈ 3.32n bits, since a source where each digit is equally likely has an entropy of log 10.
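The convergence toward log 10 bits per digit can be tabulated. A sketch (the helper name bits_per_digit is our own):

```python
import math

def bits_per_digit(block):
    """Bits per digit when one block of `block` decimal digits is sent as binary."""
    return math.ceil(math.log2(10 ** block)) / block

for k in (1, 2, 3, 10, 100):
    print(k, bits_per_digit(k))
# block sizes 1, 2, 3 give 4.0, 3.5, 3.333... bits per digit,
# approaching log2(10) = 3.3219... from above as the block grows
```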
We just pick a two-bit sequence for each of the four letters, and send the n letters in 2n bits.
We have four letters of weights 3/8, 3/8, 1/8, and 1/8. We first combine C and G into a group of total weight 2/8. Then we combine this group with, say, T to get a group of total weight 5/8. Finally we combine this group with A to get a group of total weight 1. The eventual code might have A = 0, C = 100, G = 101, and T = 11 (other specific codes are possible). The average length of a code word, and hence the expected number of bits needed to send a letter, is 1*(3/8) + 2*(3/8) + 3*(2/8) = 15/8 = 1.875.
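The grouping steps above can be sketched as a standard Huffman construction (the helper huffman_lengths is our own; ties among equal weights may break differently than in the grouping described, but the multiset of code-word lengths and the 15/8 average come out the same):

```python
import heapq

def huffman_lengths(weights):
    """Code-word lengths from the standard Huffman merge of a weight list."""
    heap = [(w, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, g1 = heapq.heappop(heap)   # two lightest groups
        w2, g2 = heapq.heappop(heap)
        for i in g1 + g2:
            lengths[i] += 1            # every member gains one code bit
        heapq.heappush(heap, (w1 + w2, g1 + g2))
    return lengths

weights = [3/8, 3/8, 1/8, 1/8]   # A, T, C, G
lengths = huffman_lengths(weights)
avg = sum(w * l for w, l in zip(weights, lengths))
print(sorted(lengths), avg)   # lengths 1, 2, 3, 3 and average 15/8 = 1.875
```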
The entropy, by the definition, is (3/8)log(8/3) + (3/8)log(8/3) + (1/8)log(8) + (1/8)log(8) = (3/4)(3 - log 3) + 3/4 = (3/4)(4 - 1.585) = 1.811. (The given estimate of 1.6 for log(3) yields an answer of 1.8.)
As n and k increase, the average number of bits needed approaches the entropy from above. Thus the total number of bits needed approaches 1.811n from above.
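The entropy figure can be confirmed directly from the definition:

```python
import math

# Letter probabilities from the problem: A and T with 3/8, C and G with 1/8.
probs = [3/8, 3/8, 1/8, 1/8]
entropy = -sum(p * math.log2(p) for p in probs)
print(round(entropy, 3))  # 1.811, just below the Huffman average of 1.875
```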
There are 2^4 = 16 total strings of length 4. Two of them (0000 and 1111) are in L(R). Five (0000, 0001, 0011, 0111, and 1111) are in L(S). Four (0000, 0011, 1100, and 1111) are in L(T).
For any positive n there are exactly two length-n strings in L(R), 0^n
and 1^n.
There are n+1 length-n strings in L(S), because the number
of 0's can be any integer from 0 through n.
Finally there are 2^(n/2)
length-n strings in L(T) when n is even (and none when n is odd), because
such a string consists of n/2 substrings, each of the form 00 or 11.
We thus need one bit to specify a string in L(R), log(n+1) bits (rounded up)
to specify a string in L(S), and n/2 bits (rounded up) to specify a string in
L(T).
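The length-4 counts can be verified by enumeration. A sketch, with membership tests written from the descriptions of the three languages above (all 0's or all 1's for L(R); all 0's preceding all 1's for L(S); a concatenation of 00 and 11 blocks for L(T)):

```python
from itertools import product

def in_R(s):
    return s == '0' * len(s) or s == '1' * len(s)

def in_S(s):
    return '10' not in s   # no 1 may ever be followed by a 0

def in_T(s):
    return len(s) % 2 == 0 and all(s[i] == s[i + 1] for i in range(0, len(s), 2))

n = 4
strings = [''.join(bits) for bits in product('01', repeat=n)]
counts = (sum(map(in_R, strings)), sum(map(in_S, strings)), sum(map(in_T, strings)))
print(counts)  # (2, 5, 4), matching 2, n+1, and 2^(n/2) for n = 4
```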
A linear code or subspace is a nonempty set of strings such that for any two
strings x and y in the set, the string x+y (the bitwise XOR of x and y) is
also in the set.
The n-length strings of L(R) form
a code because adding two equal strings gives
0^n, and adding two unequal strings gives 1^n.
The n-length strings of L(S) do not form
a code for n ≥ 2, because the sum of 01^(n-1) and
1^n is 10^(n-1), which is not in L(S). (For n = 1 the set of L(S)
strings equals the set of L(R) strings and thus forms a code.)
The n-length strings of L(T) do not form a code for odd n because the set is
empty. But for even n they do form a
code. Let x and y be two strings in the set. Then each of the n/2
two-letter segments of x+y is the sum of two elements
from {00,11} and is thus also in this set -- hence x+y is in L(T).
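The closure arguments for n = 4 can be checked exhaustively. A sketch (the helper names xor and is_linear are our own):

```python
from itertools import product

def xor(x, y):
    """Bitwise XOR of two equal-length bit strings."""
    return ''.join('1' if a != b else '0' for a, b in zip(x, y))

def is_linear(code):
    """A nonempty set of strings is a linear code iff it is closed under XOR."""
    return len(code) > 0 and all(xor(x, y) in code for x in code for y in code)

n = 4
strings = [''.join(b) for b in product('01', repeat=n)]
R = {s for s in strings if s in ('0' * n, '1' * n)}
S = {s for s in strings if '10' not in s}
T = {s for s in strings if all(s[i] == s[i + 1] for i in range(0, n, 2))}
print(is_linear(R), is_linear(S), is_linear(T))  # True False True
```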
Last modified 23 April 2007