Q1-Q5: 10 points each. Q6-Q8: 25 points each. Total: max 100 points.
Define OBL-P to be the set of languages A such that A = L(M) for some oblivious Turing machine with a polynomial time bound. Prove carefully that OBL-P = P. (This result is in the book, of course, but you must present it rather than quote it!) (Notes added during exam: You may use without proof the result that a k-tape TM may be simulated by a one-tape TM with polynomial time overhead. "P" is defined in terms of multitape TM's.)
First, it is obvious that OBL-P is contained in P because the oblivious
poly-time machine is also an ordinary multitape poly-time machine.
Let X be an arbitrary language in P, and let M be an arbitrary multitape
Turing machine, running in time p(n) on all inputs of size n, such that X =
L(M). Let M' be a single-tape machine equivalent to M (so that X = L(M')),
running in time p'(n) on inputs of length n; from homework we know that this
M' exists with p' a polynomial. We will build an oblivious machine O such
that X = L(O) and O still runs in polynomial time.
The machine O will have three tapes -- one to simulate the tape of M' with
an additional marker on the tape for the location of the simulated head of M',
one counter for how many steps of M' have been simulated, and one additional
counter. To simulate step i of the computation of M', O makes two passes over
the first i cells of the M' tape, from cell i leftward to cell 1 and back.
(It uses the additional counter to know when to turn around.) Since the head of M' moves
only one cell per step, this pass must observe the M' head on the way left and
pass it again on the way back. On the way left O observes the letter under
the M' head, and on the way back right it implements the step of M'. Since the
step of M' may involve moving right by one cell, the entire leftward sweep
is done with two steps left followed by one step right each time. When O has
finished the i'th step, it increments the M'-step counter. Clearly O
simulates M', and clearly O is oblivious because the head movements during step i are governed
solely by the number i, not by the input (not even by its length). Simulating step i of
M' takes O(i) time, so the total time to simulate all p'(n) steps is
O((p'(n))^2), still a polynomial.
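The movement pattern and its cost can be sketched in Python. This is a simplified trace that ignores the two-left-one-right detail of the leftward sweep; the names `sweep` and `total_cost` are illustrative, not from the solution.

```python
def sweep(i):
    """Cells visited by O while simulating step i of M': a pass from
    cell i left to cell 1 and back.  The pattern depends only on i,
    never on the tape contents -- exactly the obliviousness property."""
    return list(range(i, 0, -1)) + list(range(2, i + 1))

def total_cost(p):
    """Total head movements needed to simulate p steps of M'."""
    return sum(len(sweep(i)) for i in range(1, p + 1))

# Step i costs 2i - 1 = O(i) movements, so p steps cost p^2 in total.
print(sweep(3))        # [3, 2, 1, 2, 3]
print(total_cost(10))  # 100
```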
Clearly CKT-SAT is in the class NP, because the string y can serve as a
certificate for the membership of (C, x) in CKT-SAT. Given C, x, and y,
checking the validity of the certificate just means evaluating C on input
(x,y), which takes time polynomial in the size of C and thus polynomial in the
input size.
We reduce 3-SAT to CKT-SAT. Given a 3-CNF formula φ with inputs
z_1, ..., z_k, we build a circuit using two binary OR gates
to compute the value of each clause and a binary tree of binary AND gates to
compute the AND of all the clauses. We let this circuit be C, let x be
the empty string, and let y be the string
z_1z_2...z_k, so that n = 0 and m = k. Clearly
there exists a y making C(x, y) = 1 iff φ is satisfiable, and clearly the
computation of (C, x) from φ can be carried out in polynomial time.
Since CKT-SAT is in NP and a known NP-complete language reduces to it, it is
NP-complete.
We use the Savitch middle-first search construction on the configuration
graph of the log-space machine. This would work even if the machine were
nondeterministic. Many people got the basic definitions wrong here -- several
showed that log-space could be simulated with log width, for example.
Let M be a deterministic Turing machine using read/write
space O(log n) on inputs of length n, in addition to its read-only input.
A configuration of M on input x consists of the work
tape contents (O(log n) bits),
the machine state (O(1) bits), and the input head position (O(log n) bits), for
a total of O(log n) bits. There are thus n^{O(1)} (not n, as many of you wrote)
possible configurations. The graph has an edge from u to v whenever v is the
configuration that follows from u by the rules of M and the content of the input
x. The input is in L(M) if and only if there is a path in this graph from the
start configuration to the accepting configuration. (We modify M to make the
accepting configuration unique.)
In the Savitch construction, we have a node for every question of the form
PATH(u, v, 2^i), which asks whether there is a path of length at most
2^i from node u to node v. We pick a number k such that 2^k
is at least the number of configurations (so k = O(log n)) and note that x is
in L(M) iff PATH(start, accept, 2^k) is true. We then have to connect these
nodes. The fundamental observation of middle-first search is that PATH(u, v,
2^i) is true iff there exists a node w such that PATH(u, w,
2^{i-1}) and PATH(w, v, 2^{i-1}) are both true. In our circuit,
then, we make the node for PATH(u, v, 2^i) an OR gate with a
binary AND-gate child for each w, whose two grandchildren are the nodes for
PATH(u, w, 2^{i-1}) and PATH(w, v, 2^{i-1}). For the nodes
PATH(u, v, 1) we have constant gates with value 0 or 1, or literals
x_j or ¬x_j, depending on whether
u = v or there is an edge from u to v in the configuration graph -- this
may depend on the input bit being viewed by the input read head in that
configuration.
The size of the circuit is polynomial because there are only polynomially many
choices of u, v, and w, O(log n) choices of i, and a constant number of gates
for each such choice. The depth is O(log n) because it is O(k).
To prove log-space uniformity we must explain how a single log-space
Turing machine can produce the circuit for input length n, given n in unary. It
can do this by cycling through all the configurations u, v, and w and all the
numbers i ≤ k, building the nodes for PATH(u, v, 2^i) and the
associated nodes and edges. It only needs to remember three node numbers
while it is doing this, which is possible in O(log n) space. Examining the
relevant configurations is enough to determine whether there is an edge
between any two given nodes in the circuit.
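The middle-first recursion that the circuit unrolls can be sketched directly as a recursive procedure. This is a minimal illustration on a toy graph, not the circuit construction itself.

```python
def path(adj, u, v, i):
    """PATH(u, v, 2**i): is there a path of length at most 2**i from u
    to v?  The i = 0 base case mirrors the PATH(u, v, 1) gates; the
    recursive case is the OR over all w of the two half-length queries."""
    if i == 0:
        return u == v or adj[u][v]
    return any(path(adj, u, w, i - 1) and path(adj, w, v, i - 1)
               for w in range(len(adj)))

# Toy 4-node "configuration graph": 0 -> 1 -> 2 -> 3.
adj = [[False] * 4 for _ in range(4)]
for a, b in [(0, 1), (1, 2), (2, 3)]:
    adj[a][b] = True

print(path(adj, 0, 3, 2))  # True: a path of length 3 <= 2**2 exists
print(path(adj, 3, 0, 2))  # False: no backward edges
```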
It is not such a family. For a uniform random a, the probability that h_a(b) = x is exactly 1/p, since b plus a uniform random a is equally likely to be any number in Z_p. If the family were pairwise independent, the events h_a(b) = x and h_a(b') = x' (for b ≠ b') would be independent, meaning that the probability that both happen would be the product of the two individual probabilities, or 1/p^2. But for any given b, b', x, and x' with b ≠ b', the probability for a uniform random a that both h_a(b) = x and h_a(b') = x' is either 1/p (if x - b = x' - b', so that one particular a makes both events happen) or 0 (if x - b ≠ x' - b', so that no a can make both happen). In either case the probability is not 1/p^2, and thus the family is not pairwise independent. A family h_{a,b} with h_{a,b}(x) = ax + b (modulo p) would be pairwise independent, by exactly the reasoning used in the book in Chapter 8 -- that reasoning only required that the linear equations be over a finite field.
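The two joint probabilities can be checked by brute force over a small prime modulus. The helper names here are illustrative.

```python
p = 7  # small prime standing in for the modulus

def joint_shift(b, bp, x, xp):
    """Pr over uniform a in Z_p that h_a(b) = x and h_a(b') = x',
    for the shift family h_a(y) = y + a mod p."""
    hits = sum(1 for a in range(p)
               if (b + a) % p == x and (bp + a) % p == xp)
    return hits / p

# x - b == x' - b': exactly one a works, so the probability is 1/p.
print(joint_shift(0, 1, 3, 4))   # 1/7, not 1/49
# x - b != x' - b': no a works.
print(joint_shift(0, 1, 3, 5))   # 0.0

def joint_affine(x, xp, y, yp):
    """Same joint probability for the affine family ax + b mod p."""
    hits = sum(1 for a in range(p) for b in range(p)
               if (a * x + b) % p == y and (a * xp + b) % p == yp)
    return hits / p ** 2

# For distinct inputs, exactly one (a, b) pair works: probability 1/p^2.
print(joint_affine(0, 1, 3, 5))  # 1/49
```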
Prove that the Toffoli gate is a valid quantum operation because its matrix is unitary (i.e., it satisfies the rule AA^T = I where A^T is the transpose of A). (Hint: Find the inverse of the operation and argue from there. You can solve this problem with or without working with any specific 8 by 8 matrices.)
The Toffoli operation is a permutation of the eight pure states, so its matrix is a permutation matrix, with exactly one 1 in each row and in each column. The transpose of a permutation matrix is another permutation matrix, for the inverse permutation. Since the Toffoli permutation moves |110> to |111>, moves |111> to |110>, and keeps the other six pure states fixed, it is its own inverse and thus the matrix is also its own transpose. So if A is this matrix, AA^T = AA = I, the identity matrix, and the matrix is unitary.
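The matrix facts used in this argument can be verified numerically; this sketch builds the 8 by 8 permutation matrix and checks both claims.

```python
# The Toffoli permutation on the 8 basis states: swap |110> and |111>.
perm = {s: s for s in range(8)}
perm[0b110], perm[0b111] = 0b111, 0b110

# Its permutation matrix: A[row][col] = 1 iff the gate sends |col> to |row>.
A = [[1 if perm[col] == row else 0 for col in range(8)] for row in range(8)]

# The permutation is its own inverse, so A equals its own transpose...
At = [[A[j][i] for j in range(8)] for i in range(8)]
print(A == At)  # True

# ...and A * A^T is the identity matrix.
AAt = [[sum(A[i][k] * At[k][j] for k in range(8)) for j in range(8)]
       for i in range(8)]
print(AAt == [[int(i == j) for j in range(8)] for i in range(8)])  # True
```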
What is the result of applying a Toffoli gate to a register with state that is the sum, over all a, b, and c in {0,1}, of (1/√8)|abc>? What is the probability of observing each pure state if this register is observed after the Toffoli gate is applied?
This state is taken to itself by the Toffoli operation -- each pure state is the image of exactly one pure state, so it gets the coefficient that the other state had, which in each case is 1/√8. When we observe the register, for each pure state the probability of observing it is (1/√8)^2 = 1/8.
What is the result of applying a Toffoli gate to a register with state (1/2)(|000> + |011> + |101> + |110>)? What is the probability of observing each pure state if this register is observed after the Toffoli gate is applied?
By applying the transformation to each component we get a new state of (1/2)(|000> + |011> + |101> + |111>). When we observe the register, we have a (1/2)^2 = 1/4 chance of observing each of the pure states in this latter sum, and a zero chance of observing each of the other four states.
ATIME(f) is contained in DSPACE(f) because we can evaluate the game tree of
a game played in O(f) time with a deterministic machine using O(f) space.
To evaluate the winner from a configuration of the ATM, we recursively evaluate
both successor configurations and then find the optimal move. This requires
O(1) space to remember which successor we are evaluating, and the result of the
first evaluation while we are evaluating the second. Since the depth of
recursion is O(f), we need total space O(f) for our stack plus O(f) space
to store one or two configurations of the ATM at a time.
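The recursive evaluation described here can be sketched as a depth-first game-tree walk; the toy game and the names below are illustrative, not part of the solution.

```python
def eval_atm(bits, existential, depth, accept):
    """Depth-first evaluation of an ATM game tree with binary moves.
    Space use is one stack frame per level, i.e. O(depth), matching
    the space bound in the argument above."""
    if depth == 0:
        return accept(bits)
    results = (eval_atm(bits + (b,), not existential, depth - 1, accept)
               for b in (0, 1))
    return any(results) if existential else all(results)

# Toy 2-round game: the existential player picks b1, the universal picks b2.
# "Exists b1 forall b2: b1 OR b2" is won by choosing b1 = 1.
print(eval_atm((), True, 2, lambda bs: bool(bs[0] | bs[1])))  # True
# "Exists b1 forall b2: b1 XOR b2" is lost: b2 can always copy b1.
print(eval_atm((), True, 2, lambda bs: bool(bs[0] ^ bs[1])))  # False
```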
DSPACE(f), and even NSPACE(f), are contained in ATIME(f^2)
because we can play the Savitch game (as in Question 3) on the configuration
graph of the deterministic or nondeterministic machine. Since that machine
uses space O(f), it has 2^{O(f)} configurations and the game takes
O(f) rounds. Each round involves the White player naming a configuration,
taking O(f) time, and the Black player naming a bit, taking O(1) time, so the
total time is O(f^2).
ATIME(f) is contained in DSPACE(f) by (a), DSPACE(f) is strictly contained in DSPACE(g) by Space Hierarchy because f = o(g), and DSPACE(g) is contained in ATIME(g^2) by (a). So ATIME(f) is strictly contained in ATIME(g^2).
Let X be an arbitrary language in ATIME(g ∘ f) -- we must show that X is also contained in ATIME(f). Let Y be the set of strings w0^{f(n)} for all strings w of length n in X. The language Y is in ATIME(g), because given any string of the form w0^{f(n)} we can extract w and then play the game to decide whether w is in X. This game takes O(g(f(n))) time, which is O(g(m)) time where m is the length of w0^{f(n)}, the input we are testing for membership in Y. By hypothesis, Y is also in ATIME(n). But then we can decide whether w is in X with a game using time O(f(n)) -- we just form the string w0^{f(n)} and play the linear-time game on this new string to determine whether it is in Y. Linear time in the length of the new string is just O(f) time.
Assume the contrary, that for some fixed ε we have ATIME(n) = ATIME(n^{1+ε}). Then we can use (c) and induction to prove that ATIME(n) = ATIME(n^{(1+ε)^k}) for any positive integer k, noting that if g(n) = n^{1+ε} and f(n) = n^{(1+ε)^k}, then g(f(n)) = n^{(1+ε)^{k+1}}. If we choose k greater than 1/log_2(1+ε), a constant, we get (1+ε)^k > 2 and thus ATIME(n) = ATIME(n^{2+δ}) for some positive constant δ. Now setting f(n) = n and g(n) = n^{1+δ/2}, we have a contradiction to part (b).
BPP is contained within PSPACE, because a deterministic poly-space machine can
simulate a probabilistic poly-time machine on all possible random sequences,
calculate the probability that the probabilistic machine will accept the input,
and give its output based on whether this is ≥ 2/3 or ≤ 1/3.
In turn, PSPACE is contained within alternating polynomial time by the
Savitch argument summarized in Question 6(a). A deterministic (or even
nondeterministic) machine operating in space p(n) can be simulated by an ATM
game played in O(p^2(n)) time, using the Savitch game to determine
whether a path exists in the first machine's configuration graph.
The Sipser-Gacs theorem does also
imply that BPP is in alternating polynomial
time, if you can prove it, since Σ_2^p consists of
the languages of poly-time ATMs with limits on their alternation. Its proof
is considerably more complicated than the Savitch argument.
Let A be the BPP language, and let B be a deterministic poly-time machine
that takes an input x (of length n)
and a random sequence y and simulates the probabilistic
polynomial-time machine for A, so that x is in A if and only if B(x, y) = 1
for at least a 2/3 fraction of the y's, and x is not in A if and only if
B(x, y) = 1 for at most a 1/3 fraction of the y's. By amplification, we can
create another poly-time machine B' so that if x is in A, B'(x, z) = 1 for at
least a 1 - 2^{-2n} fraction of the z's, and if x is not in A, B'(x, z) = 1
for at most a 2^{-2n} fraction of the z's.
For a uniformly chosen random z, the expected fraction of inputs x on which
x is in A if and only if B'(x, z) = 1 is thus at least 1 - 2^{-2n}.
By an averaging argument, then, at least one z must achieve at least
this fraction of correct answers, and the only way a fraction with
denominator 2^n (the number of possible x's) can be
at least this great is to be equal to 1. Therefore there exists a z such that
the machine that inputs x and returns B'(x, z) is a decider for A. By a
standard construction based on the proof of the Cook-Levin theorem, we can
take a deterministic poly-time machine (even one that takes polynomial advice)
and create a poly-size circuit family.
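The counting behind the averaging step can be illustrated with small numbers; the concrete figures and the random "error table" below are purely illustrative.

```python
import random
random.seed(1)

n = 4
X, Z = 2 ** n, 2 ** (2 * n)   # 16 inputs x, 256 amplified random strings z

# After amplification, each x is misclassified on at most a 2^-2n
# fraction of the z's -- here at most Z * 2^-2n = 1 bad z per x,
# chosen at random for illustration.
bad = [{random.randrange(Z)} for _ in range(X)]

# Union bound: at most X = 16 < 256 = Z strings z are bad for ANY x,
# so some z is correct on every input of length n.
bad_somewhere = set().union(*bad)
good = [z for z in range(Z) if z not in bad_somewhere]
print(len(bad_somewhere) <= X)  # True
print(len(good) >= Z - X)       # True: a good advice string z exists
```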
The proof I had in mind is as follows.
The Karp-Lipton Theorem says that if NP is contained in P/poly, then the
hierarchy collapses (to Σ_2^p). If NP and BPP
are the same class, then NP is contained in P/poly because BPP is
(unconditionally) contained in P/poly by part (b). So under this assumption,
the hierarchy collapses.
Many people tried to use Sipser-Gacs, claiming a collapse from the fact that
NP is contained in Σ_2^p. But this containment is
true unconditionally, since NP equals Σ_1^p, and does
not imply any collapse.
At least one student found a valid alternative proof that is simpler than
mine. BPP is (unconditionally) closed under complement -- if M is a
probabilistic poly-time machine proving that A is in BPP, then the machine
that simulates M and reverses its answer proves that A's complement is in BPP.
If NP = BPP, then, NP is closed under complement, and thus the classes
Σ_1^p and Π_1^p are equal,
and this collapses the hierarchy to these classes. (You can prove by
induction on i that each class Σ_i^p is equal to NP.)
Let A be an arbitrary language in BPP. Using a PRG with stretch 2^n, we can take a seed of length k = Θ(log n) and produce a pseudorandom string long enough to run M, the probabilistic poly-time machine for A. By the definition of security, for any fixed x, the probabilities that M accepts x on a truly uniform random sequence and on a pseudorandom sequence differ by a negligible function of k. The truly random probability for each x must be either ≥ 2/3 or ≤ 1/3, and since a negligible function of a number that is Θ(log n) is eventually smaller than any constant, the pseudorandom probability is either ≥ 3/5 or ≤ 2/5. In deterministic polynomial time, we can run M on the pseudorandom sequences from every possible seed of length k, and calculate this latter probability exactly. If we accept whenever it is greater than 1/2, we always get the correct answer for membership in A. Hence A is in P under the assumption that this PRG exists.
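The seed-enumeration step can be sketched as follows; `prg` is a stand-in for the assumed generator and `M` for the BPP machine, both hypothetical.

```python
def derandomized(M, x, k, prg):
    """With k = O(log n), the 2**k seeds can be enumerated in polynomial
    time and the acceptance fraction of M computed exactly; accept iff
    that fraction exceeds 1/2."""
    accepts = sum(M(x, prg(seed)) for seed in range(2 ** k))
    return accepts / 2 ** k > 1 / 2

# Trivial illustration: a "machine" that ignores its random string.
M = lambda x, r: x % 2 == 1
print(derandomized(M, 5, 3, lambda seed: seed))  # True
print(derandomized(M, 4, 3, lambda seed: seed))  # False
```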
Let X be the set of strings that are produced by the PRG for some seed. X is clearly an NP language, so by the assumption it is in P. Let A be a poly-time algorithm that returns 1 on w if and only if w is in X. Now consider running A on pseudorandom or on truly random strings of length s(n). The probability that it returns 1 on a pseudorandom string is 1, since all pseudorandom strings are in X. The probability that it returns 1 on a truly random string is at most 2^n/2^{s(n)}, because there are only 2^n seeds of length n from which to produce pseudorandom strings, and this number is at most 1/2 because s(n) > n. So the probabilities differ by at least 1/2, which is certainly a non-negligible function, and thus the PRG is not secure.
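The counting in this distinguisher can be checked with toy numbers: seeds of length n = 4 and outputs of length s(n) = 8. The generator `toy_prg` is an arbitrary injective expander, certainly not a secure PRG.

```python
toy_prg = lambda seed: (seed * 17) % 256   # injective on seeds 0..15

range_set = {toy_prg(s) for s in range(16)}   # at most 2^n = 16 strings

def A(w):
    """The range test: an NP property, in P under the assumption NP = P."""
    return w in range_set

pseudo = sum(A(toy_prg(s)) for s in range(16)) / 16   # always accepts
truly = sum(A(w) for w in range(256)) / 256           # at most 2^n / 2^s(n)
print(pseudo)                 # 1.0
print(truly <= 16 / 256)      # True
print(pseudo - truly >= 0.5)  # True: a non-negligible gap
```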
Fix a plaintext string x and fix any such function A. Let B be the algorithm that takes a string z of length L(n) and runs A on x ⊕ z. B is clearly a probabilistic poly-time algorithm. B on a truly random string is the same as A operating on a one-time pad encoding of x, and we proved on the homework that no possible such A can predict any bit of x with probability greater than 1/2. B on a pseudorandom string is exactly A operating on E_k(x) for a uniform random key k. By the security of the PRG, the probability that B accepts in this case differs only negligibly from the probability that B accepts in the truly random case. So the former probability, which is the probability that A predicts a bit of x correctly given E_k(x), is at most 1/2 plus a negligible function.
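The reduction from A to B, and the one-time-pad fact it relies on, can be sketched on 4-bit strings; `make_B` is an illustrative name.

```python
def make_B(A, x):
    """B runs the predictor A on x XOR z.  On a truly random z this is a
    one-time-pad ciphertext of x; on a pseudorandom z it is E_k(x)."""
    return lambda z: A(x ^ z)

# The one-time-pad fact used above: for any fixed x, x XOR z is uniform
# when z is, so B's truly-random behavior reveals nothing about x.
x = 0b1011
print(sorted({x ^ z for z in range(16)}) == list(range(16)))  # True
```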
A chosen plaintext attacker for this scheme is a probabilistic poly-time algorithm A such that if a string y is equal to E_k(x_0) for some key k, then Pr[A(y) = 1] is at least n^{-c} for some constant c, and Pr[A(y) = 1] < ε(n) if there is no such k.
Show that there cannot exist such a chosen plaintext attacker for E. (Hint: Given a hypothetical attacker A, construct a probabilistic poly-time machine B that operates on the pseudorandom or random one-time pads, and use the assumed security of G.) (Note added during exam: You may quote results from HW #8 without proof.)
As before, let B be the algorithm that takes input z and runs A on x_0 ⊕ z. B's probability of acceptance on a truly random z is at most 2^n/2^{L(n)} + ε(n), since x_0 ⊕ z is equal to E_k(x_0) for some k with probability at most 2^n/2^{L(n)} ≤ 2^{-n} (because L(n) ≥ 2n), and A accepts any other string with probability less than ε(n). If A behaved as advertised, B's probability of acceptance on a pseudorandom string would be at least n^{-c}, since in the pseudorandom case the y that A operates on always is equal to E_k(x_0) for some k. But if the PRG is secure, the two probabilities could not differ by as much as n^{-c} - 2^{-n} - ε(n), a non-negligible function.
Last modified 19 May 2010