# CMPSCI 601: Theory of Computation

### (Exam was 19 May 2003 on-campus, by 28 May 2003 off-campus.)

Directions: Question 1 consists of five statements to be marked "true", "false", or "unknown". Here "unknown" means "not resolvable by the results in this course". No justification is needed or wanted (though of course I have provided one in these solutions), and there is no penalty for a wrong guess.

Questions 2 and 3 each consist of a true statement that you are to prove. As these statements were proved in lecture, you may not simply quote their proofs -- give an explanation in your own words of why they are true that demonstrates your understanding.

Questions 4 through 8 are similar to those on the midterm. There is a statement which is either true or false. You will get five points for a correct boolean answer, and there is no penalty for guessing. Then JUSTIFY YOUR ANSWER -- the five remaining points will be awarded based on the degree to which your answer is correct and convincing.

Altogether there are eight questions for 100 points.

• Question 1 (15): For each statement, indicate (no justification needed) whether it is true, false, or unknown given the results presented in the course.

• (a,3) The language REACH is in both the classes NP and co-NP.

TRUE. REACH is in NL and therefore in P. We know that P is contained in both NP and in co-NP.

• (b,3) (on-campus) The class co-NP is contained in P.

UNKNOWN. This is true iff P = NP.

(off-campus) Due to my error, off-campus students got a question repeated from the practice exam:

Every context-free language is recursively enumerable (r.e.).

TRUE. Every context-free language is recursive, since membership in it can be tested in polynomial time. All recursive languages are also recursively enumerable.

• (c,3) The class ATIME(log n) is contained in DSPACE(log n).

TRUE. The simulation of the game tree of the log-time ATM game can be carried out using only O(log n) space.

• (d,3) The class DSPACE(log n) is contained in ATIME(log n).

UNKNOWN. The Alternation Theorem tells us only that DSPACE(log n) is contained in ATIME($\log^2 n$). But we know of no reason that the statement has to be false. It is true, for example, if $\mathrm{NC}^1 = \mathrm{L}$. (I gave one point for the answer "FALSE".)

• (e,3) The class $\mathrm{NC}^3$ is contained in the class $\mathrm{AC}^1$.

UNKNOWN. The reverse containment is true, but this one is unknown. It is possible that the whole NC hierarchy collapses to $\mathrm{NC}^1$. (I gave one point for the answer "FALSE".)

• Question 2 (15): Prove that the problem REACH is the language of an alternating Turing machine that uses O(log n) space, $O(\log^2 n)$ time, and O(log n) alternations. (You may describe the machine either in terms of existential and universal states, or in terms of the game semantics.)

We use the game semantics. Our ATM gets its input -- the graph G in the form of an adjacency matrix, and the vertices s and t given as numbers -- on its read-only input tape.

At any given time there will be a current game with parameters u, v, and d. White will win the current game if there is a path from vertex u to vertex v of length at most $2^d$. The initial game has parameters s, t, and the ceiling of log n, all read deterministically from the input. A round of the game consists of White naming a vertex w and Black playing a bit. If the current game is (u,v,d), the new game is either (u,w,d-1) if Black plays 0 or (w,v,d-1) if Black plays 1. If d=0 then White wins the game (u,v,0) iff either u=v or there is an edge from u to v in G. Thus games with d=0 can be resolved deterministically in time O(log n).

We must argue that White has a winning strategy in each of these games iff the claimed path actually exists. If the path does exist, White's winning move is to name its actual middle vertex as w. Then whichever move Black makes, the newly claimed path will exist. Successively doing this will lead to a true claim with d=0, which will be resolved as true. If the path does not exist, then whatever node White chooses as w, at least one of the two claimed subpaths will fail to exist and Black can make a winning move by choosing it. This will lead to a d=0 situation where White's claim is false, and this claim will be resolved as false.

We must analyze the resources used by this ATM. The space needed is O(log n) because the ATM stores only three vertex names and the number d at any one time. (Older values of u and v are overwritten, reusing the space.) The number of alternations is O(1) per round as a round consists of one White move and one Black move. Since there are O(log n) rounds, we have O(log n) alternations. The time needed for each round is O(log n) because White needs this to write down a vertex name and there is some deterministic copying of vertex names. Thus the total time for the O(log n) rounds is $O(\log^2 n)$ steps.
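The game can be sketched as a deterministic Python recursion, with `any` standing in for White's existential choice of w and the conjunction of the two subgames standing in for Black's universal choice of a bit. This is only an illustration of the win condition: the names `white_wins` and `reach` are made up here, the graph is assumed to have at least one vertex, and evaluating the whole game tree this way does not achieve the ATM's resource bounds.

```python
from math import ceil, log2

def white_wins(adj, u, v, d):
    """True iff White wins game (u, v, d): a path u -> v of length <= 2**d exists."""
    if d == 0:
        # Base case, resolved deterministically from the adjacency matrix.
        return u == v or bool(adj[u][v])
    # White names a midpoint w; White must then win whichever half Black picks.
    return any(
        white_wins(adj, u, w, d - 1) and white_wins(adj, w, v, d - 1)
        for w in range(len(adj))
    )

def reach(adj, s, t):
    """Start the game with d = ceiling of log n, as in the solution above."""
    return white_wins(adj, s, t, ceil(log2(len(adj))))

# A directed path 0 -> 1 -> 2 -> 3.
adj = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(reach(adj, 0, 3), reach(adj, 3, 0))
```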

• Question 3 (20): Prove that L-uniform $\mathrm{NC}^1$ is contained in L. Recall that "L-uniform" means that there is a logspace Turing machine that can input the string $1^n$ and output a description of the circuit $C_n$.

Let A be a language in L-uniform $\mathrm{NC}^1$. We describe a log-space algorithm to input a string w and determine whether w is in A.

We are given that a log-space function f exists taking $1^n$, where n is the length of w, to the circuit $C_n$. If we can build a log-space function g that inputs the pair (w, $C_n$) and outputs a bit telling whether w is in A, then the composition of the given function f with g is the desired log-space decision procedure for A. We know that the composition of two log-space functions is computable in log-space, by the method of recomputing any intermediate bit when it is needed.

So we may assume that we have w and $C_n$ available and want to determine in log-space whether $C_n(w) = 1$. Now we can evaluate $C_n$ on input w by a recursive algorithm that evaluates one node of the circuit each time it is called. This algorithm will consult the circuit to determine the left and right children of the current node, recursively evaluate those children, and compute the current node's value. If the current node is an input node, the algorithm will evaluate it by looking up the appropriate bit of w.

Clearly a single call of this algorithm can be implemented in log space as it needs to store at most three node numbers and node numbers are O(log n) bits each because the circuit's size is polynomial in n. But to implement the recursion in O(log n) space we must be more careful. The recursion depth is O(log n), since each call moves down one edge of the circuit and the circuit's depth is O(log n), so we can afford only O(1) bits in each activation record. We thus cannot store the name of the current node in the activation record. But we can store whether we went left or right and what values we have already computed at each level in O(1) bits per level. Then the entire contents of the method stack will give us a path from the root of the circuit to the current node, and a log-space algorithm can follow that path to the current node whenever necessary, referring to the circuit for each edge. Thus the total space usage can be made to be O(log n).
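The following Python sketch illustrates the bookkeeping described above: the stack holds only a direction and an already-computed left value per level, and the current node is re-derived by walking from the root whenever it is needed. The circuit encoding (a dict from node numbers to tuples such as `("AND", left, right)`, `("NOT", child)`, and `("IN", i)`) and the function names are assumptions made for this example, and Python objects are of course not literally O(1) bits each.

```python
def node_at(circuit, root, path):
    """Re-derive the current node by following the stored directions from the root."""
    node = root
    for direction, _ in path:
        node = circuit[node][1 + direction]  # index 1 = left child, 2 = right child
    return node

def evaluate(circuit, root, w):
    path = []     # one (direction, left_value) record per level of the recursion
    value = None  # value of the subcircuit just finished, or None when descending
    while True:
        if value is None:
            kind, *args = circuit[node_at(circuit, root, path)]
            if kind == "IN":
                value = w[args[0]] == "1"  # input node: look up a bit of w
            else:
                path.append((0, None))     # descend to the left (or only) child
            continue
        if not path:
            return value                   # the root's value has been computed
        direction, left_value = path.pop()
        kind = circuit[node_at(circuit, root, path)][0]  # the parent's gate type
        if kind == "NOT":
            value = not value
        elif direction == 0:
            path.append((1, value))        # remember the left value, go right
            value = None
        elif kind == "AND":
            value = left_value and value
        else:  # "OR"
            value = left_value or value

# (x0 AND x1) OR (NOT x0)
circuit = {0: ("OR", 1, 2), 1: ("AND", 3, 4), 2: ("NOT", 3), 3: ("IN", 0), 4: ("IN", 1)}
print([evaluate(circuit, 0, w) for w in ("00", "01", "10", "11")])
```

Note that a node with fan-out greater than one (node 3 here) is simply re-evaluated each time it is reached, which costs time but no extra space.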

• Question 4 (10): (true/false with justification)

If M is any one-tape Turing machine, the language $\mathrm{VALCOMP}_M$ is in L. Here $\mathrm{VALCOMP}_M$ is the set of strings that denote accepting computations of M on some input.

TRUE. We need to show that a log-space machine can input a string w and determine whether it is a member of $\mathrm{VALCOMP}_M$. It needs to check that:

1. The string w is a sequence of ID's of M, properly formatted,
2. The first ID is for a start configuration with some input string x,
3. Each subsequent ID follows from the one before it by the rules of M, and
4. The final ID is for an accepting configuration.

Each of these properties can be checked by a machine using O(log n) space, where n is the length of w. Property 1 is actually a regular language, as it requires only that each letter be in the correct alphabet and there is exactly one head position between each two ID separators. Property 2 is a matter of checking the head position and state, which is very easy. Property 3 is where the O(log n) space is actually used, because it requires comparing each character in each ID with the matching character of the previous ID, and this involves counting, possibly up to n. Property 4 is also very easy, just checking the head position, state, and first character of the last ID.
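As a rough illustration of the four checks, here is a Python sketch under an encoding invented for this example: IDs are separated by `#`, all have the same length, and each is the tape contents with a single state symbol written just before the scanned cell. The transition table `delta` maps `(state, symbol)` to `(new_state, written_symbol, "L" or "R")`. For brevity the sketch builds the expected successor ID as a string; a real log-space machine would instead compare the two IDs position by position using counters.

```python
def is_valcomp(w, delta, start, accept):
    """Check the four properties for a string of '#'-separated IDs (toy encoding)."""
    states = {q for q, _ in delta} | {q for q, _, _ in delta.values()} | {start, accept}
    ids = w.split("#")
    # Property 1: every ID has exactly one state symbol, followed by a scanned cell.
    for cur in ids:
        heads = [i for i, ch in enumerate(cur) if ch in states]
        if len(heads) != 1 or heads[0] == len(cur) - 1 or len(cur) != len(ids[0]):
            return False
    # Property 2: the first ID is a start configuration (start state at the left end).
    if ids[0][0] != start:
        return False
    # Property 3: each ID follows from the previous one by the rules of M.
    for prev, cur in zip(ids, ids[1:]):
        p = next(i for i, ch in enumerate(prev) if ch in states)
        key = (prev[p], prev[p + 1])
        if key not in delta:
            return False
        new_state, written, move = delta[key]
        if move == "R":
            expected = prev[:p] + written + new_state + prev[p + 2:]
        elif p > 0:
            expected = prev[:p - 1] + new_state + prev[p - 1] + written + prev[p + 2:]
        else:
            return False  # cannot move left from the leftmost cell
        if cur != expected:
            return False
    # Property 4: the final ID is in the accepting state.
    return accept in ids[-1]

# Toy machine: in state S on a 1, move right and accept in state A.
delta = {("S", "1"): ("A", "1", "R")}
print(is_valcomp("S10#1A0", delta, "S", "A"))
```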

• Question 5 (10): (true/false with justification)

If a language A is in NP, then A ≤ QSAT, where "≤" denotes L-reducibility.

TRUE. There are at least two good simple proofs:

1. Since NP is contained in PSPACE, A is in PSPACE. We know that QSAT is PSPACE-complete, which implies that any language in PSPACE is reducible to QSAT. So A is reducible to QSAT.
2. Since A is in NP and SAT is NP-complete, we know that A is reducible to SAT. But SAT is reducible to QSAT, by a reduction that takes a boolean formula φ and just inserts an existential quantifier "∃ x" for every variable x occurring in φ. Then the quantified formula is clearly true (in QSAT) iff the original formula is in SAT, by the two definitions. Since A ≤ SAT and SAT ≤ QSAT, we have that A ≤ QSAT by the transitivity of logspace reduction.
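The reduction in the second proof can be sketched in Python as follows. The representation is an assumption made for this example: a CNF formula is a list of clauses, each a list of nonzero integers in the DIMACS style (`-3` means "not x3"), and a quantified formula is a pair of a quantifier prefix and such a CNF. The brute-force evaluator `eval_qbf` is included only to check the reduction on small inputs; it is not part of the reduction itself.

```python
def sat_to_qsat(cnf):
    """Prefix the formula with an existential quantifier for each variable in it."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    return [("exists", v) for v in variables], cnf

def eval_qbf(prefix, cnf, assignment=None):
    """Exponential-time evaluation of a fully quantified CNF formula."""
    assignment = assignment or {}
    if not prefix:
        # All variables are set: every clause needs at least one true literal.
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in cnf
        )
    (quantifier, var), rest = prefix[0], prefix[1:]
    branches = (eval_qbf(rest, cnf, {**assignment, var: b}) for b in (False, True))
    return any(branches) if quantifier == "exists" else all(branches)

satisfiable = [[1, 2], [-1]]   # (x1 or x2) and (not x1)
unsatisfiable = [[1], [-1]]    # x1 and (not x1)
print(eval_qbf(*sat_to_qsat(satisfiable)), eval_qbf(*sat_to_qsat(unsatisfiable)))
```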

• Question 6 (10): (true/false with justification)

Every language in L is context-free.

FALSE. We know (by application of the CFL Pumping Lemma) that the language ABC = $\{a^n b^n c^n : n \geq 0\}$ is not context-free. But this language is in L. On input w, a logspace machine can first sweep w to confirm that it is in the language a*b*c*. If it is, the machine can count the a's, b's, and c's using O(log n) space, and accept only if all three counts are the same.
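A minimal Python sketch of this two-sweep test, where the function name `in_abc` is made up for the example. The first sweep keeps only a constant amount of state, and the second keeps three counters, each of which fits in O(log n) bits.

```python
def in_abc(w):
    """Decide membership in {a^n b^n c^n : n >= 0} with two left-to-right sweeps."""
    # Sweep 1: finite-state check that w is in a*b*c*.
    order = "abc"
    phase = 0  # index of the latest letter seen so far
    for ch in w:
        if ch not in order or order.index(ch) < phase:
            return False
        phase = order.index(ch)
    # Sweep 2: count each letter and compare the three counts.
    counts = {"a": 0, "b": 0, "c": 0}
    for ch in w:
        counts[ch] += 1
    return counts["a"] == counts["b"] == counts["c"]

print([in_abc(w) for w in ("", "aabbcc", "aabbc", "abcabc")])
```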

• Question 7 (10): (true/false with justification)

Unless P = NP, there is no poly-time algorithm that approximates the optimal solution to BIN-PACKING within a factor of three (i.e., uses at most three times the optimal number of bins).

FALSE. On HW#7, solving a problem from [P], we showed that the first-fit algorithm left all but perhaps its last bin at least half-full. If x is the space occupied in these at-least-half-full bins, the algorithm has thus used at most 2x+1 bins. Clearly the optimal number of bins must be at least x, so this algorithm has come within a factor of three of the optimal result. And the first-fit algorithm is poly-time without any unproven assumptions, since it need only make a single pass through the bins to see which is the first that allows each new element to fit. (Thus it places n elements making only $O(n^2)$ comparisons.)
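For concreteness, first-fit can be sketched in Python as below. The function name and the integer capacity are choices made for this example (integers avoid floating-point rounding), and every item is assumed to fit in an empty bin.

```python
def first_fit(sizes, capacity):
    """Place each item in the first bin with room; return the number of bins used."""
    free = []  # remaining space in each open bin, in order of opening
    for size in sizes:
        for i, room in enumerate(free):
            if size <= room:
                free[i] = room - size
                break
        else:
            # No open bin has room, so open a new one.
            free.append(capacity - size)
    return len(free)

sizes = [6, 5, 4, 5]
used = first_fit(sizes, 10)
# Compare with the 2x+1 bound, where x is the total size in units of one bin.
print(used, 2 * sum(sizes) / 10 + 1)
```

The inner loop scans at most n open bins for each of the n items, which is where the quadratic comparison count comes from.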

• Question 8 (10): (true/false with justification) If a predicate R(x,y) is recursive, where x and y are strings, then the predicate ∃y:R(x,y) must also be recursive.

FALSE. To prove this false, we must give an example of a predicate R that is recursive but for which the second predicate is not recursive. I was disappointed that most people argued, in terms of "the machine deciding R", that a particular decision procedure for ∃y:R(x,y) would not halt. This is incorrect because it ignores the possibility that the second predicate might be decided by some other machine that always halts.

An example is easy to construct. Let R(x,y) be true iff "y is a description of a valid halting computation of $M_x$ on input x". This predicate is clearly recursive by an argument similar to that of Question 4 above. But ∃y:R(x,y) is true iff x is in the language K, which we know is not recursive.