CMPSCI 401: Theory of Computation

Final Exam Solutions, Spring 2010

David Mix Barrington

19 May 2010

Directions:

Answer the problems on the exam pages.
There are eight problems for 125 total points. Actual scale is A = 102, C = 65.
If you need extra space use the back of a page.
No books, notes, calculators, or collaboration.
The first four questions are true/false, with five points for the correct boolean answer and up to five for a correct justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. -- note that there is no reason not to guess if you don't know.

  Q1: 10 points
  Q2: 10 points
  Q3: 10 points
  Q4: 10 points
  Q5: 10 points
  Q6: 30 points
  Q7: 20 points
  Q8: 25 points

  Total: 125 points

The following formal languages are each used in one or more problems:

TWO-CLIQUE is the set of all pairs (G, k) where G is an undirected graph, k is a positive integer, and G includes two sets of vertices A and B such that (1) A ∩ B = ∅, (2) A and B each have exactly k vertices, and (3) both A and B are cliques. (A clique is a set of vertices where there is an edge between every pair of distinct vertices in the set.)
NP-PATH is the set of all triples (G, s, t) where G is a directed graph, s and t are vertices in G, and it is possible for a nondeterministic Turing machine to choose (guess) a path from s to t in G in polynomial time.
A_NFA is the set of pairs (N, w) where N is an NFA, w is a string over the alphabet of N, and w ∈ L(N).
A_LINSPACE is the set of triples (M, x, 1^s) such that M is a one-tape deterministic Turing machine (with tape alphabet {0, 1, blank}), x is a string over the input alphabet of M, and M accepts x using at most s cells of space. Note that here "1^s" denotes a string of s ones, so that the length of this string is s.
UNARY-PATH is the set of all strings of the form
a^1 b^{i_1} a^{i_1} b^{i_2} a^{i_2} b^{i_3} ... a^{i_{k-1}} b^{i_k} a^{i_k} b^2
where k, i₁, i₂, ..., i_k are all positive integers.

Question 1 (10): True or false with justification: Assume that P ≠ NP. Then the language TWO-CLIQUE, defined above, is NP-complete.
TRUE. TWO-CLIQUE is in NP because the pair of sets A and B form the certificate. Given G, k, A, and B, it is easy to check the three conditions on A and B in polynomial time.
There are at least two good ways to reduce CLIQUE to TWO-CLIQUE in poly-time. Given a graph G and a number k, we could map (G, k) to (G', k) where G' is a graph made from G by adding k new vertices that have edges to each other and to no vertices of G. Alternatively, we could map to (G'', k) where G'' consists of two copies of G with no edges from one copy to another. In either case the new graph has two disjoint k-cliques if and only if G has at least one k-clique.
Since TWO-CLIQUE is in NP and a known NP-complete language reduces to it, it is NP-complete.
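The first reduction above can be checked on small graphs with a brute-force sketch. This is only an illustration, not part of the proof: the function names and the edge-set representation are assumptions made here, and the clique tests take exponential time (only the reduction itself is polynomial).

```python
from itertools import combinations

def make_edges(pairs):
    """Store an undirected graph as a set of frozenset edges (assumed encoding)."""
    return {frozenset(p) for p in pairs}

def is_clique(edges, vertices):
    """True if every pair of distinct vertices in `vertices` is joined by an edge."""
    return all(frozenset((u, v)) in edges for u, v in combinations(vertices, 2))

def has_k_clique(n, edges, k):
    """Brute-force CLIQUE test on vertices 0..n-1."""
    return any(is_clique(edges, c) for c in combinations(range(n), k))

def has_two_disjoint_k_cliques(n, edges, k):
    """Brute-force TWO-CLIQUE test: look for disjoint k-cliques A and B."""
    for a in combinations(range(n), k):
        if not is_clique(edges, a):
            continue
        rest = [v for v in range(n) if v not in a]
        if any(is_clique(edges, b) for b in combinations(rest, k)):
            return True
    return False

def reduce_clique_to_two_clique(n, edges, k):
    """Map (G, k) to (G', k): add k new vertices adjacent only to each other."""
    new_vertices = range(n, n + k)
    new_edges = set(edges) | {frozenset(p) for p in combinations(new_vertices, 2)}
    return n + k, new_edges, k
```

On a triangle with k = 3 the reduced graph has two disjoint 3-cliques; on a three-vertex path it has only the one that was added.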
Question 2 (10): True or false with justification: Assume that P ≠ NP. Then the language NP-PATH, defined above, is NP-complete.
FALSE. NP-PATH is exactly the same language we called PATH in lecture, since an NDTM can guess a path in polynomial time if and only if such a path exists. We know that the language PATH is in the class P (for example, we can use depth-first search to test whether a path exists, using deterministic polynomial time). If any language is both in P and is NP-complete, it would follow that P = NP, so our assumption says that this cannot happen and thus that NP-PATH cannot be NP-complete.
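As a minimal sketch of the deterministic polynomial-time test mentioned above, here is an iterative depth-first search. The adjacency-dictionary encoding of G and the name path_exists are assumptions made for the example.

```python
def path_exists(adj, s, t):
    """Return True if the directed graph `adj` has a path from s to t.

    `adj` maps each vertex to an iterable of its out-neighbors (assumed encoding).
    Each vertex is pushed at most once, so the running time is linear in the
    size of the graph.
    """
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False
```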
Question 3 (10): True or false with justification: There exists a regular language that is not in the class DSPACE(1). (Note added during test: The read-only input tape of a Turing machine has detectable endmarkers at each end.)
FALSE. If X is any regular language, it is the language of some DFA, and we can build a TM to simulate the action of this DFA with a read-only input tape and no worktapes at all. The TM reads the input left to right, keeping track of the DFA state, until it reaches the end of the input and then accepts or rejects based on the DFA state it finishes in. Thus X is in DSPACE(1).
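The simulation can be sketched as a loop whose only memory is the current DFA state, which plays the role of the TM's finite control. The dictionary encoding of the transition function and the example DFA (an even number of 1's) are assumptions made for illustration.

```python
def dfa_accepts(delta, start, accepting, w):
    """Simulate a DFA on w in one left-to-right pass.

    `delta` maps (state, letter) to the next state. Nothing is stored except
    the current state, so no work tape is needed.
    """
    state = start
    for letter in w:
        state = delta[(state, letter)]
    return state in accepting

# Hypothetical example DFA: binary strings with an even number of 1's.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```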
Question 4 (10): True or false with justification: The language A_NFA, defined in Sipser and above, is in the class DSPACE(log² n).
TRUE. The language A_NFA is in the class NL = NSPACE(log n), because a TM can successively guess states of the NFA and verify that each new state is a possible successor of the previous state given the input string and the transition relation of the NFA. It rejects if it ever guesses a state that is not a successor, and accepts if and only if it finishes the input in a final state of the NFA. Read/write space O(log n) is enough to keep the previous and current states.
By Savitch's Theorem, NL is contained in DSPACE(log² n), so A_NFA is in this class.
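The recursion behind Savitch's Theorem can be sketched as follows for an NFA without ε-moves (an assumption made to keep the example short). A configuration is a pair (state, position), and reach asks for a path of length at most 2^k by trying every midpoint. The recursion depth is k = O(log n) and each level stores O(log n) bits, which is where the log² n space bound comes from. The running time is far from polynomial, so this is only suitable for tiny inputs. All names here are hypothetical.

```python
def reach(edge, nodes, u, v, k):
    """True iff there is a path from u to v of length at most 2**k."""
    if k == 0:
        return u == v or edge(u, v)
    # Try every possible midpoint z, reusing the same space for both halves.
    return any(reach(edge, nodes, u, z, k - 1) and reach(edge, nodes, z, v, k - 1)
               for z in nodes)

def nfa_accepts(states, delta, start, accepting, w):
    """Decide (N, w) in A_NFA via reachability in the configuration graph.

    `delta` maps (state, letter) to a set of successor states; there are no
    epsilon-moves in this sketch.
    """
    nodes = [(q, i) for q in states for i in range(len(w) + 1)]

    def edge(c, d):
        (q, i), (r, j) = c, d
        return j == i + 1 and r in delta.get((q, w[i]), ())

    k = max(1, (len(nodes) - 1).bit_length())  # 2**k >= number of configurations
    return any(reach(edge, nodes, (start, 0), (f, len(w)), k) for f in accepting)

# Hypothetical example NFA: binary strings whose last letter is 1.
ENDS_IN_ONE = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}}
```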
Question 5 (10): Prove that the language A_NFA can be decided by a family of boolean circuits with polynomial size, fan-in two, and depth O(log² n).
Let G(N,x) be the configuration graph of the NFA N on the input x, which has some polynomial number p(n) of nodes. By definition, (N, x) is in A_NFA if and only if there is a path from the start configuration to an accepting configuration in G(N,x), of length at most p(n). We build an AC¹ circuit to test existence of this path using the Savitch construction, then convert this to the desired NC² circuit.
We have a gate g_{u,v,k} in our circuit for every pair of configurations u and v and every number k ≤ log(p(n)). We want g_{u,v,k} to evaluate to TRUE if and only if there is a path of length at most 2^k from u to v in G. We set up edges to make g_{u,v,k} the OR, for all configurations z, of the AND of g_{u,z,k-1} and g_{z,v,k-1}. We establish g_{u,v,0} to be either a constant or an input gate so that it is true if and only if either u = v or there is an edge from u to v in the graph (this may depend on one of the input letters). The output gate of our circuit will be g_{s,t,h} where s is the start configuration, t is the accept configuration, and h is the ceiling of the log of p(n). The depth of the circuit is O(h) = O(log n), the fan-in is at most p(n), and the size is polynomial in n: there are O(p(n)² log(p(n))) gates g_{u,v,k}, each fed by p(n) binary AND gates, for O(p(n)³ log(p(n))) gates in all. By replacing all the p(n)-way OR gates with binary trees of binary OR gates, we multiply the depth by O(log n), giving depth O(log² n), but create a circuit of fan-in two that is still polynomial size.
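The layers of this circuit can be evaluated directly as repeated boolean matrix squaring. In the sketch below, the matrix g after round k holds the values of the gates for paths of length at most 2^k. The boolean adjacency-matrix input and the name reach_layers are assumptions made for the example.

```python
def reach_layers(adj):
    """Evaluate the reachability gates layer by layer.

    `adj` is an n-by-n boolean adjacency matrix. Layer 0 is true when u == v
    or there is an edge u -> v; each later layer is an OR over all midpoints z
    of the AND of two entries from the previous layer.
    """
    n = len(adj)
    g = [[u == v or adj[u][v] for v in range(n)] for u in range(n)]
    h = max(1, (n - 1).bit_length())  # 2**h >= n, so h layers suffice
    for _ in range(h):
        g = [[any(g[u][z] and g[z][v] for z in range(n)) for v in range(n)]
             for u in range(n)]
    return g
```

Each round corresponds to one layer of unbounded fan-in OR gates over binary AND gates, so the number of rounds matches the O(log n) depth of the AC¹ circuit.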
Question 6 (30): Let M be a fixed (ordinary deterministic) Turing machine that always halts on any input.
- (a,10) Let f_M be the function from positive integers to positive integers defined so that f_M(n) is the maximum space used by M on any input of length exactly n. Prove that f_M is a Turing computable function.
  Given input n, our TM can simulate M on each input of size n, record the space used each time, and output the maximum of these cⁿ numbers, where c is the size of M's input alphabet. Each of the cⁿ individual computations must halt by the hypothesis on M, so this overall computation eventually halts and clearly gives the right answer.
- (b,10) Prove that the language A_LINSPACE, defined above, can be decided by a linear bounded automaton (LBA). (Recall that an LBA is a one-tape Turing machine that never moves its tape head to the right past the space originally occupied by its input.) (Notes added during test: (1) An LBA knows where the ends of its tape are when it reaches them. (2) You may assume that s ≥ |x| for any triple (M, x, 1^s) in A_LINSPACE.)
  The main idea of this algorithm is simple, but there are complications that I did not hold you responsible for on the test. On input (M, x, 1^s), we want to use the part of the tape holding the s 1's to simulate the single read-write tape of M, so we overwrite the 1's with x followed by enough blanks to make s cells. Then we simulate M's operation on x using this space, rejecting if M attempts to exceed that space, and accept our input if M accepts x.
  The first complication is that our single LBA that decides A_LINSPACE must have a fixed tape alphabet. This is why the definition of A_LINSPACE above fixes M's tape alphabet to {0, 1, blank} (the condition shown in green in the original) -- something like this is necessary for the LBA to exist. (We could also say that s is the number of bits used rather than the number of cells used, for example.)
  A second complication is that to have the LBA decide the language, we must detect the possibility that M is in a loop as it operates on x. We do this by giving our LBA a clock which it can keep in the s cells by expanding its tape alphabet.
  A third complication is that we must record the state of M as we simulate M, and even if M's alphabet is fixed this cannot be done in O(1) cells of the LBA's tape, because M is part of the input and might have up to O(m) states, where m is the length of (M, x, 1^s). Our LBA needs to record the current state of M using the part of its tape that originally held the description of M -- this is certainly big enough.
- (c,10) Parts (a) and (b) could be combined to show that any Turing decidable language is mapping reducible to the language of an LBA. Show that this result would not be so impressive, as follows. Prove that if X is any Turing decidable language, then there exists a regular language R such that X ≤_m R. (Recall that "≤_m" denotes the relation called "mapping reducible" in Sipser.)
  Let R be the regular language {1}. Our function f takes the input w, decides whether w ∈ X, then outputs 1 if it is and 0 if it isn't. This is an always-halting function since there is an always-halting TM that decides X. We have that w ∈ X iff f(w) ∈ R, so we have shown that X ≤_m R.
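The reduction in part (c) can be sketched in a few lines: all of the work happens inside f, and the target language R = {1} only has to recognize a single string. The decider for {0^n 1^n} used here is a hypothetical stand-in for an arbitrary always-halting decider of X.

```python
def make_reduction(decide_x):
    """Build the mapping reduction f from X to R = {"1"}.

    `decide_x` must halt on every input, which is exactly the assumption
    that X is Turing decidable.
    """
    def f(w):
        return "1" if decide_x(w) else "0"
    return f

def in_r(s):
    """Membership in the regular language R = {"1"}."""
    return s == "1"

def decide_zeros_then_ones(w):
    """Hypothetical example decider for the non-regular language {0^n 1^n}."""
    half = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * half + "1" * half
```

For every string w, decide_zeros_then_ones(w) agrees with in_r(f(w)), which is the defining property of a mapping reduction.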
Question 7 (20): Both parts of this question involve the language UNARY-PATH defined above.
- (a,10) Is UNARY-PATH a regular language? Prove your answer.
  UNARY-PATH is not a regular language. This is easy to see using the Myhill-Nerode Theorem: if j and k are any two different positive integers, the strings ab^j and ab^k are distinguishable, because the first followed by a^jbb is in UNARY-PATH, while the second followed by that string is not.
  Most of you tried to do this with the Pumping Lemma and most of those got it slightly wrong. Letting p be the pumping length and starting with the string w = ab^pa^pbb, for example, the regular language pumping lemma tells us that w = xyz for some x, y, and z with |xy| ≤ p, |y| > 0, and for all i, xyⁱz ∈ UNARY-PATH. If y is a string of all b's, we have a contradiction. But if x is empty and y = ab, pumping up does not give a contradiction -- extra ab strings at the beginning still fit the definition of the language. You need to note that pumping down in this case gives you a string starting with b, which is clearly not in the language.
- (b,10) Is UNARY-PATH a context-free language? Prove your answer.
  It is a context-free language, which you may prove either by giving a grammar for it or describing a PDA for it. One grammar that works is S --> aTbb, T --> TT, T --> bUa, U --> bUa, U --> ε. Several of you gave slightly wrong grammars that generated the string abb, which is not in UNARY-PATH because the definition requires the number k to be positive.
  A PDA for UNARY-PATH can (1) read the initial a, (2) one or more times read a string of b's, push them onto the stack, then match them with following a's, and (3) read the final two b's.
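Both parts of Question 7 can be explored with a small recognizer that follows the three phases of the PDA in part (b), using integer counts in place of the stack. This is a sketch under that substitution, and the name in_unary_path is an assumption made for the example.

```python
def in_unary_path(w):
    """Decide membership in UNARY-PATH.

    Phase 1 reads the initial a, phase 2 reads one or more blocks b^i a^i
    with i >= 1, and phase 3 reads the final bb. The counts nb and na stand
    in for pushing b's and popping them against a's.
    """
    if not w.startswith("a"):
        return False
    n, i, blocks = len(w), 1, 0
    while True:
        j = i
        while j < n and w[j] == "b":   # count a run of b's
            j += 1
        m = j
        while m < n and w[m] == "a":   # count the run of a's that follows
            m += 1
        nb, na = j - i, m - j
        if na == 0:
            # No a's follow, so this must be the final bb at the very end,
            # preceded by at least one matched block (k is positive).
            return blocks >= 1 and nb == 2 and j == n
        if nb != na:
            return False
        blocks += 1
        i = m
```

The Myhill-Nerode argument from part (a) shows up directly: for j ≠ k, appending a^j bb to ab^j gives a member, while appending it to ab^k does not.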
Question 8 (25): A boustrophedonic Turing machine is a single-tape nondeterministic TM that (1) begins its computation by placing an endmarker at the left end of its tape, (2) then travels right without changing the tape until it nondeterministically chooses to place an endmarker on a blank cell somewhere to the right of the input, and (3) then travels deterministically between the endmarkers until or unless it halts, never moving the endmarkers and changing direction only at an endmarker. (A boustrophedon is a type of bidirectional text, common in ancient Greek inscriptions, where lines are written alternately left-to-right and right-to-left. The name comes from the Greek for "turning like an ox" as it plows a field.)
- (a,10) Prove that every Turing-decidable language X is the language of some boustrophedonic TM that always halts once it has placed its right endmarker. (Note that the BTM is nondeterministic, so you must prove that your BTM can accept w if and only if w ∈ X.)
  Let X = L(M) where M is a single-tape deterministic TM that always halts. (We rely on the result that multitape TM's may be simulated by single-tape TM's, with halting computations simulated by halting computations.) The main idea is that the BTM guesses where to put its right endmarker and then simulates M until or unless it halts or attempts to exceed the guessed space bound. If M accepts w, it is possible for this BTM to accept w if it guesses a large enough space bound. If M rejects w, then the BTM will also reject w, because it will either complete the simulation of M and discover the rejection, or it will reject when M attempts to exceed the space bound. By hypothesis, M cannot loop. (The version of the problem given on the exam, without the qualification "once it has placed its right endmarker" (shown in green in the original), was actually incorrect because there is nothing to stop the BTM from moving right forever and never placing its right endmarker, so we can't force the BTM to always halt.)
  It remains to explain how the BTM simulates a space-bounded single-tape TM. It marks the location of M's head, and when it reaches this location (going either right or left) it reads the input letter and determines what M should do based on its state, the input letter read, and M's transition function. It changes the letter under the head, and thus needs only to move the head by taking the mark off of this cell and marking the cell either to the left or to the right. It can do this easily if it is going in the correct direction (the direction that M wants to go), but otherwise it must mark the location with a different mark, continue the way it was going until it hits the endmarker, turn around, return to the marked location, and make the head move. This simulation will always halt when simulating a halting computation, since every step of M is implemented with finitely many steps of the BTM.
- (b,15) Let P_bous be the set of all languages of boustrophedonic TM's B such that for any string w of length n, if B can accept w at all it can accept w within time n^O(1). Prove that P_bous = P. Note that since the BTM's are nondeterministic, the inclusion P_bous ⊆ P is not trivial. It may help to reuse some of your argument from part (a).
  First we prove that P_bous is contained in P. Given a BTM B with the polynomial-halting property, let p(n) be the polynomial time bound from the property. Our deterministic machine to simulate B will compute p(n) (non-boustrophedonically) and then for every number i from n through p(n), simulate B with the endmarker placed in location i. (You might think it would be enough to simulate B once with space bound p(n), but we can't rule out the possibility that B might behave differently for different space bounds.) If the input w is in L(B), some possible computation of B accepts and by the polynomial-halting property, at least one of these computations will accept. (There is an accepting computation that uses at most p(n) time, and the space bound of this computation must be between n and p(n).) If w is not in L(B), none of them can accept because each represents a possible valid computation of B on w.
  It remains to prove that P is contained in P_bous. Given a language X in P, let M be a single-tape machine that decides X in time p(n). (We use the fact that the simulation of multitape TM's by single-tape TM's has polynomial time overhead.) Now we simulate M by a BTM as in part (a), rejecting if M attempts to exceed the guessed space bound. If w is in X, it is possible for B to accept w by guessing a large enough space bound, and if w is not in X no possible computation of the BTM can accept. If there is any accepting computation of M on w, then a space bound of p(n) is sufficient to simulate it. The BTM simulation of this computation with this space bound uses time O(p²(n)), which is polynomial in n, so our BTM has the required polynomial-halting property.
Last modified 20 May 2010