# Solutions to Final Exam, Spring 2012

### Directions:

• Answer the problems on the exam pages.
• There are nine problems for 125 total points. Actual scale was A = 100, C = 65.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.
• The first six questions are true/false, with five points for the correct boolean answer and up to five for a correct justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. -- note that there is no reason not to guess if you don't know.

```  Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 10 points
Q6: 10 points
Q7: 20 points
Q8: 15 points
Q9: 30 points
Total: 125 points
```

Here are some languages and definitions used on the exam:

• A one-way input Turing machine (OWITM) is a deterministic Turing machine that has a read-only input tape on which the head may never move left. It may have other tapes as specified, and these (if any) are normal read-write tapes with normal heads.
• An alternating finite automaton (AFA) M has the normal components of an NFA, including a transition function δ: (Q × Σ) → P(Q) (where P(Q) is the power set of Q). In addition the states are divided into "White" and "Black" states. The language L(M) is defined in terms of a two-player game as follows. If w is the input string, the two players White and Black decide what choices M will make as it reads w, with White choosing the move from White states and Black from Black states. White wins the game, and thus w ∈ L(M), if and only if she has a winning strategy that will always leave M in a final state after reading w, whatever choices Black makes.
• The language ADFA is the set of pairs (M, w) such that M is a DFA and w ∈ L(M), and ANFA is the analogous language for NFA's.
• The language TSP is the set of all pairs (G, t) where G is a directed graph with positive integer labels given in binary, and there exists a path in G, of total weight at most t, that includes each vertex exactly once.
• The language 3-TSP is the subset of TSP where each edge label is either 1, 2, or 3.
• The language EXACT-PATH is the set of all 4-tuples (G, x, y, t) such that G is a directed graph with positive integer labels, given in binary, and there exists a path from x to y in G, with total weight exactly t.
• The following languages may be assumed without proof to be NP-complete, as they were proved to be so either in Sipser or on the homework:
• ANP = {(M, w, 1t): M is a nondeterministic TM and it is possible for M to accept w in t steps}
• SAT or CNF-SAT is the set of boolean formulas in conjunctive normal form that have at least one input setting making them true.
• 3-SAT is the seth set of boolean formulas in CNF, with all clauses of size 3, that have at least one input setting making them true.
• CLIQUE = {(G, k): G is an undirected graph and there exists a set X of k vertices in G such that every pair of distinct vertices in X has an edge}
• VERTEX-COVER = {(G, k): G is an undirected graph and there exists a set X of k vertices in G such that every edge in G has at least one endpoint in X.}
• DOMINATING-SET = {(G, k): G is an undirected graph and there exists a set X of k vertices in G such that every vertex in G is either in X or has an edge to a vertex in X}
• HAM-PATH is the set of directed graphs G such that there exists a path in G that visits each vertex exactly once.
• UHAMPATH is the set of undirected graphs G such that there exists a path in G that visits each vertex exactly once.
• 3-COLOR is the set of undirected graphs G such that the vertices of G can each be assigned one of three colors, so that no edge connects two vertices of the same color.
• SUBSET-SUM is the set of sequences of positive integers (s1, ..., sm, t) such that there exists a subset X of {1, ..., m} such that the sum of si, for all i in X, equals t.

• Let S = (s1, ..., sn) be any sequence of positive integers. We define a labeled directed graph GS with 3n + 1 nodes and 4n edges as follows. The nodes are a0 together with ai, bi, and ci for each i with 1 ≤ i ≤ n. For each such i, there is an edge from ai-1 to bi with label 1 + si. All other edges have label 1 -- for each i there are edges from ai-1 to ci, from bi to ai, and from ci to ai. Here is a picture of GS where S is the sequence (6, 2, 3) -- edges with no labels are assumed to be labeled with 1:

``````

7
a0 ---> b1
|      |
|      |
V      V   3
c1 ---> a1 ---> b2
|       |
|       |
V       V   4
c2 ---> a2 ---> b3
|       |
|       |
V       V
c3 ---> a3
``````

Question text is in black, solutions in blue.

• Question 1 (10): True or false with justification: The language ANFA is poly-time reduciable to ADFA, that is, ANFAP ADFA.

TRUE. Though we cannot convert an arbitrary NFA to an equivalent DFA in poly time, since the DFA might have exponentially many states, we don't need to do that. Given input (M, w), we can determine in poly time whether (M, w) is in ANFA, then output either a fixed string in ADFA or a fixed string not in ADFA.

The reason ANFA is in P is that we can simulate the Subset Construction's DFA without building it -- for every prefix of w in turn, we compute the set of states to which M might go from i on reading that prefix.

• Question 2 (10): True or false with justification: There does not exist a context-free language X such that ADFAL X, that is, such that ADFA is log-space reducible to X.

FALSE. Let X = {1}, let the reduction compute whether (M, w) is in ADFA (using logspace to simulate M on w) and then output 1 or 0 according to the answer.

• Question 3 (10): True or false with justification: Assuming that P ≠ NP, the language ANFA is NP-complete.

FALSE. ANFA is in P as shown in Question 1. If a language in P were to be NP-complete, then P would be equal to NP.

• Question 4 (10): True or false with justification: Let h be the mapping that takes a sequence of positive integers S, given in binary, to the labeled directed graph GS as defined above, given as an adjacency list with binary labels. Then h can be computed by a log-space transducer.

TRUE. A loop for i from 1 to n can generate the nodes and the edges -- a constant number of each for each i. The elements si can be read from the input, incremented by 1, and output as the edge labels that aren't 1.

The read-write memory needed to do this is O(log n) bits for i. A log-space transducer has arbitrary access to its read-only input.

• Question 5 (10): True or false with justification: Let M be a one-take OWITM as defined above. (So M's only tape is its one-way read-only input tape.) Then L(M) is a context-free language.

TRUE. M can be simulated by a DFA, so L(M) is regular, and all regular languages are context-free.

The simulation is possible because M has only O(1) possible states in each input-head position. Given the state and the letter read, the next state is determined by the rules of the machine.

• Question 6 (10): True or false with justification: Let M be a k-tape OWITM for some positive integer k, so that k - 1 of the tapes are ordinary ones. Then there is a two-tape OWITM M' such that L(M) = L(M').

TRUE. If k ≤ 2 it is clearly true. Let M be a TM with k ≥ 3 tapes. By the construction in Sipser we know that there is a one-tape TM N with L(N) = L(M). Let M' be a two-tape OWITM that copies its input tape onto its second tape, then runs N on that tape. Then clearly L(M') = L(N) and so L(M') = L(M) as desired.

• Question 7 (20): These question deals with AFA's (alternating finite automata) as defined above.

• (a, 10): Let M be an AFA with k states, let w be any string over the input alphabet of M, and define a function fw from P(Q) to {0, 1} as follows. (Recall that P(Q) is the power set of the state set Q, that is, P(Q) = {S: S ⊆ Q}.) For any set S, consider the game played on M with input w whre White wants to leave M in some state in S at the end and Black wants the state to not be in S. Then fw(S) = 1 if and only if White has a winning strategy for this game. Prove that if for two strings u and v, the functions fu and fv are the same, then for any string z, the strings uz and vz are either both in L(M) or both not in L(M).

Consider the M-game on both uz and vz. Let S be the set of states q for which White wins the M-game on z, starting from q. White wins the M-game on uz if and only if she can force the state after u to be in S. This is true if and only if fu(S) = 1, which by the assumption is true if and only if fv(S) = 1. Similarly, fv(S) = 1 if and only if White wins the M-game on vz. So under optimal play, White either wins both games or wins neither, and the two strings are either both in L(M) or both out.

• (b, 10): Using the result of part (a), whether you proved it or not, prove that L(M) is a regular language. If we use this result to construct a DFA equivalent to M, how many states does the DFA have?

There are 22k possible functions fu for any string u. To define a function, we need a binary value for each of the 2k states in P(Q).

Part (a) proves that if fu = fv, then u and v are L(M)-equivalent with respect to the Myhill-Nerode Theorem. So there are at most 22k equivalence classes, and hence a DFA with at most 22k states must exist and L(M) is regular.

• Question 8 (15): Let NEVER-LEFT be the set of pairs (M, w) such that M is a deterministic Turing machine with a read-only input tape (and any constant number of ordinary tapes), w is a string in M's input alphabet, and M accepts w without ever moving left on its input tape. Prove that the language NEVER-LEFT is Turing recognizable but not Turing decidable. (Note added during test: do not make any assumptions about the machine M other than those you are given.)

To show that NEVER-LEFT is Turing recognizable, we need to show that there is a single Turing machine M' such that L(M') = NEVER=LEFT. This is easy -- on input (M, w), M' runs M on w until or unless it either halts or tries to move left on its input tape. If M tries to move left when running on w, M' rejects that input pair. Clearly M' accepts if and only if (M, w) is in NEVER-LEFT.

For the other direction, we show that NEVER-LEFT is not TR by showing that ATMm NEVER-LEFT. Here we use the argument of Question 5. Given an arbitrary TM M, we build a two-tape OWITM M' such that L(M') = L(M), using Question 5. Now since M' cannot move left on its input tape, we have that (M', w) ∈ NEVER-LEFT if and only if (M', w) ∈ ATM if and only if (M, w) ∈ ATM. So the map that takes (M, w) to (M', w) is the desired reduction from ATM to NEVER-LEFT.

• Question 9 (30): Here are two NP-completeness proves for languages defined above. You will probably want to choose from the given NP-complete languages there for your reductions.

• (a, 15) Prove that the language 3-TSP is NP-complete.

This problem was easier than I intended because I forgot to say that the usual input format for TSP has a labeled edge for every pair of nodes in G. In that case we take the undirected graph G and make a G' that has an edge labeled 1 for every edge in G and an edge labeled 2 for every non-edge in G. Then, if n is the number of nodes in G or in G', we have that G is in HAM-PATH iff (G', n-1) is in 3-TSP. Any Hamilton path contains n-1 edges, and we meet the n-1 total length target if and only if all the edges are real ones.

But I allowed you to have non-edges in G', so you could have G' be identical to G, with all edges labeled 1. You could choose a target of n-1, or anything larger, because any Hamilton path will do.

• (b, 15) Prove that the langauge EXACT-PATH is NP-complete. (Hint: Consider the definition of the graph GS for a sequence S, given above.)

The language EXACT-PATH is in NP because the path is the certificate -- given an alleged path we can compute its total length, verify that it is a legitimate path, and compare the length with t.

We show that SUBSET-SUM ≤p EXACT-PATH, which suffices because we are given that SUBSET-SUM is NP-complete. Given a sequence S = (s1,... , sn) and a target t, let f(S, t) = (GS, a0, an, t + 2n), where GS is the graph defined above. Any path from a0 to an in GS corresponds to a subset X of {1,... ,n} , the indices i for which the path takes the edge from ai-1 to bi. The length of the path is 2n + (the sum over all i in X of si). This length equals 2n + t if and only if the sum is exactly t, so a path of length 2n + t exists if and only if a subset of total weight exactly t exists. Thus the input instance (S, t) is in SUBSET-SUM if and only if our f(S, t) is in EXACT-PATH.