Solutions to Second Midterm Exam, Spring 2017

Directions:

• Answer the problems on the exam pages.
• There are eight problems (some with multiple parts) for 125 total points. Actual scale was A = 100, C = 64.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.
• The first six questions are statements -- in each case say whether the statement is true or false and give a convincing justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. You get five points for the correct boolean answer (so there is no reason not to guess if you don't know) and up to five for the justification.

Exam text is in black, solutions in blue.

```
Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 10 points
Q6: 10 points
Q7: 30 points
Q8: 35 points
Total: 125 points
```

The set N of natural numbers is {0, 1, 2, 3,...}, not quite as defined in Sipser.

For this exam, the input alphabet of all machines will be Σ = {0, 1}.

A palindrome is a string that is equal to its own reversal. Let PAL = {w: w is a palindrome}.

If C is any class of things that have languages, such as DFA's, CFG's, TM's, strange variant TM's, etc., remember that (M) is the canonical string representing the thing M in C. We define the following languages:

• AC = {(M, w): M is a computer in C and w ∈ L(M)}

• EC = {(M): M is a computer in C and L(M) = ∅}

• ALLC = {(M): M is a computer in C and L(M) = Σ*}

• SINGC = {(M): M is a computer in C and L(M) has exactly one element}

A language is Turing recognizable (TR) if it is equal to L(M) for some Turing machine M.

A language is Turing decidable (TD) if it is equal to L(M) for some Turing machine M that halts on every input.

A language is co-TR if and only if its complement is TR. (Similarly for co-NP, etc.)

A function f from strings to strings is Turing computable if there exists a Turing machine M such that for any string w, M when started on w halts with f(w) on its tape. It is poly-time computable if in addition, M always computes f(w) using time polynomial in the length of w.

Recall that if A and B are two languages, A is mapping reducible to B, written A ≤m B, if there exists a Turing computable function f: Σ* → Σ* such that for any string w, w ∈ A ↔ f(w) ∈ B. A is poly-time reducible to B, written A ≤p B, if in addition the function f is poly-time computable.

The language BPCP, for Bounded Post Correspondence Problem, is the set of pairs (P, m) where P is a set of dominoes (as in the ordinary PCP), m is a number written in binary, and P has a match that uses at most m total dominoes. (Recall that a match can use the same domino multiple times, and we are counting them with multiplicity. For example, the same domino used 100 times counts as 100 dominoes.)

The following languages were proved to be NP-complete either in the text of Sipser or in the exercises. You may assume without proof that they are NP-complete.

• 3-SAT = {φ: φ is a satisfiable formula in 3-CNF}

• 3-COLOR = {G: G is an undirected graph and the vertices of G may each be assigned one of three colors so that no edge connects two vertices of the same color}

• DHAMPATH = {(G, s, t): G is a directed graph in which there is a directed path from vertex s to vertex t such that the path visits each vertex of G exactly once}

• SUBSET-SUM = {(a1,..., ak, t): the ai's and t are binary positive integers and there is a subset of the ai's that adds to exactly t}

• CLIQUE = {(G, k): G is an undirected graph and G has a set S of k vertices such that every two distinct vertices in S have an edge between them}

• VERTEX-COVER = {(G, k): G is an undirected graph and G has a set S of k vertices such that every edge in G has at least one endpoint in S}

The following two languages are to be proved NP-complete on this exam:

• PIC = {(G, k): G is an undirected graph and the vertices of G may be partitioned into k disjoint subsets such that for each subset S, every two distinct vertices in S have an edge between them} (The name of the language abbreviates Partition Into Cliques.)

• KNAPSACK = {((w1, v1),..., (wk, vk), wt, vt): the wi's, vi's, wt, and vt are binary positive integers and there is a subset of the i's such that the wi's in the subset add to exactly wt, and the vi's in the subset add to at least vt} (Think of the input as defining k items, each with a weight and a value. We want a set of items for our knapsack that meets the weight target exactly, while meeting or exceeding the value target.)

• Question 1 (10): True or false with justification: The set {(M): (M) ∈ L(M)}, where the type of M is "Turing machine", is co-TR.

FALSE. It is TR, as an accepting computation history of M on (M) exists if and only if (M) is in the set, and the property of being such a history is TD. It is not TD, as was proved in Chapter 4 of Sipser; if it were TD we could build a decider for the complement of this set, which would cause a contradiction when fed to itself. A language that is both TR and co-TR is TD, so this set cannot be co-TR.

Many people gave proofs involving the Recursion Theorem, most of which were correct.

• Question 2 (10): True or false with justification: The language BPCP defined above is not TD.

FALSE. Given P and m, the number of potential matches is the sum of d^i for i from 1 to m, where d is the number of different dominoes in P. This number is finite, and an always-halting TM can check them all and accept if and only if one of them is a match.
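The brute-force search can be sketched in Python. This is only an illustration under assumptions of mine: dominoes are given as (top, bottom) pairs of strings, and the names `bpcp_decide` and `candidate_count` are hypothetical, not from the exam.

```python
from itertools import product

def candidate_count(d, m):
    """Number of domino sequences of length 1..m over d dominoes: sum of d**i."""
    return sum(d**i for i in range(1, m + 1))

def bpcp_decide(dominoes, m):
    """Return True iff some sequence of 1..m dominoes (repeats allowed) is a match.

    `dominoes` is a list of (top, bottom) string pairs (an assumed encoding).
    The loops are finite, so this always halts.
    """
    for length in range(1, m + 1):
        # product(...) enumerates all d**length sequences of this length.
        for seq in product(dominoes, repeat=length):
            top = "".join(t for t, _ in seq)
            bottom = "".join(b for _, b in seq)
            if top == bottom:
                return True
    return False
```

For example, with the two dominoes ("1", "11") and ("11", "1"), no single domino is a match, but using one of each gives 111 on both top and bottom.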

• Question 3 (10): True or false with justification: The language SINGTM is either TR or co-TR, but not both.

FALSE. It is neither. We can reduce ATM to SINGTM by mapping the pair (M, w) to a machine N that rejects its input x unless x = w, in which case it runs M on w and accepts if and only if M does. Thus L(N) = {w} if w ∈ L(M), and L(N) = ∅ otherwise.

We can reduce the complement of ATM to SINGTM by mapping the pair (M, w) to a machine N that accepts its input x if x = w, and otherwise runs M on w and accepts if and only if M does. Thus L(N) = {w} if w ∉ L(M), and L(N) = Σ* otherwise.
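The two constructions of N can be mimicked with Python closures. This is a loose sketch, not a real reduction: a true reduction outputs the encoding (N), whereas here a "machine" is just a Python callable that returns True to accept and False to reject (and might never return). The function names are hypothetical.

```python
def reduce_atm_to_sing(M, w):
    """Build N with L(N) = {w} if M accepts w, and L(N) empty otherwise."""
    def N(x):
        if x != w:
            return False      # reject everything except w
        return M(w)           # on w, behave exactly as M does (may not halt)
    return N

def reduce_co_atm_to_sing(M, w):
    """Build N with L(N) = {w} if M does not accept w, and all strings otherwise."""
    def N(x):
        if x == w:
            return True       # always accept w itself
        return M(w)           # on any other x, accept iff M accepts w
    return N
```

In the first construction N has a one-element language exactly when M accepts w; in the second, exactly when M does not accept w.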

• Question 4 (10): True or false with justification: The set of incompressible strings I = {x: K(x) ≥ |x|}, is TR.

FALSE. An easy proof: We know that I is not TD from a homework problem. It is co-TR because if a string x is not incompressible, there exist a machine M and a string w such that |(M, w)| < |x| and M on input w returns x. This is a TR property. Since it is co-TR and not TD, it cannot be TR.

The proof in the homework solutions that I is not TD actually suffices to prove that it is not TR. If it were TR, there would be an enumerator E for it. Using the Recursion Theorem, we can build a machine R that ignores its input, runs E until it outputs a string x that is longer than the string (R, ε), and outputs x. (Such an x exists because I is infinite.) Since x is described by the shorter string (R, ε), it is not incompressible, contradicting the assumption that E outputs only incompressible strings.

• Question 5 (10): True or false with justification: Let A and B be any two languages. If A ∉ P, A ≤p B, and B ≤p A, then both A and B must be NP-complete.

FALSE. To be NP-complete, A and B must be in NP, and there is no reason that they should be. But we need a concrete example of an A and a B that meet the conditions and are not NP-complete. My example is A = B = ATM. A is not in P, and neither A nor B is in NP, because they are undecidable. And the two poly-time reductions can each be the identity function.

• Question 6 (10): True or false with justification: There exists a Turing machine R with the following property: For every string w, R accepts w if and only if w ∉ L(R).

FALSE. This was a bit nasty on my part, I admit. The Recursion Theorem can be used to prove statements that look at first glance to be nonsense. But this statement actually is nonsense: Since "R accepts w" and "w ∈ L(R)" mean exactly the same thing, we can never have one true and the other false.

• Question 7 (30): These two questions both deal with languages defined above.

• (a, 15) Prove that the language PIC is NP-complete.

We first prove that PIC ∈ NP. Let PIC-CHECK be the set of all (G, k, S) such that S is a partition of the vertices into k cliques. It is easy to decide membership in PIC-CHECK in P, because we only need to verify that each vertex occurs in exactly one set of S, that there are exactly k sets in S, and that each of the sets in S is a clique. Then (G, k) ∈ PIC if and only if ∃ S: (G, k, S) ∈ PIC-CHECK.
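A verifier for PIC-CHECK might look like the following Python sketch. The encoding (vertex list, edge list, list of parts) and the name `pic_check` are assumptions of mine. The sketch also allows empty parts, a detail the problem statement does not settle.

```python
from itertools import combinations

def pic_check(vertices, edges, k, parts):
    """Return True iff `parts` splits the vertices into k cliques.

    Assumed encoding: `edges` is a list of 2-element pairs of an undirected
    graph, `parts` is a list of lists of vertices. Empty parts are allowed
    here (an assumption).
    """
    if len(parts) != k:
        return False
    listed = [v for part in parts for v in part]
    # Each vertex must occur in exactly one part.
    if len(listed) != len(set(listed)) or set(listed) != set(vertices):
        return False
    present = {frozenset(e) for e in edges}
    # Every two distinct vertices in the same part must be adjacent.
    return all(frozenset((u, v)) in present
               for part in parts
               for u, v in combinations(part, 2))
```

Each of the three checks takes time polynomial in the size of the graph, which is the point of the argument above.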

We complete the proof that PIC is NP-complete by showing that 3-COLOR ≤p PIC. Given an undirected graph G, we map G to the pair (G-bar, 3), where G-bar is the complement graph of G that has the same vertices and has edges exactly where G does not. If G has a three-coloring, then G-bar's vertices may be partitioned into the three color sets, and since no edge of G connects two vertices of the same color, every two distinct vertices of the same color are joined by an edge in G-bar, so the three color sets are cliques in G-bar. Similarly, if there is a partition of the vertices of G-bar into three cliques, we can assign a color to each clique and get a 3-coloring of G.
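The reduction itself is short enough to write out. In this Python sketch the graph encoding and the names `complement_graph`, `three_color_to_pic`, and `is_clique` are hypothetical, and graphs are assumed to have no self-loops.

```python
from itertools import combinations

def complement_graph(vertices, edges):
    """Edges of G-bar: exactly the pairs of distinct vertices not adjacent in G."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(vertices, 2)
            if frozenset((u, v)) not in present]

def three_color_to_pic(vertices, edges):
    """Map a 3-COLOR instance G to the PIC instance (G-bar, 3)."""
    return vertices, complement_graph(vertices, edges), 3

def is_clique(part, edges):
    """True iff every two distinct vertices of `part` are adjacent."""
    present = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in present for u, v in combinations(part, 2))
```

On the path 0 - 1 - 2, coloring vertices 0 and 2 alike gives the color class {0, 2}, which is a clique in the complement because (0, 2) is the one edge the path lacks.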

Many people tried to reduce CLIQUE to PIC (or worse, PIC to CLIQUE) but failed, often because they did not recognize the very different roles of the parameter k in the two problems.

• (b, 10) Prove that the language KNAPSACK is NP-complete.

We first prove that KNAPSACK ∈ NP. Let KNAPSACK-CHECK be the set of pairs consisting of a KNAPSACK instance and a subset S of the i's, such that the sum of the weights in the subset meets the weight target and the sum of the values in the subset meets or exceeds the value target. Membership in KNAPSACK-CHECK is easy to decide in P by just adding up the weights and the values. And an instance I is in KNAPSACK if and only if ∃S: (I, S) ∈ KNAPSACK-CHECK.
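As a rough Python sketch of the KNAPSACK-CHECK test (the encoding of an instance as a list of (weight, value) pairs plus two targets, and the name `knapsack_check`, are my own assumptions):

```python
def knapsack_check(items, wt, vt, subset):
    """Return True iff `subset` (a collection of item indices) hits the weight
    target exactly and meets or exceeds the value target.

    `items` is a list of (weight, value) pairs (an assumed encoding).
    """
    chosen = set(subset)
    if not chosen <= set(range(len(items))):
        return False                      # an index that names no item
    total_weight = sum(items[i][0] for i in chosen)
    total_value = sum(items[i][1] for i in chosen)
    return total_weight == wt and total_value >= vt
```

The work is two sums over at most k binary numbers, which is polynomial in the input length.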

To complete the proof, we show SUBSET-SUM ≤p KNAPSACK. We map an instance (a1,..., ak, t) to ((a1, a1),..., (ak, ak), t, t). Then a subset of the i's that meets the target of the SUBSET-SUM instance also meets both the weight and value targets of the KNAPSACK instance, and vice versa. (Some people set the values of the items, and the value target, to 0, which works fine except that these numbers are defined to be positive. I took off one point for this. Some others set each value and the value target to 1, which works because we must choose at least one item, and thus must meet or exceed the value target if we meet the weight target.)
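The mapping can be checked on small instances with a brute-force comparison. In the Python sketch below only `subset_sum_to_knapsack` corresponds to the reduction; the two exponential-time solvers are hypothetical helpers included purely to test it.

```python
from itertools import combinations

def subset_sum_to_knapsack(a, t):
    """Map (a1,...,ak, t) to the KNAPSACK instance ((a1,a1),...,(ak,ak), t, t)."""
    return [(ai, ai) for ai in a], t, t

def index_subsets(n):
    """All subsets of {0,...,n-1}, as tuples of indices."""
    for r in range(n + 1):
        yield from combinations(range(n), r)

def brute_subset_sum(a, t):
    """Exponential-time check, for testing only."""
    return any(sum(a[i] for i in s) == t for s in index_subsets(len(a)))

def brute_knapsack(items, wt, vt):
    """Exponential-time check, for testing only."""
    return any(sum(items[i][0] for i in s) == wt and
               sum(items[i][1] for i in s) >= vt
               for s in index_subsets(len(items)))
```

For a = [3, 5, 8], both solvers agree on every target t: for instance t = 11 is a yes-instance (3 + 8) and t = 4 is a no-instance on both sides.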

• Question 8 (35): These three questions all deal with languages of the form SINGC as defined above. If C is any class of things that have languages, then SINGC is the set of those things whose languages have exactly one element.

• (a, 15) Prove that SINGTM is not TD.

Either of the two proofs in Question 3 above suffices to solve this problem as well. There were other proofs -- one cute one was to show ETM ≤m SINGTM by altering the input TM by adding a new letter to its alphabet, and having it accept exactly one string containing the new letter.

• (b, 10) Prove that SINGCFG is TD.

There is a brute-force but valid proof in the style of Sipser's Chapter 4. Given an input CFG G, we calculate p, the constant in the CFL Pumping Lemma, and check every string of length up to 2p for membership in L(G), using the decidability of ACFG. If there are no such strings, there cannot be any larger strings (or else they would pump down) and we can reject. If there is exactly one, there also cannot be any larger ones and we can accept. (That one string must have length less than p, or else it would pump down.) And if there are two or more strings, of course we reject.

In fact this language is in P. We can test G for membership in ECFG by successively marking each non-terminal that can be used to derive a string, keeping track of the strings as we go. If L(G) = ∅, of course we reject. Otherwise we find a string w in L(G). We then create a grammar G' whose language is the intersection of L(G) and the regular language {x: x ≠ w}. Now if L(G') = ∅ we accept, and otherwise reject. (I'm using without proof the fact that the conversion from CFG's to PDA's and back again can be carried out in P.)
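The marking step, with a witness string recorded for each marked non-terminal, might be sketched as follows in Python. The grammar encoding (a dict from each non-terminal to a list of right-hand sides, each a list of symbols) and the name `cfg_witness` are assumptions of mine; every non-terminal is assumed to appear as a key. No running-time claim is made for this sketch, since the recorded witness strings can grow long.

```python
def cfg_witness(rules, start):
    """Return some string in L(G), or None if L(G) is empty.

    `rules` maps each non-terminal to a list of bodies; a body is a list of
    symbols, and any symbol that is not a key of `rules` is a terminal.
    """
    witness = {}                 # marked non-terminal -> a string it derives
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head in witness:
                continue
            for body in bodies:
                # A body is usable once all its non-terminals are marked.
                if all(sym in witness or sym not in rules for sym in body):
                    witness[head] = "".join(
                        witness[s] if s in rules else s for s in body)
                    changed = True
                    break
    return witness.get(start)
```

If the start symbol is never marked, the function returns None and the grammar is in ECFG; otherwise the returned string plays the role of w in the argument above.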

• (c, 10) Prove that SINGNFA is in co-NP.

This language is actually NL-complete, as can be shown with the Immerman-Szelepcsenyi Theorem, but our task here is easier. An NFA is in SINGNFA-bar if and only if its language is either empty or has two or more strings. Let X be the set of triples (N, u, v) such that either L(N) = ∅ or u and v are distinct strings in L(N). Clearly (N) is in SINGNFA-bar if and only if ∃u:∃v: (|u|, |v| ≤ |(N)|) ∧ ((N, u, v) ∈ X). And X is in P because we can test emptiness of L(N) by seeing whether any path in N goes from the start state to a final state, and we can also test membership in L(N) using the known algorithm for ANFA.
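A polynomial-time test for X can be sketched in Python as below. The NFA encoding is an assumption of mine: `delta` maps (state, symbol) pairs to sets of states, with the empty string "" as the symbol for ε-moves. All function names are hypothetical.

```python
def eps_closure(states, delta):
    """All states reachable from `states` using only ε-moves."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def nfa_accepts(delta, start, accepting, w):
    """Membership test by tracking the set of currently possible states."""
    current = eps_closure({start}, delta)
    for ch in w:
        step = set()
        for q in current:
            step |= set(delta.get((q, ch), ()))
        current = eps_closure(step, delta)
    return bool(current & set(accepting))

def nfa_is_empty(delta, start, accepting):
    """L(N) is empty iff no accepting state is reachable from the start state."""
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for (p, _symbol), targets in delta.items():
            if p == q:
                for r in targets:
                    if r not in seen:
                        seen.add(r)
                        stack.append(r)
    return not (seen & set(accepting))

def in_X(delta, start, accepting, u, v):
    """(N, u, v) is in X iff L(N) is empty or u, v are distinct members of L(N)."""
    if nfa_is_empty(delta, start, accepting):
        return True
    return (u != v and nfa_accepts(delta, start, accepting, u)
            and nfa_accepts(delta, start, accepting, v))
```

For an NFA accepting 0*, the pair u = ε, v = 0 is a certificate that the language is not a singleton; for an NFA accepting only 01, no pair of distinct strings is.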