# Solutions to Second Midterm Exam, Spring 2011

### Directions:

• Answer the problems on the exam pages.
• There are eight problems for 125 total points. Actual scale was A=100, C=64.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.
• The first six questions are true/false, with five points for the correct boolean answer and up to five for a correct justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. -- note that there is no reason not to guess if you don't know.

```
Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 10 points
Q6: 10 points
Q7: 40 points
Q8: 25 points

Total: 125 points
```

Question text is in black, solutions in blue.

Some questions refer to two particular kinds of one-tape deterministic Turing machines. A Double-move Turing machine (DMTM) always moves exactly two squares left or right on a move. A Double-right Turing machine (DRTM) moves two squares when it moves right, but one square when it moves left. Recall that our Turing machines start on the leftmost square of the tape, on the first letter of the input (if any), and do not move if they are ordered to move left from the leftmost square. We define ADMTM to be the set of pairs (M, w) where M is a DMTM, w is a string over the correct alphabet, and w ∈ L(M). We define ADRTM similarly.

The language REGCFG is the set of context-free grammars G such that L(G) is regular.

The language REGTM is the set of Turing machines M such that L(M) is regular.

The language CFLDFA is the set of DFA's M such that L(M) is a context-free language.

Recall that for two languages A and B, A ≤m B means that there is a function f, computed by a Turing machine that always halts, such that for any string w, w ∈ A if and only if f(w) ∈ B.

• Question 1 (10): True or false with justification: Any regular language is the language of some DMTM, as defined above.

FALSE. Consider the language (10)*. As the DMTM moves, it will only ever see the odd-numbered positions of the tape and thus never change the even positions. Any decision it makes on reading 11 will also be made on 10, but one of these two decisions must be wrong because 11 is not in the language and 10 is.
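The claim that the head only ever sees odd-numbered squares can be checked with a small trace. This is a minimal sketch, not part of the exam solution: the class `DmtmHead`, the method `trace`, and the encoding of moves as a string of `L` and `R` are all hypothetical names chosen for illustration, and squares are assumed to be numbered from 1.

```java
import java.util.ArrayList;
import java.util.List;

class DmtmHead {
    // Squares are numbered from 1 and the head starts on square 1.
    // Each character of moves is 'R' or 'L' (anything other than 'R' is
    // treated as 'L'); every move shifts the head by two squares.
    static List<Integer> trace(String moves) {
        List<Integer> visited = new ArrayList<>();
        int pos = 1;
        visited.add(pos);
        for (char m : moves.toCharArray()) {
            if (m == 'R') {
                pos += 2;
            } else if (pos > 2) {
                pos -= 2;
            } // otherwise the left move would leave the tape, so the head stays put
            visited.add(pos);
        }
        return visited;
    }

    public static void main(String[] args) {
        // Every square in the printed list is odd.
        System.out.println(trace("RRLRLLL"));
    }
}
```

Since 10 and 11 differ only on square 2, no sequence of moves lets the machine tell them apart.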

• Question 2 (10): True or false with justification: REGCFG ≤m REGTM, where these languages are defined above.

TRUE. We want f to be a function from grammars to TM's, so that f(G) = M and L(G) is regular if and only if L(M) is. The easiest way to do this is to have L(G) = L(M). Given G, f constructs a TM that implements a CKY parser for G, or otherwise solves the decision problem for G, by accepting its input w if and only if w ∈ L(G).
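To make concrete what the machine f(G) computes, here is a rough sketch of a CKY membership test. It assumes G has already been converted to Chomsky normal form and that rules are encoded as string arrays; the class `CkySketch`, that encoding, and the example grammar are hypothetical choices for illustration, not the construction used in lecture.

```java
import java.util.HashSet;
import java.util.Set;

class CkySketch {
    // Hypothetical encoding of a grammar in Chomsky normal form:
    // a binary rule A -> B C is {"A", "B", "C"}; a terminal rule A -> a is {"A", "a"}.
    // This sketch ignores a possible rule S -> ε, so it rejects the empty string.
    static boolean accepts(String[][] binary, String[][] terminal, String start, String w) {
        int n = w.length();
        if (n == 0) return false;
        // table[i][len] holds the variables that derive w.substring(i, i + len)
        @SuppressWarnings("unchecked")
        Set<String>[][] table = new Set[n][n + 1];
        for (int i = 0; i < n; i++)
            for (int len = 1; len <= n - i; len++)
                table[i][len] = new HashSet<>();
        for (int i = 0; i < n; i++)
            for (String[] r : terminal)
                if (r[1].equals(String.valueOf(w.charAt(i)))) table[i][1].add(r[0]);
        for (int len = 2; len <= n; len++)
            for (int i = 0; i + len <= n; i++)
                for (int split = 1; split < len; split++)
                    for (String[] r : binary)
                        if (table[i][split].contains(r[1])
                                && table[i + split][len - split].contains(r[2]))
                            table[i][len].add(r[0]);
        return table[0][n].contains(start);
    }

    public static void main(String[] args) {
        // Example grammar for {0^n 1^n : n >= 1}: S -> AB | AT, T -> SB, A -> 0, B -> 1.
        String[][] binary = {{"S", "A", "B"}, {"S", "A", "T"}, {"T", "S", "B"}};
        String[][] terminal = {{"A", "0"}, {"B", "1"}};
        System.out.println(accepts(binary, terminal, "S", "0011"));
        System.out.println(accepts(binary, terminal, "S", "001"));
    }
}
```

The point for the reduction is only that such a procedure always halts, so a Turing machine with language exactly L(G) can be built from G.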

• Question 3 (10): True or false with justification: Recall the construction from the proof in Sipser that ALLCFG is not decidable. Define the function f to take a Turing machine M and a string w and output the grammar, given by that construction, for the language of all strings that are not accepting computation histories of M on w. Then f proves that ATM ≤m REGCFG, i.e., f is a mapping reduction from ATM to REGCFG.

FALSE. If f were such a function, it would take pairs (M, w) where M does not accept w to grammars with non-regular languages. But the given construction takes such pairs to Σ*, which is regular.

It's a more interesting question whether this f is a mapping reduction from ATM to the complement of REGCFG. For this to be true, the language of the grammar f(M, w) would have to be non-regular whenever M does accept w. But the language of this grammar is the complement of the set of accepting computation histories of M on w. In one manner of speaking, there is only one of these, but we probably want to consider a string to be an accepting computation history no matter how many blanks it has at the end of each of its configurations. The set of such strings, however, is still regular, so its complement is regular and f is not a reduction to the complement of REGCFG -- it always produces grammars with regular languages.

• Question 4 (10): True or false with justification: The language CFLDFA, defined above, is Turing decidable.

TRUE. Since every regular language is a CFL, every DFA has a language that is a CFL, and our decision procedure should return "true" for every valid DFA. It only has to tell whether its input represents a valid DFA, which an always-halting Turing machine clearly can do.

• Question 5 (10): True or false with justification: Any Java program that always outputs the same string w must be longer, as a string, than w itself.

FALSE. One argument for this uses our discussion of the Recursion Theorem. In lecture we proved that there exists a Turing machine SELF that outputs its own description. By the same argument, one could build a Java program that outputs its own description, and hence would not be longer than its output.

But in any case it is easy to construct a Java program that gives an output longer than its own source code:

``````
public class ManyAs {
    public static void main(String[] args) {
        for (int i = 0; i < 1000000; i++)
            System.out.println("a");
    }
}
``````

• Question 6 (10): True or false with justification: It is not true that REGTM ≤m ATM.

TRUE. In lecture we proved that the complement of ATM mapping reduces to REGTM. Our function takes a Turing machine M and a string w and produces a new Turing machine N as follows. On input x, N accepts if x is of the form 0ⁿ1ⁿ, and otherwise runs M on w and accepts x if M accepts w. L(N) is the non-regular language {0ⁿ1ⁿ: n ≥ 0} if M does not accept w, and is the regular language Σ* otherwise.

If REGTM ≤m ATM were true, REGTM would be TR, but the complement of ATM is not TR, and thus any language it reduces to is not TR.

• Question 7 (40): Two types of Turing machines, DMTM's and DRTM's, were defined above.

• (a,10) If M is any ordinary one-tape Turing machine, prove that there exists a DRTM M' such that L(M) = L(M').

M' will have the same alphabet as M and have copies of all of M's states. Any transition of M that moves left is replicated exactly in M'. If (q,a;r,b,R) is any right-moving transition in M, we add a new state s to M' (a different new state for every such transition), replace (q,a;r,b,R) with (q,a;s,b,RR) and then add a transition (s,x;r,x,L) for every letter x in the alphabet. Thus any computation of M is replicated exactly by M' -- every right move in M is matched by a double-right move followed by a left move. Clearly any string w will now be accepted by M' if and only if it is accepted by M, so L(M) = L(M').
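The construction can be tried out on a small machine. The following is a sketch under assumptions that are not in the exam: transitions are stored in a map from `"state,symbol"` to `{newState, writtenSymbol, move}`, the start state is named `q0`, the accepting state is named `acc`, the blank is `_`, and the simulator gives up after a fixed number of steps. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class TmToDrtm {
    static final char BLANK = '_';

    // Hypothetical encoding: key "state,symbol" -> {newState, writtenSymbol, move}.
    // An ordinary TM uses the moves "L" and "R"; a DRTM uses "L" and "RR".
    static Map<String, String[]> toDrtm(Map<String, String[]> delta, String tapeAlphabet) {
        Map<String, String[]> out = new HashMap<>();
        int fresh = 0;
        for (Map.Entry<String, String[]> e : delta.entrySet()) {
            String[] t = e.getValue();
            if (t[2].equals("L")) {
                out.put(e.getKey(), t); // left moves are copied unchanged
            } else {
                String s = "s" + fresh++; // assumes M has no states named s0, s1, ...
                out.put(e.getKey(), new String[] {s, t[1], "RR"});
                for (char x : tapeAlphabet.toCharArray()) {
                    // from s, leave the scanned letter alone and step back left
                    out.put(s + "," + x, new String[] {t[0], String.valueOf(x), "L"});
                }
            }
        }
        return out;
    }

    // Runs either kind of machine from state "q0"; "acc" is the accepting state.
    // Reports false if the machine has not accepted within maxSteps steps.
    static boolean accepts(Map<String, String[]> delta, String input, int maxSteps) {
        StringBuilder tape = new StringBuilder(input);
        String state = "q0";
        int head = 0;
        for (int step = 0; step < maxSteps && !state.equals("acc"); step++) {
            while (head >= tape.length()) tape.append(BLANK);
            String[] t = delta.get(state + "," + tape.charAt(head));
            if (t == null) return false; // no transition defined: halt and reject
            tape.setCharAt(head, t[1].charAt(0));
            state = t[0];
            if (t[2].equals("L")) head = Math.max(0, head - 1);
            else head += t[2].length(); // "R" is one square, "RR" is two
        }
        return state.equals("acc");
    }

    // Example machine: accepts the strings over {0,1} whose last letter is 1.
    static Map<String, String[]> lastLetterIsOne() {
        Map<String, String[]> d = new HashMap<>();
        d.put("q0,0", new String[] {"q0", "0", "R"});
        d.put("q0,1", new String[] {"q0", "1", "R"});
        d.put("q0,_", new String[] {"q1", "_", "L"});
        d.put("q1,1", new String[] {"acc", "1", "L"});
        return d;
    }

    public static void main(String[] args) {
        Map<String, String[]> m = lastLetterIsOne();
        Map<String, String[]> mPrime = toDrtm(m, "01_");
        for (String w : new String[] {"", "0", "1", "01", "10", "0011"}) {
            System.out.println("\"" + w + "\": " + accepts(m, w, 1000)
                    + " / " + accepts(mPrime, w, 1000));
        }
    }
}
```

On each test string the two columns printed by `main` agree, which is what L(M) = L(M') predicts; M' simply uses about twice as many steps for the right moves.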

To reduce ATM to ADRTM, we need a function that takes a pair (M, w), where M is an ordinary TM, and produces a pair (M', w), where M' is the machine built from M in part (a) and w is the same string in the input and output.

To reduce ADRTM to ATM, we need to take a pair (M, w), where now M is an arbitrary DRTM, and produce a pair (M', w) where M' is now an ordinary TM such that L(M) = L(M'). To do this we again have M' have the same alphabet as M, have copies of all the states of M, and copy all left-moving transitions from M into M'. We again create a new state s in M' for every right-moving transition (q,a;r,b,RR) in M, and give M' the new transition (q,a;s,b,R) and a new transition (s,x;r,x,R) for every letter x in the alphabet. Now any computation of M is replicated exactly by M', taking two single-right moves to mimic any double-right move of M. Clearly a string w is accepted by M' if and only if it is accepted by M, so our mapping reduction from (M, w) to (M', w) is correct.
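As an illustration of this second reduction, the sketch below rewrites a DRTM transition table into an ordinary one and returns it together with the unchanged string w. The map encoding (`"state,symbol"` to `{newState, writtenSymbol, move}`), the fresh state names `s0, s1, ...`, and the class and method names are assumptions made for this example only.

```java
import java.util.HashMap;
import java.util.Map;

class DrtmToTm {
    // Hypothetical encoding: key "state,symbol" -> {newState, writtenSymbol, move},
    // where a DRTM uses the moves "L" and "RR" and an ordinary TM uses "L" and "R".
    static Map<String, String[]> toOrdinary(Map<String, String[]> drtm, String tapeAlphabet) {
        Map<String, String[]> out = new HashMap<>();
        int fresh = 0;
        for (Map.Entry<String, String[]> e : drtm.entrySet()) {
            String[] t = e.getValue();
            if (t[2].equals("L")) {
                out.put(e.getKey(), t); // left moves are copied unchanged
            } else {
                String s = "s" + fresh++; // assumes no existing state is named s0, s1, ...
                out.put(e.getKey(), new String[] {s, t[1], "R"}); // first single step
                for (char x : tapeAlphabet.toCharArray()) {
                    // second single step, leaving the scanned letter unchanged
                    out.put(s + "," + x, new String[] {t[0], String.valueOf(x), "R"});
                }
            }
        }
        return out;
    }

    // The mapping reduction (M, w) -> (M', w): only the machine is rewritten.
    static Object[] reduce(Map<String, String[]> drtm, String tapeAlphabet, String w) {
        return new Object[] {toOrdinary(drtm, tapeAlphabet), w};
    }

    public static void main(String[] args) {
        Map<String, String[]> drtm = new HashMap<>();
        drtm.put("q0,0", new String[] {"q1", "1", "RR"});
        drtm.put("q1,1", new String[] {"q0", "1", "L"});
        Map<String, String[]> tm = toOrdinary(drtm, "01_");
        for (Map.Entry<String, String[]> e : tm.entrySet()) {
            System.out.println(e.getKey() + " -> " + String.join(",", e.getValue()));
        }
    }
}
```

The rewriting touches each transition once and always halts, which is all a mapping reduction requires of the function.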

• (c,10) If M is any ordinary one-tape Turing machine, prove that there is a language X such that X is the language of some DMTM and L(M) ≤m X.

Let X be the set of all strings w = w₁w₂...wₙ such that the string odd(w) = w₁w₃...wₖ, where k = n if n is odd and k = n - 1 if n is even, is in L(M). Note that whether a string is in X is entirely independent of which letters occur in its even-numbered positions.

X is the language of a DMTM M' that has the same states as M, but always moves two spaces left when M moves one space left, and moves two spaces right when M moves one space right. If w is any string at all, the computation of M' on w follows the computation of M on odd(w), and will thus accept if and only if odd(w) is in L(M), which is the definition of when w is in X.

We let the function f take an arbitrary string v = v₁...vₙ to the string v₁v₁v₂v₂...vₙvₙ. Note that then odd(f(v)) = v. Clearly v is in L(M) if and only if f(v) is in X, so L(M) ≤m X.
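The identity odd(f(v)) = v is easy to check mechanically. A minimal sketch, with the hypothetical class name `OddDouble` and with strings indexed from 0 in the code (so the odd-numbered positions of the text are the even indices here):

```java
class OddDouble {
    // odd(w): the letters in positions 1, 3, 5, ... of w, counting from 1.
    static String odd(String w) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < w.length(); i += 2) sb.append(w.charAt(i));
        return sb.toString();
    }

    // f(v): every letter of v written twice.
    static String f(String v) {
        StringBuilder sb = new StringBuilder();
        for (char c : v.toCharArray()) sb.append(c).append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        String v = "0110";
        System.out.println(f(v));
        System.out.println(odd(f(v)).equals(v));
    }
}
```

Any other choice of letters for the even-numbered positions would work equally well for f; doubling is just the simplest one to describe.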

No one gave me a good definition of X, so no one got full credit for this problem. Some people defined X as, for example, the set of strings w₁\$w₂\$...wₙ\$ such that w₁w₂...wₙ is in L(M). This language satisfies L(M) ≤m X, but it is not the language of any DMTM because a DMTM cannot tell whether the even-numbered squares of the input contain \$'s.

• (d,10) Prove that ATM ≤m ADMTM.

We want a function g such that g(M, w) = (M', w') where M is an arbitrary ordinary TM, w is a string over the input alphabet of M, M' is a DMTM, and w' is another string, and M accepts w if and only if M' accepts w'. Given any M, let M' be the DMTM constructed in part (c) and let w' = f(w), where f is the function of part (c). Now the computation of M' on w' will mimic the computation of M on w step for step, and either both will accept or both will not.

• Question 8 (25): Recall the Regular Language Pumping Lemma and the CFL Pumping Lemma. We say that a language A "regular-pumps at the beginning" if there exists a number p such that for every string w with w ∈ A and |w| ≥ p, w can be written as xyz where |xy| ≤ p, |y| > 0, and for all i ≥ 0, xyⁱz is in A.

We say that A "CFL-pumps at the beginning" if there exists p such that for every string w with w ∈ A and |w| ≥ p, w can be written as uvxyz where |uvxy| ≤ p, |vy| > 0, and for all i ≥ 0, uvⁱxyⁱz is in A.

• (a,10) Review the argument that every regular language regular-pumps at the beginning. You may be informal, but make clear why the condition |xy| ≤ p can be guaranteed.

All we need here is the proof of the RLPL as given in Sipser, as it says that any regular language "regular-pumps at the beginning". Let X be any regular language, let M be a DFA such that L(M) = X, and let p be the number of states of M. Let w be any string in X with |w| ≥ p. As M reads the first p letters of w, it takes on p+1 states -- the state after reading 0 letters, after reading 1 letter, after 2 letters, and so forth until the state after reading p letters. Since there are only p states in M, some state must occur twice in this sequence. We let x be the string read up to the first visit to this state, xy be the string read up to the second visit, and z be the part of w read after the second visit. Then the conclusions of the RLPL are satisfied, and |xy| ≤ p because the revisit must occur during the first p letters of w.
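The pigeonhole step can be turned into a procedure that actually finds x, y, and z. The sketch below assumes a DFA over the alphabet {0,1} given as a transition array; the class `PumpAtBeginning`, its method names, and the example DFA are hypothetical and serve only to illustrate the argument.

```java
import java.util.Arrays;

class PumpAtBeginning {
    // delta[q][c] is the next state from state q on letter c, for letters '0' and '1'.
    // Returns {x, y, z} with w = xyz, |xy| <= p and |y| > 0, where p = number of states.
    static String[] split(int[][] delta, int start, String w) {
        int p = delta.length;
        if (w.length() < p) throw new IllegalArgumentException("need |w| >= p");
        int[] firstSeen = new int[p]; // letters read when each state was first entered
        Arrays.fill(firstSeen, -1);
        int q = start;
        firstSeen[q] = 0;
        for (int i = 1; i <= p; i++) {
            q = delta[q][w.charAt(i - 1) - '0'];
            if (firstSeen[q] >= 0) { // second visit to q, within the first p letters
                int j = firstSeen[q];
                return new String[] {w.substring(0, j), w.substring(j, i), w.substring(i)};
            }
            firstSeen[q] = i;
        }
        throw new IllegalStateException("p + 1 visits to p states must repeat a state");
    }

    static boolean accepts(int[][] delta, int start, boolean[] accepting, String w) {
        int q = start;
        for (char c : w.toCharArray()) q = delta[q][c - '0'];
        return accepting[q];
    }

    // Builds x y^i z from the parts returned by split.
    static String pump(String[] parts, int i) {
        StringBuilder sb = new StringBuilder(parts[0]);
        for (int k = 0; k < i; k++) sb.append(parts[1]);
        return sb.append(parts[2]).toString();
    }

    public static void main(String[] args) {
        // Example DFA: accepts strings whose number of 1's is divisible by 3.
        int[][] delta = {{0, 1}, {1, 2}, {2, 0}};
        boolean[] accepting = {true, false, false};
        String[] parts = split(delta, 0, "10110");
        System.out.println(Arrays.toString(parts));
        for (int i = 0; i <= 3; i++) {
            System.out.println(pump(parts, i) + " " + accepts(delta, 0, accepting, pump(parts, i)));
        }
    }
}
```

The loop in `split` never reads past the first p letters, which is exactly why |xy| ≤ p holds.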

• (b,15) Prove that it is not true that every context-free language CFL-pumps at the beginning.

Let G be the grammar S → 0S1, S → ε, so that L(G) is the language {0ⁿ1ⁿ: n ≥ 0}. Given any number p, let w be the string 0ᵖ1ᵖ. If w is written uvxyz with |uvxy| ≤ p, then the strings v and y consist entirely of 0's. If we pump up or down, then, we change the number of 0's in the string without changing the number of 1's, and take the string out of L(G). Thus this language does not "CFL-pump at the beginning", no matter which p we choose. The CFL Pumping Lemma says that we can pick u, v, x, y, and z so that |vxy| ≤ p, not so that |uvxy| ≤ p.
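For small p the argument can be confirmed by brute force: try every way of writing w as uvxyz with |uvxy| ≤ p and |vy| > 0, and test a few values of i. If no split survives even i = 0, 1, 2, then certainly none pumps for all i. The class `BeginPump` and its method names below are hypothetical, and the search is only a sanity check on the proof, not a replacement for it.

```java
import java.util.function.Predicate;

class BeginPump {
    // Membership test for {0^n 1^n : n >= 0}.
    static boolean inLanguage(String s) {
        int n = s.length();
        if (n % 2 != 0) return false;
        for (int i = 0; i < n; i++) {
            if (s.charAt(i) != (i < n / 2 ? '0' : '1')) return false;
        }
        return true;
    }

    static String rep(String s, int i) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < i; k++) sb.append(s);
        return sb.toString();
    }

    // True if some split w = uvxyz with |uvxy| <= p and |vy| > 0 keeps
    // u v^i x y^i z in the language for every i in 0..maxI.
    static boolean hasSplit(String w, int p, int maxI, Predicate<String> lang) {
        int limit = Math.min(p, w.length());
        for (int a = 0; a <= limit; a++)                // u = w[0, a)
            for (int b = a; b <= limit; b++)            // v = w[a, b)
                for (int c = b; c <= limit; c++)        // x = w[b, c)
                    for (int d = c; d <= limit; d++) {  // y = w[c, d), z = w[d, end)
                        if ((b - a) + (d - c) == 0) continue; // need |vy| > 0
                        String u = w.substring(0, a), v = w.substring(a, b);
                        String x = w.substring(b, c), y = w.substring(c, d);
                        String z = w.substring(d);
                        boolean ok = true;
                        for (int i = 0; i <= maxI && ok; i++) {
                            ok = lang.test(u + rep(v, i) + x + rep(y, i) + z);
                        }
                        if (ok) return true;
                    }
        return false;
    }

    public static void main(String[] args) {
        for (int p = 1; p <= 5; p++) {
            String w = rep("0", p) + rep("1", p);
            System.out.println("p=" + p + ": " + hasSplit(w, p, 2, BeginPump::inLanguage));
        }
    }
}
```

Every line printed by `main` reports `false`, since each candidate split already fails at i = 0 by deleting 0's without deleting 1's.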

Only one person got this right, which surprised me. Two people thought that a finite language would not CFL-pump at the beginning. But the condition only says that all sufficiently long strings in the language must pump. If the language is finite, we can choose p so that there are no strings in the language of length p or greater, so the condition is vacuously satisfied.