CMPSCI 501: Theory of Computation

Solutions to Second Midterm Exam, Spring 2015

David Mix Barrington

26 March 2015

Directions:

Answer the problems on the exam pages.
There are eight problems (some with multiple parts) for 125 total points plus 10 extra credit. Actual scale was A = 100, C = 60.
If you need extra space use the back of a page.
No books, notes, calculators, or collaboration.
Many useful definitions are given below.
The first six questions are statements -- in each case say whether the statement is true or false and give a convincing justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. You get five points for the correct boolean answer (so there is no reason not to guess if you don't know) and up to five for the justification.

  Q1: 10 points
  Q2: 10 points
  Q3: 10 points
  Q4: 10 points
  Q5: 10 points
  Q6: 10 points
  Q7: 35 points
  Q8: 30+10 points
 Total: 125+10 points

Question text is in black, solutions in blue.

Correction in red added 20 March 2016.

The set N of natural numbers is {0, 1, 2, 3,...}, not quite as defined in Sipser.

If C is any class of computers, such as DFA's, CFG's, LBA's, TM's, strange variant TM's, etc.:

A_C = {<M, w>: M is a computer in C and w ∈ L(M)}
E_C = {<M>: M is a computer in C and L(M) = ∅}
ALL_C = {<M>: M is a computer in C and L(M) = Σ^*}
ALLEVEN_C = {<M>: M is a computer in C and L(M) includes all strings of even length in Σ^*}. That is, ALLEVEN_C = {<M>: (ΣΣ)^* ⊆ L(M)}.
ALLODD_C = {<M>: M is a computer in C and L(M) includes all strings of odd length in Σ^*}. That is, ALLODD_C = {<M>: Σ(ΣΣ)^* ⊆ L(M)}.

A language is Turing recognizable (TR) if it is equal to L(M) for some Turing machine M.

A language is Turing decidable (TD) if it is equal to L(M) for some Turing machine M that halts on every input.

A language X is co-TR if and only if its complement is TR.

A function f from strings to strings is Turing computable if there exists a TM M such that for any string w, M when started on w halts with f(w) on its tape.

Recall that if A and B are two languages, A is mapping reducible to B, written A ≤_m B, if there exists a Turing computable function f from Σ^* to Σ^* such that for any string w, w ∈ A ⇔ f(w) ∈ B.

A Deterministic Infinite Automaton (DIA) has a state set Q = {q_i: i ∈ N} = {q₀, q₁, q₂,...}, a start state q₀, a nonempty alphabet Σ, a final state set F ⊆ Q, and a transition function δ from Q × Σ to Q. It computes like a DFA, beginning at the left end of its finite input string, moving right one letter and updating its state on every computation step, and finally accepting the input if and only if it finishes in a final state.

A Turing Computable DIA (TCDIA) is a DIA where the set F is Turing decidable and the function δ is Turing computable.

An All-String Turing Machine (ASTM) is a deterministic three-tape Turing machine with the following restrictions. The input alphabet Σ is {0, 1}, and the tape alphabet Γ contains Σ, a blank symbol, and perhaps other letters.

Tape 1 is read-only and gets the input string w on it, with a marker symbol at each end.

Tape 2, at any time during the computation, contains a string x ∈ Σ^* between two marker symbols. At the beginning of the computation x is the empty string. The computation is divided into phases, during each of which Tape 2 remains unchanged. After the first phase x is changed to 0, after the second phase it is changed to 1, after the third to 00, and so on through all the possible strings of Σ^* until or unless the machine accepts or rejects. No other computation takes place while Tape 2 is being updated at the end of each phase.

Tape 3 is read-write, and after every phase it is erased and reset to be blank, with two end markers that restrict the size of the useful portion of the tape to the current length of x.

As with the language of an ordinary TM, the language of an ASTM is the set of input strings w that cause it to eventually accept.

Question 1 (10): True or false with justification: The language ALLEVEN_CFG ∩ ALLODD_CFG is not Turing decidable.
TRUE. This language is exactly ALL_CFG, which was shown in the text and in lecture to not be TD.
Question 2 (10): True or false with justification: The language ALLEVEN_CFG is Turing recognizable.
FALSE. We show that A_TM-bar ≤_m ALLEVEN_CFG. The construction involving accepting computation histories in the book and lecture shows that A_TM-bar ≤_m ALL_CFG, since it takes M and w and creates a grammar that generates all strings that are not accepting computation histories of M on w (in the format where every other machine configuration is written backward).
So all we need to do is to show ALL_CFG ≤_m ALLEVEN_CFG. This means that given any grammar G, we must create a grammar G' that generates all even-length strings if and only if G generates all strings. One way to do this is to have G' generate all even-length strings whose odd-numbered letters form a string in L(G), and whose even-numbered letters are arbitrary. We can do this as follows. Assume G is in Chomsky normal form. Form G' by replacing every rule of the form A → a by the rules A → A'X, A' → a, and X → b for every letter b in Σ.
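The rule replacement above can be tried out on a small grammar. The following is only a sketch under assumptions that are not part of the exam: a grammar is stored as a Python dict from nonterminal names to lists of right-hand-side tuples, the names `pad_grammar` and `generates` are made up for illustration, and the fresh nonterminals `X` and `A'` are assumed not to clash with names already in G. The membership test is a plain memoized search that is only meant for Chomsky normal form grammars.

```python
from functools import lru_cache

def pad_grammar(rules, sigma):
    """Build G' from a CNF grammar G (hypothetical dict encoding).

    Every rule A -> a becomes A -> A' X and A' -> a, with X -> b for each b in sigma.
    """
    letters = set(sigma)
    new_rules = {"X": [(b,) for b in sigma]}   # X generates any single letter
    for head, bodies in rules.items():
        new_rules.setdefault(head, [])
        for body in bodies:
            if len(body) == 1 and body[0] in letters:
                primed = head + "'"            # assumed to be a fresh name
                if (primed, "X") not in new_rules[head]:
                    new_rules[head].append((primed, "X"))
                new_rules.setdefault(primed, []).append(body)
            else:
                new_rules[head].append(body)   # A -> B C and S -> epsilon are kept
    return new_rules

def generates(rules, start, w):
    """Membership test for a CNF grammar by memoized splitting of w."""
    @lru_cache(maxsize=None)
    def derives(sym, s):
        for body in rules.get(sym, []):
            if len(body) == 0 and s == "":
                return True
            if len(body) == 1 and s == body[0]:
                return True
            if len(body) == 2:
                for i in range(1, len(s)):
                    if derives(body[0], s[:i]) and derives(body[1], s[i:]):
                        return True
        return False
    return derives(start, w)

# Example: G generates 0^n for n >= 1, so G' should generate the even-length
# strings whose odd-numbered letters are all 0.
G = {"S": [("S", "S"), ("0",)]}
G_prime = pad_grammar(G, "01")
```

In this example `generates(G_prime, "S", "0100")` holds while `generates(G_prime, "S", "0110")` does not, since the third letter of the second string is a 1.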
An alternate solution is to show that ALLEVEN_CFG and ALLODD_CFG are Turing equivalent, using an argument similar to that of Question 8 (b) on this exam. If we show both ALLEVEN_CFG ≤_m ALLODD_CFG and ALLODD_CFG ≤_m ALLEVEN_CFG, we know that if one of these languages is TR, then so is the other. If both are TR, their intersection is TR, and the proof that ALL_CFG is not TD (from text and lecture) shows that it is also not TR.
Question 3 (10): True or false with justification: There exists a DIA whose language is neither TR nor co-TR.
TRUE. In fact any language X at all is the language of some DIA. To see this, note that we can design δ to take every possible string to a different state. For example, if Σ has k letters a₁,..., a_k, we can define δ(q_n, a_i) to be q_{(k+1)n + i}, so that δ^*(q₀, w) is the number denoted by the sequence of subscripts of the letters of w in base k+1 notation. Once we have a state for each string, we just define F to be the set of states to which δ^* takes the strings in X. Of course if X is not a TD language, F will not be TD either.
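As a sanity check on the numbering of states, here is a small sketch of the map from strings to state indices. The function names and the choice of Σ = {a, b} are assumptions made for illustration. Because the base-(k+1) digits run from 1 to k and never use 0, distinct strings get distinct numbers.

```python
from itertools import product

def state_index(w, sigma):
    """Index n of the state q_n reached from q_0 on w, using
    delta(q_n, a_i) = q_{(k+1)n + i} with letters numbered 1..k."""
    k = len(sigma)
    n = 0
    for letter in w:
        i = sigma.index(letter) + 1
        n = (k + 1) * n + i
    return n

def all_strings(sigma, max_len):
    """All strings over sigma of length at most max_len."""
    for length in range(max_len + 1):
        for tup in product(sigma, repeat=length):
            yield "".join(tup)

def final_states_for(language, sigma):
    """The set F of state indices that makes the DIA accept exactly `language`
    (only usable here for a finite sample of the language)."""
    return {state_index(w, sigma) for w in language}

def dia_accepts(w, sigma, final_states):
    return state_index(w, sigma) in final_states
```

For example, with `sigma = "ab"` the string `ab` lands in state 3·1 + 2 = 5 and `ba` lands in state 3·2 + 1 = 7.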
Question 4 (10): True or false with justification: The language of any TCDIA is TD.
TRUE. We can design a TM that stores the current state of the TCDIA on a tape. It then just scans the input, using the TM for δ to update the state after each letter and using the decider for F at the end to decide whether to accept. This is a finite number of δ computations followed by one F computation, and the hypotheses say that each of these will eventually halt.
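The decider described above is a single loop. A minimal sketch, in which Python functions stand in for the Turing machines computing δ and deciding F, and the example TCDIA (equal numbers of 0's and 1's) is an illustration that does not appear on the exam:

```python
def tcdia_decide(w, delta, is_final):
    """Run a TCDIA on w: finitely many delta steps, then one test for F.
    States are represented by their indices in N, starting at q_0."""
    state = 0
    for letter in w:
        state = delta(state, letter)
    return is_final(state)

# Example TCDIA: the state encodes the integer d = (#0's - #1's) seen so far,
# using the pairing d >= 0 -> 2d and d < 0 -> -2d - 1.
def encode(d):
    return 2 * d if d >= 0 else -2 * d - 1

def decode(n):
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)

def delta(n, letter):
    d = decode(n)
    return encode(d + 1 if letter == "0" else d - 1)

def is_final(n):
    return n == 0   # difference zero: equally many 0's and 1's
```

The example language is not regular, which shows that a TCDIA can do more than a DFA even though its language is always TD.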
Question 5 (10): True or false with justification: The set of DIA languages and the set of TCDIA languages are both uncountable.
FALSE. Since the set of DIA languages is the set of all languages, it is in fact uncountable. But the set of TCDIA languages is countable, because each such language can be specified by a finite string, giving the descriptions of the Turing machines for that TCDIA's version of δ and F.
It also follows, from the fact that each TCDIA language is TD, that there are countably many, since we observed that there are only countably many TD languages.
Question 6 (10): True or false with justification: If a Turing machine M prints out the string w no matter what its input is, the description of M must be at least as long as w.
FALSE. The fact that a briefly-described machine could possibly output a very long string should be pretty obvious, though it's not as obvious to prove. By the Recursion Theorem, since we can make a TM that inputs a string w and prints out ww, there must be some TM R that prints out <R><R> (two copies of R's description), a string that is clearly longer than <R>.
A more concrete proof defines a family of machines {M_w} for each binary string w, where M_w ignores its input and prints out n ones, where n is the number denoted by w in binary notation. We could build such a machine with description length O(|w|), by having it write out w on its tape and then run some code that is constant for all w. But the length of the output (if, say, w is a string of ones) is about 2^|w|, which is larger than the machine description length for long enough w.
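The growth comparison can be checked numerically. In this sketch the "description" of M_w is modeled, purely as an assumption, by w followed by a separator and a fixed piece of text standing in for the code that is the same for every w.

```python
# Stand-in for the part of M_w's description that does not depend on w.
FIXED_CODE = "decode tape as binary n; print n ones"

def description(w):
    """Hypothetical description of M_w: the string w plus constant code."""
    return w + "#" + FIXED_CODE

def output(w):
    """What M_w prints on any input: n ones, where w denotes n in binary."""
    return "1" * int(w, 2)

for m in (3, 10, 16):
    w = "1" * m
    print(m, len(description(w)), len(output(w)))
```

The description grows linearly in |w| while the output for w = 1^m has length 2^m - 1, so the output is longer than the description once m is moderately large.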
Question 7 (35): These three questions all deal with the model of All-String Turing machines (or ASTM's) defined above.
- (a, 10) Explain, in terms of individual states and moves, how an ASTM can update the string on its Tape 2.
  The machine moves to the right marker of Tape 2. It then moves left until it finds a 0 or the left marker, changing any 1's it finds on the way to 0's. If it finds a 0, it changes it to a 1 and stops. If it finds the left marker, it moves one cell back to the right. It then goes right past any 0's to find the right marker, writes a 0 over it, goes right, and writes a new right marker (thus completing an update of x from 1^k to 0^{k+1}), and stops.
- (b, 15) Prove that if X is any language, X is Turing recognizable if and only if it is the language of some ASTM.
  If X is the language of some ASTM, it is clearly TR because the ASTM is itself a TM. (We could convert the three-tape ASTM to a one-tape TM, or just quote the known fact that languages of three-tape TM's are TR.)
  So the interesting part is to show that for every one-tape TM M, there is an ASTM with the same language. My intended proof, which no one got, was to cycle through all strings x on Tape 2, and for each one do the following: Copy x to Tape 3, then check whether x happens to be an accepting computation history of M on w (subject to the encoding of accepting computation histories as binary strings). We can do this check while obeying the ASTM restriction because we know from lecture and the text that this can be done with an LBA. If w is in L(M), we will find the correct x and accept, and if not we will search for it forever.
  Those of you who got this right used the following argument. Again cycle through all possible strings x. Skip past all the x's that are shorter than w. Then for each x at least as long as w, copy w onto Tape 3 and attempt to simulate the computation of M on w. If it accepts or rejects, do likewise. If it runs forever, so be it -- the simulation runs forever. If it hits the right marker of Tape 3, stop, update x, and start over by copying w to Tape 3 again. If w is in L(M), we will eventually get to an x that is long enough that we can complete the computation and accept. If not, we will either reject or go into a loop in one of the phases, or continue through all possible x forever.
- (c, 10) Prove that if X is any language, X is Turing decidable if and only if it is the language of some ASTM that halts on every possible input.
  The construction is essentially the same as for part (b), in either version. In my version, also check for an x that is a rejecting computation history and reject if you find it. Since M is a decider, our ASTM will eventually either accept or reject when it finds the correct history. In the other version, once x is long enough we will be able to complete either an accepting or a rejecting computation, since as M is a decider one or the other must exist.
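The Tape 2 update from part (a) of Question 7 can be mirrored step by step in a few lines. This is a sketch at the level of strings rather than states and head moves, and the function name is made up for illustration.

```python
def update_tape2(x):
    """Successor of x in the order: empty, 0, 1, 00, 01, 10, 11, 000, ..."""
    cells = list(x)
    i = len(cells) - 1
    # Move left from the right marker, turning 1's into 0's.
    while i >= 0 and cells[i] == "1":
        cells[i] = "0"
        i -= 1
    if i >= 0:
        cells[i] = "1"        # found a 0: change it to 1 and stop
    else:
        cells.append("0")     # hit the left marker: x was 1^k, now 0^(k+1)
    return "".join(cells)

x = ""
phases = []
for _ in range(8):
    x = update_tape2(x)
    phases.append(x)
print(phases)
```

Starting from the empty string, the first eight updates give 0, 1, 00, 01, 10, 11, 000, 001, matching the order required by the ASTM definition.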
Question 8 (30+10): These questions deal with the ALLEVEN and ALLODD languages defined above.
- (a, 15) Prove that the language ALLEVEN_TM is not Turing decidable. (There are several valid ways to do this.)
  Several valid ways indeed:
- (b, 15) Prove that ALLEVEN_TM ≤_m ALLODD_TM.
  Given a Turing machine M, we must build a Turing machine N so that M accepts all even-length strings if and only if N accepts all odd-length strings. Let N delete the first letter of its input and simulate M on the remaining letters. (We don't really care what N does on empty input, since the empty string has even length.) There is then an odd-length string not accepted by N if and only if there is an even-length string not accepted by M.
- (c, 10XC) Prove that the language ALLEVEN_CFG ∪ ALLODD_CFG is co-TR.
  The language in question here is the set of CFG's that either generate all even-length strings, or generate all odd-length strings, or both. The complement of this language is the set of CFG's that both fail to generate some even-length string and fail to generate some odd-length string. Since A_CFG is TD, we can build a TM that tests every string for membership in L(G), until or unless it finds both an even-length string and an odd-length string that are not in L(G). If this happens it accepts, and otherwise it searches forever. This TM's language is the complement of the given language, so the given language is co-TR.
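The search in part (c) of Question 8 can be sketched as follows. Several things here are assumptions made for illustration: the grammar is in Chomsky normal form and stored as a dict from nonterminals to lists of right-hand-side tuples, the function names are invented, and the optional `max_len` bound exists only so the sketch can be run to completion (the recognizer in the proof has no such bound and may search forever).

```python
from functools import lru_cache
from itertools import count, product

def generates(rules, start, w):
    """Membership test for a CNF grammar by memoized splitting of w."""
    @lru_cache(maxsize=None)
    def derives(sym, s):
        for body in rules.get(sym, []):
            if len(body) == 0 and s == "":
                return True
            if len(body) == 1 and s == body[0]:
                return True
            if len(body) == 2:
                for i in range(1, len(s)):
                    if derives(body[0], s[:i]) and derives(body[1], s[i:]):
                        return True
        return False
    return derives(start, w)

def misses_both_parities(rules, start, sigma, max_len=None):
    """Search strings in order of length for an even-length and an odd-length
    string outside L(G). Returns the two witnesses, or None if max_len is hit."""
    missing = {0: None, 1: None}
    lengths = count() if max_len is None else range(max_len + 1)
    for n in lengths:
        for tup in product(sigma, repeat=n):
            w = "".join(tup)
            if missing[n % 2] is None and not generates(rules, start, w):
                missing[n % 2] = w
            if missing[0] is not None and missing[1] is not None:
                return missing[0], missing[1]
    return None

# G1 generates 0^n for n >= 1: it misses the empty string and the string 1.
G1 = {"S": [("S", "S"), ("0",)]}
# G2 generates every odd-length string, so no odd-length witness exists.
G2 = {"S": [("0",), ("1",), ("S", "T")], "T": [("X", "X")], "X": [("0",), ("1",)]}
```

On G1 the search stops at once with the witnesses (empty string, 1). On G2 it never finds an odd-length witness, which is the case where the unbounded recognizer runs forever.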

Last modified 20 March 2016