Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 10 points Q7: 35 points Q8: 30+10 points Total: 125+10 points
Question text is in black, solutions in blue.
Correction in red added 20 March 2016.
The set N of natural numbers is {0, 1, 2, 3,...}, not quite as defined in Sipser.
If C is any class of computers, such as DFA's, CFG's, LBA's, TM's, strange variant TM's, etc.:
A language is Turing recognizable (TR) if it is equal to L(M) for some Turing machine M.
A language is Turing decidable (TD) if it is equal to L(M) for some Turing machine M that halts on every input.
A language X is co-TR if and only if its complement is TR.
A function f from strings to strings is Turing computable if there exists a TM M such that for any string w, M when started on w halts with f(w) on its tape.
Recall that if A and B are two languages, A is mapping reducible to B, written A ≤m B, if there exists a Turing computable function f from Σ* to Σ* such that for any string w, w ∈ A ⇔ f(w) ∈ B.
A Deterministic Infinite Automaton (DIA) has a state set Q = {qi: i∈ N} = {q0, q1, q2,...}, a start state q0, a nonempty alphabet Σ, a final state set F ⊆ Q, and a transition function δ from Q × Σ to Q. It computes like a DFA, beginning at the left end of its finite input string, moving right one letter and updating its state on every computation step, and finally accepting the input if and only if it finishes in a final state.
A Turing Computable DIA (TCDIA) is a DIA where the set F is Turing decidable and the function δ is Turing computable.
An All-String Turing Machine (ASTM) is a deterministic three-tape Turing machine with the following restrictions. The input alphabet Σ is {0, 1}, and the tape alphabet Γ contains Σ, a blank symbol, and perhaps other letters.
Tape 1 is read-only and gets the input string w on it, with a marker symbol at each end.
Tape 2, at any time during the computation, contains a string x ∈ Σ* between two marker symbols. At the beginning of the computation x is the empty string. The computation is divided into phases, during each of which Tape 2 remains unchanged. After the first phase x is changed to 0, after the second phase it is changed to 1, after the third to 00, and so on through all the possible strings of Σ* until or unless the machine accepts or rejects. No other computation takes place while Tape 2 is being updated at the end of each phase.
Tape 3 is read-write, and after every phase it is erased and reset to be blank, with two end markers that restrict the size of the useful portion of the tape to the current length of x.
As with the language of an ordinary TM, the language of an ASTM is the set of input strings w that cause it to eventually accept.
TRUE. This language is exactly ALLCFG, which was shown in the text and in lecture to not be TD.
FALSE. We show that ATM-bar ≤m ALLEVENCFG.
The construction involving accepting computation histories in the book and
lecture shows that ATM-bar ≤m ALLCFG. So all we need to do is to show ALLCFG ≤m
ALLEVENCFG. This means that given any grammar G, we must create
a grammar G' that generates all even-length strings if and only if G generates
all strings. One way to do this is to have G' generate all even-length strings
whose odd-numbered letters form a string in L(G), and whose even-numbered
letters are arbitrary. We can do this as follows. Assume G is in Chomsky
normal form. Form G' by replacing every rule of the
form A → a by the rules
A → A'X, A' → a, and X → b for every letter b in Σ.
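As a minimal sketch of this construction, the Python below represents a CNF grammar as a list of (head, body) pairs, where a one-element body is a terminal rule and a two-element body is a pair of variables. The names pad_grammar and cyk and the fresh variable X_any are illustrative assumptions, not from the text, and the sketch ignores the empty string.

```python
from itertools import product

def pad_grammar(rules, sigma):
    """Replace each terminal rule A -> a by A -> A'X, A' -> a, X -> b (all b)."""
    filler = "X_any"  # hypothetical fresh variable, assumed unused in `rules`
    new_rules = [(filler, (b,)) for b in sigma]
    for head, body in rules:
        if len(body) == 2:
            new_rules.append((head, body))  # binary rules are kept unchanged
        else:
            primed = head + "'"  # hypothetical fresh variable A'
            for rule in ((head, (primed, filler)), (primed, body)):
                if rule not in new_rules:
                    new_rules.append(rule)
    return new_rules

def cyk(rules, start, w):
    """Standard CYK membership test for a CNF grammar (nonempty w only)."""
    n = len(w)
    if n == 0:
        return False
    table = {}
    for i in range(n):
        table[(i, i + 1)] = {h for h, b in rules if b == (w[i],)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for h, b in rules:
                    if len(b) == 2 and b[0] in table[(i, k)] and b[1] in table[(k, j)]:
                        cell.add(h)
            table[(i, j)] = cell
    return start in table[(0, n)]

# G generates every nonempty string over {a, b}.
G = [("S", ("S", "S")), ("S", ("a",)), ("S", ("b",))]
G_padded = pad_grammar(G, "ab")

# H generates only the string "a".
H_padded = pad_grammar([("S", ("a",))], "ab")
```

Here G_padded should generate exactly the nonempty even-length strings, while H_padded generates only "aa" and "ab". Note that the padded grammar is again in Chomsky normal form, which is why the same CYK routine applies to it.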
An alternate solution is to show that ALLEVENCFG and
ALLODDCFG are Turing equivalent, using an argument similar to that
of Question 8 (b) on this exam. If we show both ALLEVENCFG
≤m ALLODDCFG and ALLODDCFG ≤m
ALLEVENCFG, we know that if one of these languages is TR, then so
is the other. If both are TR, their intersection is TR, and the proof that
ALLCFG is not TD (from text and lecture) shows that it is also not
TR.
TRUE. In fact any language X at all is the language of some DIA. To see this, note that we can design δ to take every possible string to a different state. For example, if Σ has k letters a_1,..., a_k, we can define δ(q_n, a_i) to be q_{(k+1)n + i}, so that δ*(q_0, w) is the state whose index is the number denoted by the sequence of subscripts of the letters of w in base k+1 notation. Once we have a state for each string, we just define F to be the set of states to which δ* takes the strings in X. Of course if X is not a TD language, F will not be TD either.
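The numbering of states can be checked with a short Python sketch. The function name state_index is an assumption made for illustration; it computes the index n of the state q_n reached from q_0 on input w, using the rule that reading the i-th letter sends index n to (k+1)n + i.

```python
from itertools import product

def state_index(w, sigma):
    """Index of the state reached from q_0 on w under n -> (k+1)*n + i."""
    k = len(sigma)
    n = 0
    for letter in w:
        i = sigma.index(letter) + 1  # letters are numbered 1..k
        n = (k + 1) * n + i
    return n

# All strings over {a, b} of length at most 4, and the states they reach.
strings = ["".join(p) for m in range(5) for p in product("ab", repeat=m)]
indices = [state_index(w, "ab") for w in strings]
```

Because no letter is numbered 0, distinct strings give distinct base-(k+1) numerals, so the indices computed above contain no repeats.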
TRUE. We can design a TM that stores the current state of the TCDIA on a tape. It then just scans the input, using the TM for δ to update the state after each letter and using the decider for F at the end to decide whether to accept. This is a finite number of δ computations followed by one F computation, and the hypotheses say that each of these will eventually halt.
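A minimal sketch of this decider, with ordinary Python functions standing in for the Turing machines that compute δ and decide F. The example TCDIA is an assumption chosen for illustration: states are indexed by natural numbers, reading "a" adds one to the index, reading "b" leaves it unchanged, and F is the set of perfect squares.

```python
from math import isqrt

def run_tcdia(delta, is_final, w):
    """Decide w: one delta call per letter, then one call to the F decider."""
    state = 0  # index of the start state q_0
    for letter in w:
        state = delta(state, letter)
    return is_final(state)

def count_as(state, letter):
    """Hypothetical computable delta: count the a's seen so far."""
    return state + 1 if letter == "a" else state

def is_square(state):
    """Hypothetical decidable F: states whose index is a perfect square."""
    return isqrt(state) ** 2 == state

def accepts(w):
    return run_tcdia(count_as, is_square, w)
```

The language here (strings whose number of a's is a perfect square) is not regular, but the loop still halts on every input because it makes exactly |w| calls to delta and one call to is_final.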
FALSE. Since the set of DIA languages is the set of all languages, it is
in fact uncountable. But the set of TCDIA languages is countable, because
each such language can be specified by a finite string, giving the descriptions
of the Turing machines for that TCDIA's version of δ and F.
It also follows, from the fact that each TCDIA language is TD, that there
are countably many, since we observed that there are only countably many TD
languages.
FALSE. The fact that a briefly-described machine could possibly
output a very long
string should be pretty obvious, though it's not as obvious to prove.
By the Recursion Theorem, since we can make a TM that inputs a string w and
prints out ww, there must be some TM R that prints out (R)(R) (two
copies of R's description), a string that
is clearly longer than R.
A more concrete proof defines a family of machines {Mw} for
each binary string w, where Mw ignores its input and prints out
n ones, where n is the number denoted by w in binary notation. We could
build such a machine with description length O(|w|), by having it write
out w on its tape and then run some code that is constant for all w. But
the length of the output (if, say, w is a string of ones) is about
2^|w|, which is larger than the machine description length for
long enough w.
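The same effect can be seen with Python programs standing in for Turing machine descriptions, which is an analogy assumed here rather than the construction itself. The hypothetical function description builds a program whose text has length |w| plus a constant, and the program's output has length equal to the number denoted by w in binary.

```python
def description(w):
    """Text of a program that prints n ones, where n is w read in binary."""
    return "'1' * int('" + w + "', 2)"

def run(program):
    """Evaluate the program text and return its output string."""
    return eval(program)

short_w = "1" * 4
long_w = "1" * 10
short_out = run(description(short_w))
long_out = run(description(long_w))
```

For the four-letter w the output is still shorter than the program text, but for the ten-letter w the output is far longer, matching the claim that the bound fails for long enough w.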
The machine moves to the right marker of Tape 2. It then moves left until it finds a 0 or the left marker, changing any 1's it finds on the way to 0's. If it finds a 0, it changes it to a 1 and stops. If it finds the left marker, it moves back one cell to the right. It then goes right past any 0's to find the right marker, writes a 0 over it, goes right, and writes a new right marker (thus completing an update of x from 1^k to 0^(k+1)), and stops.
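The update can be sketched in Python on the contents of Tape 2 between the markers. The function name successor is an assumption; the list cells plays the role of the tape cells holding x.

```python
def successor(x):
    """Next string after x in the order: empty, 0, 1, 00, 01, 10, 11, 000, ..."""
    cells = list(x)
    i = len(cells) - 1
    # Move left from the right end, turning 1's into 0's.
    while i >= 0 and cells[i] == "1":
        cells[i] = "0"
        i -= 1
    if i >= 0:
        cells[i] = "1"      # found a 0: change it to 1 and stop
    else:
        cells.append("0")   # hit the left marker: x was 1^k, so extend to 0^(k+1)
    return "".join(cells)

# The first few values of x, starting from the empty string.
sequence = [""]
for _ in range(7):
    sequence.append(successor(sequence[-1]))
```

Starting from the empty string, repeated application visits every string of Σ* exactly once, in order of length and then in binary order within each length.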
If X is the language of some ASTM, it is clearly TR because the ASTM is
itself a TM. (We could convert the three-tape ASTM to a one-tape TM, or
just quote the known fact that languages of three-tape TM's are TR.)
So the interesting part is to show that for every one-tape TM M, there is
an ASTM with the same language. My intended proof, which no one got, was
to cycle through all strings x on Tape 2, and for each one do the following:
Copy x to Tape 3, then check whether x happens to be an accepting computation
of M on w (subject to the encoding of the accepting computations as binary
strings). We can do this check while obeying the ASTM restriction because
we know from lecture and the text that this can be done with an LBA. If
w is in L(M), we will find the correct x and accept, and if not we will search
for it forever.
Those of you who got this right used the following argument. Again
cycle through all possible strings x. Skip past all the x's that are shorter
than w. Then for each x longer than w, copy w onto Tape 3 and attempt to
simulate the computation of M on w. If it accepts or rejects, do likewise.
If it runs forever, so be it -- the simulation runs forever. If it hits the
right marker of Tape 3, stop, update x, and start over by copying w to Tape 3
again. If w is in L(M), we will eventually get to an x that is long enough
that we can complete the computation and accept. If not, we will either
reject or go
into a loop in one of the phases, or continue through all possible x forever.
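The second argument can be sketched in Python as a bounded simulation that is restarted with more space. This is a simplification under stated assumptions: the sketch steps through tape bounds directly, whereas the ASTM would repeat the same simulation for every x of a given length, and the example machine and all names are hypothetical.

```python
BLANK = "_"

def run_bounded(delta, start, accept, reject, w, bound):
    """Simulate a one-tape TM on w using at most `bound` cells.

    Returns "accept", "reject", or "overflow" if the head reaches the
    right marker. Like the ASTM, it loops forever if M loops in place.
    """
    tape = list(w) + [BLANK] * (bound - len(w))
    state, head = start, 0
    while state not in (accept, reject):
        if head >= bound:
            return "overflow"
        state, tape[head], move = delta[(state, tape[head])]
        head = max(head + (1 if move == "R" else -1), 0)
    return "accept" if state == accept else "reject"

def run_with_growing_bound(delta, start, accept, reject, w):
    """Retry with one more cell each time the simulation overflows."""
    bound = len(w) + 1
    while True:
        result = run_bounded(delta, start, accept, reject, w, bound)
        if result != "overflow":
            return result
        bound += 1

# Hypothetical M: accepts strings with an even number of 1's, but walks one
# cell past the first blank before accepting, so it needs |w| + 2 cells.
PARITY = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", BLANK): ("pad", BLANK, "R"),
    ("odd", BLANK): ("no", BLANK, "R"),
    ("pad", BLANK): ("yes", BLANK, "R"),
}

def parity_result(w):
    return run_with_growing_bound(PARITY, "even", "yes", "no", w)
```

On input 11 the first attempt, with three cells, overflows, and the second attempt, with four cells, completes and accepts.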
The construction is essentially the same as for part (b), in either version. In my version, also check for an x that is a rejecting computation history and reject if you find it. Since M is a decider, our ASTM will eventually either accept or reject when it finds the correct history. In the other version, once x is long enough we will be able to complete either an accepting or a rejecting computation, since as M is a decider one or the other must exist.
Several valid ways indeed:
Given a Turing machine M, we must build a Turing machine N so that M accepts all even-length strings if and only if N accepts all odd-length strings. Let N delete the first letter of its input and simulate M on the remaining letters. (We don't really care what N does on empty input, since the empty string has even length.) There is then an even-length string not accepted by M if and only if there is an odd-length string not accepted by N.
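A small Python sketch of the reduction, with a predicate standing in for the machine M. The names make_n, m_all_even, and m_missing_ab are assumptions for illustration, and rejecting the empty input is an arbitrary choice.

```python
def make_n(m):
    """Build N from M: drop the first letter, then run M on the rest."""
    def n(w):
        if w == "":
            return False  # arbitrary: the empty string is not odd-length
        return m(w[1:])
    return n

def m_all_even(w):
    """Hypothetical M that accepts every even-length string."""
    return len(w) % 2 == 0

def m_missing_ab(w):
    """Hypothetical M that accepts every even-length string except ab."""
    return len(w) % 2 == 0 and w != "ab"

n_good = make_n(m_all_even)
n_bad = make_n(m_missing_ab)
```

Here n_good accepts every odd-length string, while n_bad rejects the odd-length strings aab and bab, each of which becomes ab once its first letter is deleted.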
The language in question here is the set of CFG's that either generate all even-length strings, or generate all odd-length strings, or both. The complement of this language is the set of CFG's that both fail to generate some even-length string and fail to generate some odd-length string. Since ACFG is TD, we can build a TM that tests every string for membership in L(G), until or unless it finds both an even-length string and an odd-length string that are not in L(G). If this happens it accepts, and otherwise it searches forever. This TM's language is the complement of the given language, so the given language is co-TR.
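The search can be sketched in Python with a membership predicate standing in for the decider that tests whether a string is in L(G). The name find_witnesses and the optional max_len cutoff are assumptions; the cutoff exists only so the demonstration terminates, and the recognizer described above has no such bound.

```python
from itertools import count, product

def find_witnesses(member, sigma, max_len=None):
    """Search for an even-length and an odd-length string outside the language.

    Returns (even_witness, odd_witness) once both are found. With
    max_len=None the search runs forever when no such pair exists.
    """
    missing_even = None
    missing_odd = None
    lengths = count(0) if max_len is None else range(max_len + 1)
    for n in lengths:
        for letters in product(sigma, repeat=n):
            w = "".join(letters)
            if not member(w):
                if n % 2 == 0 and missing_even is None:
                    missing_even = w
                if n % 2 == 1 and missing_odd is None:
                    missing_odd = w
            if missing_even is not None and missing_odd is not None:
                return missing_even, missing_odd
    return None

def lacks_two(w):
    """Hypothetical language missing exactly the strings ab and b."""
    return w not in ("ab", "b")

def all_even(w):
    """Hypothetical language of all even-length strings."""
    return len(w) % 2 == 0
```

For lacks_two the search halts with a witness of each parity, so the corresponding grammar is in the complement. For all_even no even-length witness exists, so the unbounded search would never halt.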
Last modified 20 March 2016