Q1: 10 points, Q2: 10 points, Q3: 10 points, Q4: 10 points, Q5: 10 points, Q6: 10 points, Q7: 40 points, Q8: 25 points. Total: 125 points.
Question text is in black, solutions in blue.
Some questions refer to two particular kinds of one-tape deterministic Turing machines. A Double-move Turing machine (DMTM) always moves exactly two squares left or right on a move. A Double-right Turing machine (DRTM) moves two squares when it moves right, but one square when it moves left. Recall that our Turing machines start on the leftmost square of the tape, on the first letter of the input (if any), and do not move if they are ordered to move left from the leftmost square. We define ADMTM to be the set of pairs (M, w) where M is a DMTM, w is a string over the correct alphabet, and w ∈ L(M). We define ADRTM similarly.
The language REGCFG is the set of context-free grammars G such that L(G) is regular.
The language REGTM is the set of Turing machines M such that L(M) is regular.
The language CFLDFA is the set of DFA's M such that L(M) is a context-free language.
Recall that for two languages A and B, A ≤m B means that there is a function f, computed by a Turing machine that always halts, such that for any string w, w ∈ A if and only if f(w) ∈ B.
FALSE. Consider the language (10)*. As the DMTM moves, it will only ever see the odd-numbered positions of the tape and thus never change the even positions. Any decision it makes on reading 11 will also be made on 10, but one of these two decisions must be wrong because 11 is not in the language and 10 is.
TRUE. We want f to be a function from grammars to TM's, so that f(G) = M and L(G) is regular if and only if L(M) is. The easiest way to do this is to have L(G) = L(M). Given G, f constructs a TM that implements a CKY parser for G, or otherwise solves the decision problem for G, by accepting its input w if and only if w ∈ L(G).
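To make the parser step concrete, here is a minimal sketch of the CKY table-filling algorithm that the machine f(G) could run. It assumes G has already been converted to Chomsky normal form. The grammar hard-coded here (for the language {0^n1^n: n ≥ 1}) and all class, method, and field names are hypothetical choices for illustration only.

```java
final class CykSketch {
    // Hypothetical CNF grammar for {0^n 1^n : n >= 1}, chosen only for illustration:
    //   S -> A B | A C,   C -> S B,   A -> 0,   B -> 1
    static final int S = 0, A = 1, B = 2, C = 3, VARS = 4;
    static final int[][] BINARY = { {S, A, B}, {S, A, C}, {C, S, B} }; // {head, left, right}
    static final int[] UNARY_HEAD = { A, B };
    static final char[] UNARY_BODY = { '0', '1' };

    // Returns true iff S derives w. The empty string is rejected because
    // this particular grammar has no rule S -> epsilon.
    static boolean accepts(String w) {
        int n = w.length();
        if (n == 0) return false;
        // derives[i][len][X] is true iff variable X derives w.substring(i, i + len)
        boolean[][][] derives = new boolean[n][n + 1][VARS];
        for (int i = 0; i < n; i++)
            for (int r = 0; r < UNARY_HEAD.length; r++)
                if (w.charAt(i) == UNARY_BODY[r]) derives[i][1][UNARY_HEAD[r]] = true;
        // Longer substrings are built from two shorter pieces via a rule X -> Y Z.
        for (int len = 2; len <= n; len++)
            for (int i = 0; i + len <= n; i++)
                for (int split = 1; split < len; split++)
                    for (int[] rule : BINARY)
                        if (derives[i][split][rule[1]] && derives[i + split][len - split][rule[2]])
                            derives[i][len][rule[0]] = true;
        return derives[0][n][S];
    }

    public static void main(String[] args) {
        for (String w : new String[] { "01", "0011", "001", "10" })
            System.out.println(w + " -> " + accepts(w));
    }
}
```

The procedure always halts, which is all the reduction needs: the machine built by f decides membership in L(G), so its language is exactly L(G).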
FALSE. If f were such a function, it would take pairs (M, w) where M
does not accept w to grammars with non-regular languages. But the
given construction takes such pairs to Σ*, which is
regular.
It's a more interesting question whether this f is a mapping
reduction from ATM to the complement of
REGCFG. For this to be true, the language of the grammar
f(M, w) would have to be non-regular whenever M does accept w. But
the language of this grammar is the complement of the set of
accepting computation histories of M on w. In one manner of
speaking, there is only one of these, but we probably want to
consider a string to be an accepting computation history no matter
how many blanks it has at the end of each of its configurations. The set of
such strings, however, is still regular, so its complement is regular
and f is not a reduction to the complement of REGCFG -- it
always produces grammars with regular languages.
TRUE. Since every regular language is a CFL, every DFA has a language that is a CFL, and our decision procedure should return "true" for every valid DFA. It only has to tell whether its input represents a valid DFA, which an always-halting Turing machine clearly can do.
FALSE. One argument for this uses our discussion of the Recursion
Theorem. In lecture we proved that there exists a Turing machine SELF
that outputs its own description. By the same argument, one could
build a Java program that outputs its own description, and hence would
not be longer than its output.
But in any case it is easy to construct a Java program that gives an output longer
than its own source code:
public class LongOutput {
    public static void main(String[] args) {
        for (int i = 0; i < 1000000; i++)
            System.out.println("a"); // one million lines of output in total
    }
}
TRUE. In lecture we proved that the complement of ATM
mapping
reduces to REGTM. Our function takes a Turing machine M
and a string w and produces a new Turing machine N as follows. On
input x, N first runs M on w and, if M accepts w, then accepts x
exactly when x is of the form 0^n1^n. L(N) is the regular language ∅
if M does not accept w, and is the non-regular language
{0^n1^n: n ≥ 0} otherwise.
If REGTM ≤m ATM were true,
REGTM would be TR, but the complement of ATM is
not TR, and thus any language it reduces to is not TR.
M' will have the same alphabet as M and have copies of all of M's states. Any transition of M that moves left is replicated exactly in M'. If (q,a;r,b,R) is any right-moving transition in M, we add a new state s to M' (a different new state for every such transition), replace (q,a;r,b,R) with (q,a;s,b,RR) and then add a transition (s,x;r,x,L) for every letter x in the alphabet. Thus any computation of M is replicated exactly by M' -- every right move in M is matched by a double-right move followed by a left move. Clearly any string w will now be accepted by M' if and only if it is accepted by M, so L(M) = L(M').
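The construction can be sketched as a transformation on transition tables, together with a small step-bounded simulator to compare M and M' on a few inputs. Everything named here is an assumption made for illustration: the Rule class, the move codes "L", "R" and "RR", the blank symbol '_', the state names q0 and acc, and the example machine (which accepts binary strings ending in 1). The "alphabet" is taken to be the tape alphabet including the blank, since M' may land on a blank square after a double-right move.

```java
import java.util.ArrayList;
import java.util.List;

final class DoubleRightSketch {
    static final char BLANK = '_';

    static final class Rule {            // (state, read; next, write, move)
        final String state, next, move;  // move is "L", "R" or "RR"
        final char read, write;
        Rule(String state, char read, String next, char write, String move) {
            this.state = state; this.read = read; this.next = next;
            this.write = write; this.move = move;
        }
    }

    // Builds the rules of M' from those of an ordinary TM M.
    // Assumes no state of M is already named s0, s1, ...
    static List<Rule> toDoubleRight(List<Rule> m, char[] tapeAlphabet) {
        List<Rule> out = new ArrayList<>();
        int fresh = 0;
        for (Rule t : m) {
            if (t.move.equals("L")) { out.add(t); continue; }  // left moves are copied
            String s = "s" + (fresh++);                         // new state for this rule
            out.add(new Rule(t.state, t.read, s, t.write, "RR"));
            for (char x : tapeAlphabet) out.add(new Rule(s, x, t.next, x, "L"));
        }
        return out;
    }

    // Runs from state q0 for at most maxSteps steps; true iff state acc is reached.
    static boolean accepts(List<Rule> rules, String input, int maxSteps) {
        StringBuilder tape = new StringBuilder(input);
        String state = "q0";
        int head = 0;
        for (int step = 0; step < maxSteps; step++) {
            if (state.equals("acc")) return true;
            while (tape.length() <= head) tape.append(BLANK);
            char c = tape.charAt(head);
            Rule hit = null;
            for (Rule t : rules)
                if (t.state.equals(state) && t.read == c) { hit = t; break; }
            if (hit == null) return false;   // no applicable rule: halt and reject
            tape.setCharAt(head, hit.write);
            state = hit.next;
            if (hit.move.equals("L")) head = Math.max(0, head - 1); // stay put at left end
            else head += hit.move.equals("RR") ? 2 : 1;
        }
        return state.equals("acc");
    }

    // Hypothetical example: scan right to the first blank, step left, accept on 1.
    static List<Rule> example() {
        List<Rule> m = new ArrayList<>();
        m.add(new Rule("q0", '0', "q0", '0', "R"));
        m.add(new Rule("q0", '1', "q0", '1', "R"));
        m.add(new Rule("q0", BLANK, "q1", BLANK, "L"));
        m.add(new Rule("q1", '1', "acc", '1', "R"));
        return m;
    }

    public static void main(String[] args) {
        List<Rule> m = example();
        List<Rule> mPrime = toDoubleRight(m, new char[] { '0', '1', BLANK });
        for (String w : new String[] { "", "0", "1", "01", "10", "0011" })
            System.out.println("\"" + w + "\": M=" + accepts(m, w, 1000)
                    + " M'=" + accepts(mPrime, w, 1000));
    }
}
```

The step bound only keeps the simulator from looping forever; agreement on a handful of inputs is a sanity check, not a proof that L(M) = L(M').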
To reduce ATM to ADRTM, we need a function that
takes a pair (M, w), where M is an ordinary TM, and produces a
pair (M', w), where M' is the machine built from M in part (a) and
w is the same string in the input and output.
To reduce ADRTM to ATM, we need to take a
pair (M, w), where now M is an arbitrary DRTM, and produce a pair
(M', w) where M' is now an ordinary TM such that L(M) = L(M'). To do
this we again have M' have the same alphabet as M, have copies of all
the states of M, and copy all left-moving transitions from M into
M'. We again create a new state s in M' for every right-moving
transition (q,a;r,b,RR) in M, and give M' the new transition
(q,a;s,b,R) and a new transition (s,x;r,x,R) for every letter x in
the alphabet. Now any computation of M is replicated exactly by M',
taking two single-right moves to mimic any double-right move of M.
Clearly a string w is accepted by M' if and only if it is accepted by
M, so our mapping reduction from (M, w) to (M', w) is correct.
Let X be the set of all strings w =
w_1w_2...w_n such that the string odd(w) =
w_1w_3...w_k, where k = n if n is odd and
k = n - 1 if n is even, is in L(M). Note that a string is in X or not
entirely independently of what letters occur in its even-numbered
positions.
X is the language of a DMTM M' that has the same states as M, but
always moves two spaces left when M moves one space left, and moves
two spaces right when M moves one space right. If w is any string at
all, the computation of M' on w follows the computation of M on odd(w),
and will thus accept if and only if odd(w) is in L(M), which is the
definition of when w is in X.
We let the function f take an arbitrary string v =
v_1...v_n to the string
v_1v_1v_2v_2...v_nv_n.
Note that then odd(f(v)) = v. Clearly v is in L(M) if and only if
f(v) is in X, so L(M) ≤m X.
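The two string maps used in this argument are easy to write down directly. The sketch below is only an illustration; the class and method names are hypothetical, and strings are indexed from 0 in the code, so the odd-numbered positions of the text are the even indices.

```java
final class OddDoubling {
    // odd(w): keep the letters in positions 1, 3, 5, ... (1-indexed),
    // which are indices 0, 2, 4, ... in Java.
    static String odd(String w) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < w.length(); i += 2) sb.append(w.charAt(i));
        return sb.toString();
    }

    // f(v): write every letter of v twice.
    static String f(String v) {
        StringBuilder sb = new StringBuilder();
        for (char c : v.toCharArray()) sb.append(c).append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(f("abc"));        // aabbcc
        System.out.println(odd(f("abc")));   // abc
        System.out.println(odd("abcd"));     // ac
    }
}
```

Because odd(f(v)) = v for every v, asking whether f(v) is in X is the same as asking whether v is in L(M).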
No one gave me a good definition of X, so no one got full credit
for this problem. Some people defined X as, for example, the set of
strings w_1$w_2$...w_n$ such that
w_1w_2...w_n is in L(M). This language
satisfies L(M) ≤m X, but it is not the language of any
DMTM because a DMTM cannot tell whether the even-numbered squares of
the input contain $'s.
We want a function g such that g(M, w) = (M', w') where M is an arbitrary ordinary TM, w is a string over the input alphabet of M, M' is a DMTM, and w' is another string, and M accepts w if and only if M' accepts w'. Given any M, let M' be the DMTM constructed in part (c) and let w' = f(w), where f is the function of part (c). Now the computation of M' on w' will mimic the computation of M on w step for step, and either both will accept or both will not.
We say that A "CFL-pumps at the beginning" if there exists p such that for every string w with w ∈ A and |w| ≥ p, w can be written as uvxyz where |uvxy| ≤ p, |vy| > 0, and for all i ≥ 0, uv^ixy^iz is in A.
All we need here is the proof of the RLPL as given in Sipser, as it says that any regular language "regular-pumps at the beginning". Let X be any regular language, let M be a DFA such that L(M) = X, and let p be the number of states of M. Let w be any string in X with |w| ≥ p. As M reads the first p letters of w, it takes on p+1 states -- the state after reading 0 letters, after reading 1 letter, after 2 letters, and so forth until the state after reading p letters. Since there are only p states in M, some state must occur twice in this sequence. We let x be the string read up to the first visit to this state, xy be the string read up to the second visit, and z be the part of w read after the second visit. Then the conclusions of the RLPL are satisfied, and |xy| ≤ p because the revisit must occur during the first p letters of w.
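The pigeonhole step of this proof can be carried out mechanically. The following sketch finds x, y, and z for a given string by recording the first time each state is visited. The DFA used (for the language (10)*, with three states, so p = 3) and all names are hypothetical choices for illustration.

```java
final class RegularPumpSketch {
    // Hypothetical DFA for (10)* over {0,1}: state 0 is the start state and the
    // only accepting state, state 2 is a dead state. DELTA[state][letter - '0'].
    static final int[][] DELTA = { {2, 1}, {0, 2}, {2, 2} };
    static final boolean[] ACCEPTING = { true, false, false };

    static boolean accepts(String w) {
        int state = 0;
        for (char c : w.toCharArray()) state = DELTA[state][c - '0'];
        return ACCEPTING[state];
    }

    // Splits w into {x, y, z} with |xy| <= p and |y| > 0, where p = number of states.
    static String[] split(String w) {
        int p = DELTA.length;
        if (w.length() < p) throw new IllegalArgumentException("need |w| >= p");
        int[] firstSeen = new int[p];          // position at which each state first occurs
        java.util.Arrays.fill(firstSeen, -1);
        int state = 0;
        firstSeen[0] = 0;                      // the start state occurs after 0 letters
        for (int i = 1; i <= p; i++) {
            state = DELTA[state][w.charAt(i - 1) - '0'];
            if (firstSeen[state] >= 0) {       // second visit to this state
                int a = firstSeen[state];
                return new String[] { w.substring(0, a), w.substring(a, i), w.substring(i) };
            }
            firstSeen[state] = i;
        }
        throw new IllegalStateException("unreachable: p + 1 visits to p states");
    }

    // Builds x y^i z.
    static String pump(String[] xyz, int i) {
        StringBuilder sb = new StringBuilder(xyz[0]);
        for (int k = 0; k < i; k++) sb.append(xyz[1]);
        return sb.append(xyz[2]).toString();
    }

    public static void main(String[] args) {
        String[] xyz = split("101010");
        System.out.println("x=\"" + xyz[0] + "\" y=\"" + xyz[1] + "\" z=\"" + xyz[2] + "\"");
        for (int i = 0; i <= 3; i++)
            System.out.println(pump(xyz, i) + " -> " + accepts(pump(xyz, i)));
    }
}
```

The loop never looks past the first p letters of w, which is exactly why the bound |xy| ≤ p holds.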
Let G be the grammar S → 0S1, S → ε, so that L(G) is
the language {0^n1^n: n ≥ 0}. Given any
number p, let w be the string 0^p1^p. If w is
written uvxyz with |uvxy| ≤ p, then the strings v and y consist
entirely of 0's. If we pump up or down, then, we change the number
of 0's in the string without changing the number of 1's, and take the
string out of L(G). Thus this language does not "CFL-pump at the
beginning", no matter which p we choose. The CFL Pumping Lemma says
that we can pick u, v, x, y, and z so that |vxy| ≤ p, not so that
|uvxy| ≤ p.
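For small p, the claim can be checked by brute force over every way of writing w = uvxyz with |uvxy| ≤ p. The sketch below is an illustration with hypothetical names. It only tests the exponents i = 0, ..., maxI, so a result of false is conclusive (no split survives even those exponents), while a result of true would not by itself prove that a split pumps for all i.

```java
final class PumpAtBeginning {
    // Membership test for {0^n 1^n : n >= 0}.
    static boolean inLanguage(String s) {
        int n = s.length();
        if (n % 2 != 0) return false;
        for (int i = 0; i < n; i++)
            if (s.charAt(i) != (i < n / 2 ? '0' : '1')) return false;
        return true;
    }

    // Builds u v^i x y^i z where u = w[0,a), v = w[a,b), x = w[b,c), y = w[c,d), z = w[d,..).
    static String pumped(String w, int a, int b, int c, int d, int i) {
        StringBuilder sb = new StringBuilder(w.substring(0, a));
        for (int k = 0; k < i; k++) sb.append(w, a, b);
        sb.append(w, b, c);
        for (int k = 0; k < i; k++) sb.append(w, c, d);
        return sb.append(w.substring(d)).toString();
    }

    // True iff some split with |uvxy| <= p and |vy| > 0 stays in the language
    // for every exponent i in 0..maxI.
    static boolean someSplitPumps(String w, int p, int maxI) {
        int limit = Math.min(p, w.length());
        for (int a = 0; a <= limit; a++)
            for (int b = a; b <= limit; b++)
                for (int c = b; c <= limit; c++)
                    for (int d = c; d <= limit; d++) {
                        if ((b - a) + (d - c) == 0) continue;   // need |vy| > 0
                        boolean ok = true;
                        for (int i = 0; i <= maxI && ok; i++)
                            ok = inLanguage(pumped(w, a, b, c, d, i));
                        if (ok) return true;
                    }
        return false;
    }

    // The witness string 0^p 1^p.
    static String zerosOnes(int p) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < p; k++) sb.append('0');
        for (int k = 0; k < p; k++) sb.append('1');
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int p = 1; p <= 5; p++)
            System.out.println("p=" + p + ": " + someSplitPumps(zerosOnes(p), p, 2));
    }
}
```

If the window is allowed to cover the whole string, as in someSplitPumps("0011", 4, 3), a split such as u = 0, v = 0, x = ε, y = 1, z = 1 survives every tested exponent. This is the difference between requiring |uvxy| ≤ p and requiring only |vxy| ≤ p.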
Only one person got this right, which surprised me. Two people
thought that a finite language would not CFL-pump at the
beginning. But the condition only says that all sufficiently
long strings in the language must pump. If the language is
finite, we can choose p so that there are no strings in the
language of length p or greater, so the condition is vacuously satisfied.
Last modified 2 April 2011