Question text is in black, solutions in blue.
Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 25 points Q7: 35 points Q8: 10+10 points Total: 120+10 points
If C is any class of computers, such as DFA's, CFG's, TM's, strange variant TM's, etc.:
In particular, if C = P, then M must have a clock restricting it to some polynomial time bound, and if C = L, then M must have a marker restricting it to c log n worktape space for some constant c.
A language is Turing recognizable (TR) if it is equal to L(M) for some Turing machine M.
A language is Turing decidable (TD) if it is equal to L(M) for some Turing machine M that halts on every input.
A language is co-TR if and only if its complement is TR.
A function f from strings to strings is Turing computable if there exists a Turing machine M such that for any string w, M when started on w halts with f(w) on its tape.
Recall that if A and B are two languages, A is mapping reducible to B, written A ≤m B, if there exists a Turing computable function f: Σ* → Σ* such that for any string w, w ∈ A ↔ f(w) ∈ B. If such an f exists that is computable in polynomial time, we say that A is poly-time reducible to B, written A ≤p B. If f is computable in log space, we say that A is log-space reducible to B, written A ≤L B.
The following languages are proved to be NP-complete in the text or in Exercises, and you may assume without proof that each of them is NP-complete.
Recall that a quantified boolean formula is a statement of the form ∃x_1: ∀x_2: ∃x_3: ... ∀x_n: φ(x_1, ..., x_n) where each x_i is a boolean variable and φ is a boolean formula in conjunctive normal form. It was proved in the text and in lecture that the language TQBF of true quantified boolean formulas is PSPACE-complete, and you may assume this fact without proof.
A DogShow object consists of a set D of dogs, a set E of events, and a relation C ⊆ D × E such that C(d, e) means "dog d competes in event e".
A Schedule object for a DogShow (D, E, C) is a positive integer t and a function S from E to {1,..., t}. (Assume that t ≤ |E|.) A Schedule is valid if there do not exist a dog d and two distinct events e and e' such that C(d, e), C(d, e'), and S(e) = S(e') are all true. (Thus a schedule is valid if no dog competes in two different events that are scheduled at the same time.)
The language DS-SCHED-OK is the set {(D, E, C, S): S is a valid schedule for (D, E, C)}.
The language DS-POSSIBLE is the set {(D, E, C, t): ∃S: (D, E, C, S) ∈ DS-SCHED-OK and S has parameter t}.
Let M be any machine that takes an input string in {0, 1}*. We define a number of games for M. The solitaire word game for M has White write down a string w and win the game if and only if w ∈ L(M). The bounded solitaire word game for M and n further requires that |w| = n. The bounded alternating word game for M and n is similar, except that now w is formed alternately by White and Black naming letters in {0, 1} until n letters have been named. White still wins if and only if w is in L(M).
We define a number of languages based on these games. If C is any class of computers, SWG_C is the set of computers M in C such that White wins the solitaire word game on M. Similarly BSWG_C is the set {(M, 1^n): M is a computer in C and White wins the bounded solitaire word game for M and n}, and BAWG_C is the similar language for the bounded alternating word game. (We make the second input component 1^n rather than the binary for n so that the input size will be O(n).)
We define two particular context-free languages called the Dyck languages, which are essentially strings of balanced parentheses. The language D1 has the context-free grammar with rules S → aSb, S → SS, and S → ε. The language D2 has these three rules plus the additional rule S → cSd.
FALSE. Note that the string a^i b^j is in D1 if and only if i = j. So the set of strings {a^i : i ≥ 0} is a pairwise D1-distinguishable set, because the string b^i distinguishes a^i from any other a^j. This language is also easily proved to be non-regular by the Regular Language Pumping Lemma -- given any p, take w to be the string a^p b^p, and note that pumping down removes one or more a's and yields a string not in D1.
FALSE. Both languages are in L, though only one of you had a correct proof
of the latter. To test a string for membership in D1, you can run
the standard test for balanced parentheses, counting the number of a's seen
minus the number of b's seen. The string is in D1 if and only if
this count is never negative after any prefix, and finishes at 0.
This algorithm puts D1 in L because we can implement it by
keeping just one number, which fits in O(log n) bits because it cannot exceed
the input length n.
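The counting test for D1 can be sketched as follows. This is only an illustration in Python, where the function name `in_d1` is our own and the single integer `depth` plays the role of the O(log n)-bit counter.

```python
def in_d1(w: str) -> bool:
    """Membership test for D1 using one counter."""
    depth = 0  # number of a's seen so far minus number of b's seen so far
    for ch in w:
        if ch == "a":
            depth += 1
        elif ch == "b":
            depth -= 1
        else:
            return False  # D1 is a language over {a, b} only
        if depth < 0:
            return False  # some prefix has more b's than a's
    return depth == 0  # the count must finish at 0
```

For example, `in_d1("aabb")` and `in_d1("abab")` are true, while `in_d1("ba")` fails because the count goes negative after the first letter.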
The most obvious algorithm to test for membership in D2 is to
keep a stack, pushing a's and c's, popping an a for each b and a c for each d,
rejecting if there is a mismatch, and accepting if the end is reached with an
empty stack. This algorithm uses O(n) space, and so does not put D2
into L, but it does not preclude the existence of another algorithm that does
put it into L.
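A minimal sketch of this stack algorithm, assuming the input is given as a Python string (the name `in_d2_stack` is ours). The list `stack` can grow to length n, which is why this version does not witness membership in L.

```python
def in_d2_stack(w: str) -> bool:
    """Membership test for D2 using an explicit stack (O(n) space)."""
    opener_for = {"b": "a", "d": "c"}  # each closing letter and its partner
    stack = []
    for ch in w:
        if ch in ("a", "c"):
            stack.append(ch)  # push every opening letter
        elif ch in opener_for:
            # pop and compare; an empty stack or a mismatch means rejection
            if not stack or stack.pop() != opener_for[ch]:
                return False
        else:
            return False  # letter outside {a, b, c, d}
    return not stack  # accept only if the stack is empty at the end
```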
Many people gave the following incorrect algorithm -- keep two counters,
one for the number of a's seen minus the number of b's, and the other for
the number of c's minus the number of d's, and accept if neither counter goes
negative and the end is reached with both counters at 0. This algorithm
accepts all strings in D2, but also strings not in D2
such as acbd.
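To see the failure concretely, here is a sketch of the two-counter test (the name `two_counter_test` is ours). It is deliberately the incorrect algorithm, written out so that the counterexample can be checked.

```python
def two_counter_test(w: str) -> bool:
    """The INCORRECT test for D2: two independent counters."""
    ab = 0  # a's seen minus b's seen
    cd = 0  # c's seen minus d's seen
    for ch in w:
        if ch == "a":
            ab += 1
        elif ch == "b":
            ab -= 1
        elif ch == "c":
            cd += 1
        elif ch == "d":
            cd -= 1
        else:
            return False
        if ab < 0 or cd < 0:
            return False
    return ab == 0 and cd == 0
```

On the input acbd the counters take the values (1,0), (1,1), (0,1), (0,0), so the test accepts even though the b is trying to close a c.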
But it turns out we can essentially simulate the stack algorithm in O(log n)
space. It's easy to compute the size of the stack at each point -- it
is just the number of a's and c's seen so far minus the number of b's and d's.
What we need to confirm is that every time that algorithm sees a b or a d, the
top letter on the stack is the matching a or c. If the current stack size is
k, we just need to find the last letter that changed the stack size from k-1
to k. This can clearly be done by keeping a few counters with O(log n) bits
each.
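One way to realize this idea is sketched below, under the assumption that we may rescan the input as often as we like. For each closing letter we walk backwards with a counter until we reach the opening letter that would be on top of the stack. The names are ours, and Python is used only for illustration: the point is that the working variables are a fixed number of positions and counters, each at most n.

```python
def in_d2_logspace(w: str) -> bool:
    """Membership test for D2 that keeps only a few indices and counters."""
    opener_for = {"b": "a", "d": "c"}
    for ch in w:
        if ch not in "abcd":
            return False
    for j in range(len(w)):
        if w[j] not in opener_for:
            continue
        # Walk left from position j to find the letter on top of the stack.
        pending = 0  # closing letters passed that still need their opener
        i = j - 1
        found = False
        while i >= 0:
            if w[i] in opener_for:
                pending += 1
            elif pending == 0:
                found = True  # w[i] is the unmatched opener nearest to j
                break
            else:
                pending -= 1
            i -= 1
        if not found or w[i] != opener_for[w[j]]:
            return False  # empty stack or mismatched pair
    # Finally the stack must be empty: openers and closers balance overall.
    height = 0
    for ch in w:
        height += 1 if ch in "ac" else -1
    return height == 0
```

This takes quadratic time rather than linear time, which is fine since only the space bound matters here.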
FALSE. A machine M is in SWG_TM if and only if it is not in the language E_TM, since if there is any word in L(M) White can play it and win, and if there is no word in L(M) White will definitely lose. We proved in lecture and the text that E_TM is not TD, so neither is its complement.
TRUE. A recognizer for SWG_L just needs to test every word in Σ* for membership in L(M), never halting if L(M) is empty. But SWG_L is not TD, and we can prove this by showing A_TM ≤m SWG_L. Given a machine M and a string w, we build a machine N such that L(N) is the set of accepting computation histories of M on w. We can build such an N that operates in log space because given an alleged computation history, it just has to check that the history starts with the initial configuration of M on w, that the last configuration is accepting, and that each configuration follows from the previous one by the rules of M. This last step can be accomplished by keeping two counters to mark the positions being compared in the two configurations, and these two counters take O(log n) space where n is the length of the input to N.
TRUE. As we observed above, SWG_DFA is just the complement of E_DFA, and we actually proved the latter to be in NL, a subset of P. A DFA D is in SWG_DFA if and only if there is any path in D's state graph from the start state to any final state. We can test this by either depth-first search (in P) or by using nondeterminism to guess a path (in NL).
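The depth-first search version can be sketched as follows. The representation of the DFA (a start state, a set of final states, and a dictionary `delta` from (state, letter) pairs to states) is our own assumption for the example.

```python
def white_wins_swg_dfa(start, finals, delta):
    """Return True if some final state is reachable from the start state.

    delta maps (state, letter) pairs to states.
    """
    seen = {start}
    stack = [start]  # iterative depth-first search
    while stack:
        q = stack.pop()
        if q in finals:
            return True  # the letters along this path form a winning word
        for (p, _letter), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                stack.append(r)
    return False  # no final state is reachable, so L(D) is empty
```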
DogShow and Schedule objects above, and their associated languages.
DogShow object (D, E, C) has at most n dogs and at most n events. Explain why the input strings to DS-SCHED-OK or DS-POSSIBLE for this object have size polynomial in n.
The input to DS-SCHED-OK is a set D of dogs, a set E of
events, a relation from D to E (a matrix of at most n^2 bits), and
a function from E to {1,...,t} where t ≤ n (at most O(n log n) bits).
This is O(n^2) bits in all.
The input to DS-POSSIBLE is just D, E, the relation C, and the single
number t, and these can also all be written in
only O(n^2) bits.
For every dog d, and every event e, we check every event e' with e' ≠ e, and reject if S(e) = S(e'), C(d, e), and C(d, e') are all true. These are O(n^3) possible checks of single attributes of the input string.
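The triple loop can be written out as a short sketch. The data representation is an assumption of ours: `competes` is a set of (dog, event) pairs standing for the relation C, and `slot` is a dictionary standing for the function S.

```python
def schedule_ok(dogs, events, competes, slot):
    """Decide DS-SCHED-OK with O(n^3) constant-size checks."""
    for d in dogs:
        for e in events:
            for e2 in events:
                if (e != e2
                        and slot[e] == slot[e2]
                        and (d, e) in competes
                        and (d, e2) in competes):
                    return False  # dog d has two events in the same slot
    return True
```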
It is clear that DS-POSSIBLE is in the class NP because DS-SCHED-OK is a
verifier for it (there exists an S with parameter t
such that (D, E, C, S) is in
DS-SCHED-OK if and only if (D, E, C, t) is in DS-POSSIBLE) and we showed in
part (b) that this verifier is in P.
We can reduce 3-COLOR to DS-POSSIBLE. Given an undirected graph (V, E),
we let the set of events be V and let the set of dogs be E. The relation
C(e, v) is true if and only if v is one of the endpoints of the edge e.
We set t to be 3. Then the input (E, V, C, 3) is in DS-POSSIBLE if and only
if the events can be divided into three groups such that no dog is in two
events in the same group. And this is true if and only if no edge of the graph
connects two vertices in the same group. The mapping is clearly poly-time.
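The mapping itself is only a few lines. The sketch below builds the DogShow instance from a graph and, purely as a sanity check on small graphs, pairs it with an exponential-time brute-force search for a valid schedule. Both function names and the data representation are our own assumptions.

```python
from itertools import product

def three_color_to_dogshow(vertices, edges):
    """Map a graph to a DS-POSSIBLE instance: events = vertices, dogs = edges."""
    events = list(vertices)
    dogs = [tuple(edge) for edge in edges]
    # Dog (u, v) competes in exactly the two events u and v.
    competes = {(dog, v) for dog in dogs for v in dog}
    return dogs, events, competes, 3

def ds_possible_brute_force(dogs, events, competes, t):
    """Try every schedule with t slots (exponential time, tiny inputs only)."""
    for slots in product(range(1, t + 1), repeat=len(events)):
        schedule = dict(zip(events, slots))
        clash = any(
            e1 != e2 and schedule[e1] == schedule[e2]
            and (d, e1) in competes and (d, e2) in competes
            for d in dogs for e1 in events for e2 in events
        )
        if not clash:
            return True
    return False
```

On a triangle the resulting instance has a valid schedule, and on the complete graph with four vertices it does not, matching the fact that the first graph is 3-colorable and the second is not.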
Given a machine M and a number n, we need to create a machine N such that (M, n) is in BSWG_TM if and only if (N, 2n) is in BAWG_TM. We design N to ignore the even-numbered cells of its tape, and run M on the odd-numbered cells, which are the n letters that White names because she moves first. Then if (M, n) is in BSWG_TM, White can win the BAWG for (N, 2n) by playing her winning word for the BSWG on M, whatever letters Black puts into the even-numbered cells. Similarly if White wins the BAWG on (N, 2n), she must have a winning strategy that is independent of Black's moves, since Black's moves do not affect N's behavior at all. (For example, whatever she does to win when Black always plays 0 will be a winning strategy for any other Black moves.) The word she plays in her winning strategy must also be a winning word in the BSWG for (M, n).
We first show that BSWG_L is in NP, by giving a poly-time verifier
for it. This verifier is the set of tuples (M, n, w, 1^{p(n)}) where
p(n) is a polynomial time bound for M, |w| = n, and w is in L(M). Since M is a log-space
machine with an explicit space bound, it also has an explicitly computable
time bound. And simulating M on w for at most p(n) steps can be done in
time polynomial in p(n), and thus polynomial in the length of the input to
the verifier.
To show that BSWG_L is NP-complete, we reduce 3-SAT to it.
Given a 3-CNF formula φ with n input variables, we create a machine
M_φ that takes a string w as input, rejects if w is not of
length exactly n, and otherwise tests whether w satisfies φ. (We assume
that φ has no redundant clauses and so has length O(n^3).)
Clearly White wins the BSWG for (M_φ, n) if and only if φ
is satisfiable, and M_φ is a log-space machine because the only
read-write memory it needs is a pointer into w -- it stores φ within its
state table and stores w on its read-only input tape. The mapping from φ
to M_φ is clearly poly-time.
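As an illustration of what M_φ computes, here is a sketch in which the machine is modeled as a Python function. The clause encoding (a list of nonzero integers, with k for x_k and -k for its negation) and both function names are assumptions of ours. The brute-force game solver is there only to check the correspondence on tiny formulas.

```python
from itertools import product

def make_m_phi(clauses, n):
    """Build a predicate playing the role of M_phi for a CNF formula.

    The formula is fixed inside the returned function, mirroring the way
    M_phi stores phi in its state table.
    """
    def m_phi(w: str) -> bool:
        if len(w) != n or any(ch not in "01" for ch in w):
            return False  # reject inputs that are not n-bit strings
        for clause in clauses:
            # A literal k is true when bit k of w is 1; -k when that bit is 0.
            if not any((w[abs(lit) - 1] == "1") == (lit > 0) for lit in clause):
                return False  # this clause is not satisfied by w
        return True
    return m_phi

def white_wins_bswg(machine, n):
    """White wins the bounded solitaire game iff some n-bit word is accepted."""
    return any(machine("".join(bits)) for bits in product("01", repeat=n))
```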
Given M and n, the game tree for the BAWG has depth n and size O(2^n).
We can evaluate the winner by a recursive algorithm, where the player to move
at a given node of the tree
has a winning strategy if and only if they have a winning strategy for at least
one of the child nodes. The recursion has depth of n, and at each stage of
the recursion we need to store only the configuration of M, which takes only
O(n) space because M has made only O(n) moves. (Actually, since M is a
log-space machine, each configuration can be stored in O(log n) space and the
total stack space needed is O(n log n).) In any case, membership in
BAWG_L can be determined in PSPACE.
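The recursive evaluation can be sketched as follows, with M modeled as a Python predicate on strings (an assumption of this example, as is the function name). The recursion depth is n and each frame holds only the current prefix, which is the source of the polynomial space bound.

```python
def white_wins_bawg(machine, n, prefix=""):
    """Decide whether White wins the bounded alternating word game.

    White names letters 1, 3, 5, ... and Black names letters 2, 4, 6, ...
    """
    if len(prefix) == n:
        return machine(prefix)  # the word is complete: White wins iff M accepts
    children = (white_wins_bawg(machine, n, prefix + bit) for bit in "01")
    white_to_move = len(prefix) % 2 == 0
    # White needs one good move; Black's moves must all be answerable.
    return any(children) if white_to_move else all(children)
```

For instance, if M accepts exactly the two-letter words whose letters are equal, Black can always play the opposite letter, so White loses the game for n = 2.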
To prove completeness, we must reduce TQBF to BAWG_L. Given
a quantified boolean formula ∃x_1 ... φ, we need to create
a machine M and a number n such that White wins the BAWG (M, n) if and only
if the formula is true, and M runs in log space. The game action consists of
White and Black naming the values of the n variables, and when this is done
M must determine whether the boolean formula φ is true for the given
values. M must have the clauses of φ encoded within its state table,
so that it can check each clause in turn. The string of values is the input
to M, so it does not count against M's space bound.
P(i, i) is false. P(i, i+1) is true if and only if either w_i is a and w_{i+1} is b, or w_i is c and w_{i+1} is d. If i+1 < j, then P(i, j) is true if and only if either (1) there exists some k with i < k < j such that both P(i, k) and P(k+1, j) are true, or (2) P(i+1, j-1) is true and either w_i = a and w_j = b, or w_i = c and w_j = d. We reach a base case because each recursive call is to a case where j-i is smaller than it was, and when j-i is 0 or 1 we can evaluate it without recursion.
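The recursive definition can be checked directly with a memoized function. This is a sketch under two assumptions of ours: positions are numbered from 1 as in the definition, and the empty string is accepted separately because P only describes nonempty substrings.

```python
from functools import lru_cache

def in_d2_by_recursion(w: str) -> bool:
    """Evaluate P(1, n) for the string w using the recursive definition."""
    n = len(w)
    if n == 0:
        return True  # assumption: the empty string is handled outside P

    def matched(i, j):
        # True when w_i and w_j form a pair a...b or c...d (1-indexed).
        return (w[i - 1], w[j - 1]) in (("a", "b"), ("c", "d"))

    @lru_cache(maxsize=None)
    def p(i, j):
        if i >= j:
            return False  # P(i, i) is false
        if j == i + 1:
            return matched(i, j)  # base case of length two
        # Case 1: the substring splits into two pieces that are both in D2.
        if any(p(i, k) and p(k + 1, j) for k in range(i + 1, j)):
            return True
        # Case 2: a matched outer pair around a substring in D2.
        return p(i + 1, j - 1) and matched(i, j)

    return p(1, n)
```

Memoizing `p` means each of the O(n^2) pairs (i, j) is evaluated once, which mirrors having one gate per value P(i, j).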
We have a gate for each value P(i, j) where i < j. For the P(i, i+1)
cases we have the OR of two AND-gates, computing
(w_i = a AND w_{i+1} = b) OR (w_i = c AND w_{i+1} = d).
We actually only need gates for odd values of j-i because the ones with even
values of j-i are always false.
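When j-i is an odd value greater than 1, our recursive definition says that
P(i, j) is the OR of:
P(i, k) AND P(k+1, j), for each k with i < k < j and k-i odd, and
P(i+1, j-1) AND ((w_i = a AND w_j = b) OR (w_i = c AND w_j = d)).
This is an OR of about n/2 binary ANDs, where the inputs to the ANDs
are either basic values or other P values.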
All in all, we have O(n^2) values of P(i, j) to compute to
get to our desired value of P(1, n). Each OR computation involves O(n)
intermediate gates in the binary tree of binary ORs, so our total size is
O(n^3). The computation of P(1, n) may go through O(n) other
values of P(i, j), and each of those may involve a binary tree of ORs of
depth O(log n), so our total depth is O(n log n).
Last modified 25 July 2016