Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 10 points Q7: 20+10 points Q8: 40 points Total: 120+10 points
Exam text is in black, solutions in blue.
If C is any class of computers, such as DFA's, CFG's, TM's, strange variant TM's, etc., we may speak of the languages recognized or decided by machines in C. In particular:
A language is Turing decidable (TD) if it is equal to L(M) for some Turing machine M that halts on every input.
It is Turing recognizable (TR) if it is equal to L(M) for any Turing machine M.
A function f from strings to strings is Turing computable if there exists a Turing machine M such that for any string w, M when started on w halts with f(w) on its tape.
Recall that if A and B are two languages, A is mapping reducible to B, written A ≤m B, if there exists a Turing computable function f: Σ* → Σ* such that for any string w, w ∈ A ↔ f(w) ∈ B. If such an f exists that is computable in polynomial time, we say that A is poly-time reducible to B, written A ≤p B. If f is computable in log space, we say that A is log-space reducible to B, written A ≤L B.
A homomorphism from Σ* to Σ* is a function f that obeys the rule f(xy) = f(x)f(y) for any strings x and y. It is determined by the strings f(a) for each letter a of Σ, and f(ε) must be equal to ε.
A boolean matrix is one whose entries are each 0 or 1, and where we define "addition" and "multiplication" as the boolean operators OR and AND, respectively. If A and B are each n × n boolean matrices, the matrix product AB is the matrix C such that for each i and j, the boolean value C_{ij} is the OR, over all k from 1 to n, of A_{ik} ∧ B_{kj}.
Given this definition of matrix product, we define the language BMM (for "boolean matrix multiplication") to be {(A, B, C): AB = C} and the language IBMM (for "iterated boolean matrix multiplication") to be {(A_1,..., A_n, C): A_1A_2...A_n = C}. The variables A, B, A_i, and C each range over n × n boolean matrices. That is, all the matrices in a product must be square matrices of the same size, and the number of matrices in an iterated product must equal the size of the matrices.
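As a concrete illustration (not part of the exam), here is a small Python sketch of the boolean product and the BMM membership test; the function names are invented for this example.

# A minimal sketch of boolean matrix multiplication and the BMM test.
# Matrices are lists of lists of 0/1 entries; names are illustrative only.

def bool_mat_mult(A, B):
    """C[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def in_BMM(A, B, C):
    """(A, B, C) is in BMM exactly when AB = C."""
    return bool_mat_mult(A, B) == C

A = [[1, 0], [1, 1]]
B = [[0, 1], [1, 0]]
print(bool_mat_mult(A, B))             # [[0, 1], [1, 1]]
print(in_BMM(A, B, [[0, 1], [1, 1]]))  # True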
The following language is proved to be NP-complete in the text or in Exercises, and you may assume without proof that it is NP-complete. (There are lots of other languages proved NP-complete in the course -- this is the one you will need on this exam.)
The language X over the alphabet {0, 1, #} is the set of strings in which every pair of #'s has at least one 1 between them.
The language Y over the alphabet {0, 1, #} is the set of strings w for which there exists a string v in {0, 1}* and strings x, y, and z in {0, 1, #}* such that w = x#v#y#v#z#.
The language Z is the set of encodings of undirected graphs G such that if the nodes of G are partitioned into any five sets, then there is at least one edge of G that has both its endpoints in the same set.
Given a directed acyclic graph G and a goal node g, we define three
versions of a two-player token game between White and Black. In Version 1
of the game, there are no restrictions on the players' moves beyond the
definition above. In Version 2, a move is prohibited if it would place
the two tokens on the same node. In Version 3, a move is prohibited if
it would take one player's token to a node that was ever occupied by the
other player's token.
TRUE. The assumption is irrelevant, as there is a poly-time algorithm
that inputs two DFA's and decides whether they are equivalent. We
first minimize both DFA's, using the algorithm presented in lecture
which is clearly poly-time, since one iteration makes a single pass
over the set of states, and there can be at most n-1 iterations before
the division of states into classes remains the same.
But there is still the problem of taking two minimal DFA's and
determining whether they have the same language. To do this we
try to construct an isomorphism between the state sets of the DFA's.
If we succeed, of course the languages are the same, and if we fail,
we will have found proof that they are not. We begin by mapping
the start state of the first DFA to the start state of the second,
rejecting if one is final and the other non-final. We then, for
each letter a, map the end of the a-arrow from one start state to
the end of the a-arrow of the other. We reject if this map fails
to be one-to-one (since then there are two strings equivalent for
one language and not for the other) or if we ever map a final state
to a non-final state or vice versa. We continue with each mapped
state, mapping the endpoint of each arrow to the endpoint of the
matching arrow in the other DFA. If we complete a bijection of
the states that maps each arrow to a matching arrow, we have our
isomorphism, and otherwise we reject.
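As an illustration of this procedure, here is a Python sketch of the isomorphism search, assuming both DFA's have already been minimized and have total transition functions; the representation (a dict delta from (state, letter) pairs to states) is just one possible choice, not anything specified by the exam.

from collections import deque

def equivalent_minimal_dfas(d1, d2, alphabet):
    """d1 and d2 are (start, finals, delta) triples of minimal, complete DFA's.
    Return True iff the breadth-first search builds an isomorphism."""
    (s1, F1, delta1), (s2, F2, delta2) = d1, d2
    mapping = {s1: s2}          # attempted isomorphism from DFA 1 to DFA 2
    used = {s2}                 # images already taken, to keep the map one-to-one
    queue = deque([s1])
    while queue:
        p = queue.popleft()
        q = mapping[p]
        if (p in F1) != (q in F2):      # final mapped to non-final or vice versa
            return False
        for a in alphabet:
            p2, q2 = delta1[(p, a)], delta2[(q, a)]
            if p2 in mapping:
                if mapping[p2] != q2:   # arrows disagree with the map built so far
                    return False
            elif q2 in used:            # map would fail to be one-to-one
                return False
            else:
                mapping[p2] = q2
                used.add(q2)
                queue.append(p2)
    return True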
FALSE. This language was proved in lecture and in the text to not be
TD, by reduction from ALLCFG. But EQCFG is
easily seen to be co-TR, as two grammars G and H are in the complement
of EQCFG if and only if there exists a string w that
is in L(G) or L(H) but not both; since the membership language ACFG
is TD, we can enumerate candidate strings w and test each one. If
EQCFG were TR, therefore, it would also be TD, and it isn't.
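A sketch of the recognizer for the complement, assuming a decider cfg_member for the membership language ACFG (for example, the CYK algorithm on a grammar in Chomsky normal form); the grammar representation and cfg_member are stand-ins, not a fixed API.

from itertools import count, product

def recognize_not_EQ_CFG(G, H, alphabet, cfg_member):
    """Halt and accept if some string is in exactly one of L(G) and L(H);
    run forever when L(G) = L(H), so this only recognizes the complement."""
    for length in count(0):                          # dovetail over all strings
        for chars in product(alphabet, repeat=length):
            w = "".join(chars)
            if cfg_member(G, w) != cfg_member(H, w):
                return True                          # witness that the grammars differ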
TRUE. The minimal DFA has three states i, p, and d, with i the start
state, i and p the final states, and transitions (i, 0, i), (i, 1, i),
(i, #, p), (p, 0, p), and (p, 1, i), with all other arrows going to d.
This DFA goes to its death state if it ever sees two #'s without a 1
in between, and it goes to p if it sees a # and is waiting for the 1.
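For concreteness, a direct Python transcription of this DFA (an informal check, not anything required by the exam):

DELTA = {
    ('i', '0'): 'i', ('i', '1'): 'i', ('i', '#'): 'p',
    ('p', '0'): 'p', ('p', '1'): 'i', ('p', '#'): 'd',
    ('d', '0'): 'd', ('d', '1'): 'd', ('d', '#'): 'd',
}

def in_X(w):
    state = 'i'
    for c in w:
        state = DELTA[(state, c)]
    return state != 'd'        # both i and p accept; only the dead state rejects

print(in_X("0#01#1#"))   # True: every pair of #'s has a 1 between them
print(in_X("1#00#"))     # False: two #'s with only 0's between them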
FALSE. The key point is that the two copies of v in the string must be
identical, making this language similar to {ww: w ∈ Σ*}, which we know
is not a CFL. The proof uses the CFL pumping lemma. Assume Y is a CFL,
let p be Y's alleged pumping constant, and let
w = #0^p 1^p ## 0^p 1^p ##, which is in Y (take x = y = z = ε and
v = 0^p 1^p). If this w is broken into five strings as specified in the
CFLPL, we can show that the conclusion of the CFLPL is false. If
either the second or fourth string contains a #, pumping down leaves Y,
because every string in Y has at least five #'s. Otherwise the second and
fourth strings may intersect at most two of the four groups of p consecutive
letters, and pumping down will destroy the property that the first group
matches the third and the second matches the fourth.
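As an aside, the repeated v is exactly the kind of constraint a pattern with a backreference (which is more powerful than a true regular expression) can check; a brute-force membership test for Y, under the definition w = x#v#y#v#z# given above:

import re

# x, y, z may contain #'s, but v may not; the backreference \1 forces the
# two copies of v to be identical.
Y_PATTERN = re.compile(r'^[01#]*#([01]*)#[01#]*#\1#[01#]*#$')

def in_Y(w):
    return Y_PATTERN.match(w) is not None

print(in_Y("#01##01##"))   # True: take v = 01 and x = y = z = empty
print(in_Y("#01##10##"))   # False: the two candidate v's differ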
FALSE. We prove that Z is co-NP-complete, which would force NP = co-NP
if it were also NP-complete. (Proving that Z is in co-NP is not
sufficient for this conclusion.) Z is the complement of the language
5-COLOR, of graphs that can have their nodes divided into five
sets such that no edge has two endpoints in the same set. (Many of
you misread the definition of Z and argued that it was in NP, claiming
that a division into five sets would be a certificate. But it isn't,
since the definition says that every division into five sets
has a certain property, not just one.)
5-COLOR itself is clearly in NP, with the 5-coloring being the
certificate. It is easily proved to be NP-complete by reduction
from 3-COLOR. Given any undirected graph G, we must construct a
graph H that is 5-colorable if and only if G is 3-colorable. One
way to do this is to have H consist of G and two new nodes, with
edges from each new node to each old node and between the two
new nodes. In any 5-coloring of H, the two new nodes must receive two
distinct colors not shared with any old node, so the old nodes must have
colors chosen from the other three, and the induced coloring of G is a
valid 3-coloring. And clearly any 3-coloring of G may be extended to a
5-coloring of H.
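A sketch of this reduction in Python, with G given as an edge set over vertices 0..n-1 (the representation is just for illustration):

def three_color_to_five_color(n, edges):
    """Return H = (n + 2 vertices, new edge set); H is 5-colorable iff G is 3-colorable."""
    u, v = n, n + 1                      # the two new nodes
    new_edges = set(edges)
    new_edges.add((u, v))                # edge between the two new nodes
    for x in range(n):                   # each new node is adjacent to every old node
        new_edges.add((u, x))
        new_edges.add((v, x))
    return n + 2, new_edges

# A triangle is 3-colorable, so its image (which is K5) is 5-colorable.
print(three_color_to_five_color(3, {(0, 1), (1, 2), (0, 2)}))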
FALSE. This was a trick question of sorts, because it appears to be
asking whether "the regular languages are closed under homomorphism",
and you proved on the homework that they are. But the given statement
also says that the non-regular languages are closed under
homomorphism, because if R is not regular, f(R) must not be regular
to satisfy the "if and only if".
And in fact it is easy to map a non-regular language to a regular
one by a homomorphism. The simplest way is to have f(a) = f(b) =
ε, so that f(R) is the regular language {ε} for
any non-empty R, regular or not. Another example is to let R be
the standard non-regular language {a^n b^n : n ≥ 0}, and let
f(a) = f(b) = b, so that f(R) is the regular language (bb)*.
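A quick check of the second example (purely illustrative): applying f letter by letter sends a^n b^n to b^(2n), so the image of R is (bb)*.

def apply_hom(f, w):
    """Apply a homomorphism given by its values on single letters."""
    return "".join(f[c] for c in w)

f = {'a': 'b', 'b': 'b'}
for n in range(4):
    print(apply_hom(f, 'a' * n + 'b' * n))   # '', 'bb', 'bbbb', 'bbbbbb'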
For every i and j, let D_{ij} be the OR, over all k, of
A_{ik} AND B_{kj}. We can calculate each
D_{ij} with an OR gate that receives the output of n AND
gates. Then, for each i and j, we determine whether D_{ij}
and C_{ij} are equal with an AND of two OR's or an OR of two
AND's. We AND together the results of these n^2 comparisons,
and that is our output. The depth of this circuit is 5 (AND, two
for the comparison, OR, AND) and its size is O(n^3).
For each pair of input matrices A_{2i-1} and A_{2i}, we
compute their product in depth O(1) and polynomial size as in part (a).
We then pair up these products and compute n/4 matrices, each the
product of four original matrices. We pair those up, and so on,
forming a balanced binary tree of product operations. The tree has
O(n) subcircuits, each of O(1) depth and O(n^3) size, and the whole
circuit thus has depth O(log n) and size O(n^4). We compare the
final result with C, using an additional depth 2 and size O(n^2).
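The tree structure can be mirrored by a short sequential sketch (reusing the hypothetical bool_mat_mult from above); each pass of the loop corresponds to one constant-depth layer of the circuit:

def iterated_product_tree(mats, bool_mat_mult):
    """Multiply a list of boolean matrices by pairing adjacent ones, so that
    only about log2(len(mats)) rounds of multiplications are needed."""
    while len(mats) > 1:
        nxt = [bool_mat_mult(mats[i], mats[i + 1])
               for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2 == 1:      # an unpaired matrix waits for the next round
            nxt.append(mats[-1])
        mats = nxt
    return mats[0]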
We first need to show that IBMM is in the class NL.
My preferred way to do this is with an alternation game,
using O(log n) space and O(1) alternations, appealing to
the result on HW#6 that we can find the winner of such a
game in NL. We will design the game so that White has
a winning strategy if and only if the product of the
Ai's is equal to C. Black moves first, and
names nodes s and t such that (he claims) the entry
C_{st} is not equal to the s-t entry of the product.
We then construct a graph with n+1 columns S_0,...,
S_n of n nodes each. We place an edge from node
x of S_{i-1} to node y of S_i if and only if
the x-y entry of A_i is 1. There is a path from
the s node of S_0 to the t node of S_n
if and only if the s-t entry of the product is 1. So now
if C_{st} is 0, Black can win by exhibiting such
a path, and if C_{st} is 1, White can win by
exhibiting such a path.
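To make the layered graph concrete, here is a Python sketch that checks the claim directly by breadth-first search (this is the polynomial-time view of the construction, not the log-space implementation):

from collections import deque

def product_entry_is_one(mats, s, t):
    """Node (i, x) stands for node x of column S_i; there is an edge from
    (i-1, x) to (i, y) exactly when the x-y entry of the i-th matrix is 1.
    The s-t entry of the product is 1 iff (len(mats), t) is reachable from (0, s)."""
    n = len(mats[0])
    reachable = {(0, s)}
    queue = deque([(0, s)])
    while queue:
        i, x = queue.popleft()
        if i == len(mats):
            continue
        for y in range(n):
            if mats[i][x][y] and (i + 1, y) not in reachable:
                reachable.add((i + 1, y))
                queue.append((i + 1, y))
    return (len(mats), t) in reachable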
If you don't like the alternation game, it should still
be clear that a single NL machine can go through the
entries of C in turn, proving each 1 entry correct by
exhibiting a path in this graph and proving each 0
entry correct by executing the Immerman-Szelepcsenyi NL
procedure for the complement of PATH.
To show completeness,
we reduce the known NL-complete problem PATH to IBMM.
Given a directed graph G of n nodes, and nodes s and t of G,
we need to construct
an instance of IBMM that is in the language if and only if
there is a path from s to t in G. Let G' be the directed
graph obtained from G by adding a loop at each vertex that
has no loop already. Let A' be the adjacency matrix of G'.
It is well-known that there is a path from s to t in G if
and only if the s-t entry of the matrix A'^{n-1},
defined using boolean matrix multiplication, is 1. We are not quite done,
because we need to construct an instance where the product is exactly C if
and only if the path exists. Let D be a matrix whose only 1 entry is
D_{ss}, and let E be a matrix whose only 1 entry is E_{tt}.
Then the product DA'A'...A'E, where there are exactly n-1 copies of A', is
equal to C, whose only 1 entry is C_{st}, if and only if the path
exists. Because we have n+1 matrices in our product, we must make all the
matrices n+1 by n+1 to have a proper instance of IBMM. We do this by adding
a single isolated node to G to give it n+1 vertices without affecting the path
question.
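A sketch of this reduction (the representation of G as an n × n adjacency matrix with nodes numbered 0..n-1, and the function name, are invented for illustration):

def path_to_ibmm(adj, s, t):
    """Return (list of n+1 matrices, target matrix C), each (n+1) x (n+1);
    the product of the list equals C iff G has a path from s to t."""
    n = len(adj)
    m = n + 1                                        # pad with one isolated node
    A_prime = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            A_prime[i][j] = adj[i][j]
    for i in range(m):
        A_prime[i][i] = 1                            # a loop at every node
    D = [[0] * m for _ in range(m)]; D[s][s] = 1     # keeps only row s
    E = [[0] * m for _ in range(m)]; E[t][t] = 1     # keeps only column t
    C = [[0] * m for _ in range(m)]; C[s][t] = 1     # the required product
    return [D] + [A_prime] * (n - 1) + [E], C

Building these matrices requires only counters over the indices, so the map is computable in deterministic log space.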
Given any directed graph G, nodes x and y, and number k, it is clear
that the predicate PATH(G, x, y, k), stating that there is a path of
at most k edges from x to y in G, is in NL. By Immerman-Szelepcsenyi,
its complement is also in NL. White wins Version 1 of the game from
position (w, x, y) if
and only if there is a number k such that PATH(G, x, g, k) and NOT
PATH(G, y, g, k-1). From position (b, x, y), White wins if and only
if there is a number k such that PATH(G, x, g, k) and NOT PATH(G, y, g, k).
This is clearly in NL.
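In deterministic terms the predicate just compares shortest-path distances to g, as in the sketch below (this is the polynomial-time reading of the condition, not the NL implementation; adj maps each node, possibly to an empty list, to its list of successors):

from collections import deque
from math import inf

def dist_to_goal(adj, g):
    """Shortest number of edges from each node to g (inf if g is unreachable)."""
    rev = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    dist = {u: inf for u in adj}
    dist[g] = 0
    queue = deque([g])
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if dist[u] == inf:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def white_wins_version1(adj, g, turn, x, y):
    d = dist_to_goal(adj, g)
    if d[x] == inf:
        return False                    # White can never reach g
    return d[x] <= d[y] if turn == 'w' else d[x] < d[y]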
Many of you misread the definition of the game somehow to ignore
Black, saying that White wins the game if and only if a path from
x to g exists. But White does not win if Black reaches the goal
node first.
We need only reduce the known NL-complete language PATH to Version 1.
Given G, s, and t, we need to set up a Version 1 game that White
indeed wins if and only if the path exists. We can do this most
easily by adding a new isolated node to G and making it y, so that
Black can never move and White either wins if she can or else draws.
There is a complication, though, in that the game requires G to be
a directed acyclic graph and the PATH problem does not. It
is easy to adapt the NL-completeness proof for PATH to make the
graph acyclic, by placing a clock on the Turing machine's tape so
that no configuration can be repeated. Alternatively, we can reduce
PATH to PATHDAG by mapping (G, s, t) to (G', s, t) where
G' is made from n+1 copies of G as in the solution to Question 7 (c)
above. The reduction is pretty clearly in deterministic log space.
So it turns out that all three versions of the game are in NL, which
explains why I could not get my P-completeness proof for Version 2,
or my PSPACE-completeness proof for Version 3, to work. Proving this
of course suffices to answer parts (c) and (d). The fact is that
the PATH conditions laid out in my solution to part (a) are still
necessary and sufficient for White to win in the other versions.
Suppose White has a path of length k, and Black has no better path.
White can win in k moves by simply traveling along her path. Black
cannot win earlier since he has no shorter path, and Black cannot
prevent White from taking her path because if he moved to a node on
White's path before White got there, he could then proceed to g and
demonstrate the existence of a path shorter than White's.
I think, but haven't verified, that the matching completeness
results are true if White and Black have separate goal nodes.
The proof I had in mind, that Version 2 is in P, is still valid.
The simplest thing is to define an alternation game that can be
played in O(log n) space and appeal to the result that AL is
contained in P. To play the game, we need remember only the player
whose move it is, the name
of node g, and the current locations of the two players. We could
also recapitulate this part of the Alternation Theorem proof by
defining a game graph, with nodes representing positions in the
game, and marking each position as White-winning or not.
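Here is a generic sketch of that marking process for an acyclic game graph; the four helper functions stand in for the actual Version 2 rules and are not specified by the exam:

from functools import lru_cache

def white_winning(start, moves, is_terminal, white_wins_at, to_move):
    """Backward induction: a position is White-winning if White (to move) has
    some winning successor, or Black (to move) has only winning successors.
    Positions must be hashable; moves(p) yields the legal successor positions,
    and is_terminal(p) should be true whenever there are no legal moves."""
    @lru_cache(maxsize=None)
    def win(p):
        if is_terminal(p):
            return white_wins_at(p)
        results = [win(q) for q in moves(p)]
        return any(results) if to_move(p) == 'white' else all(results)
    return win(start)

Since a position here is just (whose move it is, White's node, Black's node), there are only O(n^2) positions, so the marking takes polynomial time, matching the claim that Version 2 is in P.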
As I said in part (c), determining the winner in Version 3 is also in
NL. But we can still prove it to be in PSPACE without noticing that
the restriction has no effect. The game can be played in polynomial
time, because there can be at most n-1 moves by each player until
one player has won or both have reached sink nodes. Thus, since AP
is contained in PSPACE, we can find the winner in PSPACE.
To recapitulate the proof of that part of the Alternation Theorem,
we can evaluate
the game tree by a recursive algorithm, where now a position includes
the set of nodes that have been visited by each player. The recursion
requires an activation record at each step that is of polynomial size,
and since there are only polynomially many records on the stack at one
time, the entire algorithm uses only polynomial space.
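A sketch of that recursion for Version 3, in which a position carries the set of nodes each player has ever visited. The precise end-of-game rules used here (reaching g first wins, a player with no legal move passes, two consecutive passes end the game without a White win) are assumptions made for concreteness, since the full game definition is in the exam statement; the point is only the shape of the recursion, whose activation records and depth are polynomial.

def white_wins_v3(adj, g, turn, wx, by, wvis=None, bvis=None, passed=False):
    """wx, by = current nodes of White's and Black's tokens; wvis, bvis = nodes
    their tokens have occupied so far (assumed to include the start nodes)."""
    wvis = wvis if wvis is not None else frozenset([wx])
    bvis = bvis if bvis is not None else frozenset([by])
    if wx == g:
        return True                     # White has reached the goal (assumed rule)
    if by == g:
        return False                    # Black reached the goal first (assumed rule)
    if turn == 'w':
        legal = [v for v in adj[wx] if v not in bvis]   # the Version 3 restriction
        if not legal:                   # assumed rule: a stuck player passes
            return False if passed else white_wins_v3(adj, g, 'b', wx, by, wvis, bvis, True)
        return any(white_wins_v3(adj, g, 'b', v, by, wvis | {v}, bvis)
                   for v in legal)
    else:
        legal = [v for v in adj[by] if v not in wvis]
        if not legal:
            return False if passed else white_wins_v3(adj, g, 'w', wx, by, wvis, bvis, True)
        return all(white_wins_v3(adj, g, 'w', wx, v, wvis, bvis | {v})
                   for v in legal)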
Last modified 19 May 2017