CMPSCI 601: Theory of Computation

Offered through the PEEAS distance learning program

Homework Assignment #3

David Mix Barrington

Solutions posted Thu 7 August 2003

Questions in black, solutions in blue.

Question 1 (30): For each of the following three problems, either prove that it is complete for NL under L-reductions or prove that it is in L:
- (a,10) REACH-OUT-2 = {(G,s,t): G is a directed graph with out-degree at most two and there is a path from s to t in G}
  This language is NL-complete. It is in NL because it is a special case of REACH and the NL algorithm for REACH will decide it. (To decide REACH-OUT-2, of course, you also have to tell whether the graph has out-degree at most two, but this is very easy, certainly in L.)
  We need to reduce REACH to REACH-OUT-2, which means defining a function that takes any graph G (with particular vertices s and t) to a graph H (with vertices s' and t') such that H has out-degree 2 and there is a path from s to t in G iff there is a path from s' to t' in H. Here is one way to do this. H will have a copy of each vertex in G, plus some additional vertices. For every vertex x in G with out-degree d, make a binary tree with d leaves, with edges directed toward the leaves. Identify the root of this tree with x and the leaves with the vertices that x has edges to in G, adding new vertices for the internal nodes of the tree if any. This can be done in log space, as we just have to add the new nodes and edges to the description and they can be easily constructed by finding the out-degree of each node in G in turn.
  Now if there is a path from s to t in G, there is a path from s' to t' in H because we can traverse the tree segment corresponding to each edge in the G-path in turn. And if there is a path from s' to t' in H, it must be made up of tree segments for a sequence of edges in G, because the only edges in H come from trees and the only way to make a path in H is to traverse these trees.
  Since this mapping reduces the known NL-complete problem REACH to REACH-OUT-2, and we have shown that REACH-OUT-2 is in NL, we have shown that REACH-OUT-2 is NL-complete.
  Another way to solve the problem would be to argue that any nondeterministic Turing machine can be simulated, in the same space, by a nondeterministic Turing machine that has at most two choices at each move. Then applying the "REACH is NL-complete" construction to such a Turing machine always gives us an instance of REACH-OUT-2, and so shows that REACH-OUT-2 is NL-complete.
- (b,10) REACH-ACYCLIC = {(G,s,t): G is a directed acyclic graph and there is a path from s to t in G} Here you may assume that G is presented as a levelled graph (as in Spring 2003 HW#6, Question 12), so your algorithm knows that G is actually acyclic.
  First, note that since REACH-ACYCLIC is a special case of REACH, it is in NL, given the statement that we can tell whether an instance is acyclic. There are two good ways to show that any NL problem reduces to REACH-ACYCLIC.
  The first is to argue that any NSPACE(log n) Turing machine may be given a poly-time clock, which records the step number on an additional O(log n) bits of work tape and shuts off the machine if it exceeds a particular number of steps. Then applying the "REACH is NL-complete" construction to this clocked machine gives a levelled acyclic graph, because every move increases the step number, and thus an instance of REACH-ACYCLIC.
  The second method is to reduce REACH to REACH-ACYCLIC directly. We need a logspace function that, given a directed graph G of n vertices and nodes s and t, will produce an acyclic levelled graph H and nodes s' and t', so that there is a path from s to t in G iff there is a path from s' to t' in H. Here is an easy way to do this. The nodes of H consist of n copies of the nodes of G, each copy constituting one level. For every i and every node x in G, we let "(x,i)" be the name of the copy of node x on level i. For every i less than n, we make an edge from every node (x,i) to (x,i+1), and an edge from (x,i) to (y,i+1) whenever (x,y) is an edge in G. We let s' be (s,1) and t' be (t,n).
  If there is a path from s to t in G, then there is one of some length d at most n-1, since a shortest path never repeats a vertex. We can go from (s,1) to (t,d+1) in H by following the copies of each edge in the path in turn. Then we can get to (t,n) by taking a series of (t,i) to (t,i+1) edges, so there is a path from s' to t' in H.
  If there is a path from s' to t' in H, the edges in it consist of either (x,i) to (x,i+1) edges or edges mirroring edges in G. If we look at the latter edges in order, they give us a path from s to t in G.
  Since we have reduced REACH to REACH-ACYCLIC, and shown that REACH-ACYCLIC is in NL, we have shown that REACH-ACYCLIC is NL-complete.
  Two students found a web page from a Caltech course that shows that the language ACYCLIC is NL-complete. Unfortunately for them the language ACYCLIC is the set of graphs that are acyclic, not the set of acyclic graphs and pairs of points having paths. It's very important when reading and adapting a known solution that you know what is being talked about, in particular the definitions of the terms involved.
- (c,10) REACH-IN-1 = {(G,s,t): G is a directed graph with in-degree at most 1 and there is a path from s to t in G}.
  This language is in L and thus not NL-complete unless L=NL. The easiest way to see this is to note that (G,s,t) is in REACH-IN-1 iff (H,t,s) is in REACH-OUT-1, where H is the graph obtained from G by reversing all the edges. And we have shown that REACH-OUT-1 is in L -- in fact it is L-complete under FO-reductions.
  More directly, you can solve REACH-IN-1 as follows. Put a marker at t and then go backwards as far as you can -- because the graph has in-degree one you have at most one edge to go backwards on from any point. Accept if you reach s, reject if you reach a dead end or have gone more than n-1 steps. This is a logspace algorithm because you need to remember only where you are and how many steps you've gone.
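The backward walk described for REACH-IN-1 can be sketched in Python as follows. This is only an illustration under assumptions not in the original: vertices are numbered 0 to n-1, the graph is an edge list, and the names `predecessor` and `reach_in_1` are hypothetical. The predecessor is found by rescanning the read-only edge list, so the only working storage is the current vertex and a step counter.

```python
def predecessor(edges, v):
    """Return the unique vertex with an edge into v, or None if there is none.

    Assumes in-degree at most 1, so the first match is the only match.
    """
    for (u, w) in edges:
        if w == v:
            return u
    return None


def reach_in_1(n, edges, s, t):
    """Decide whether t is reachable from s by walking backward from t."""
    current = t
    for _ in range(n):            # inspect t and at most n-1 vertices behind it
        if current == s:
            return True           # reached s: accept
        current = predecessor(edges, current)
        if current is None:
            return False          # dead end: reject
    return False                  # walked n-1 steps without meeting s: reject
```

The step bound matters when the backward walk enters a cycle that does not contain s; without it the loop would never stop.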
(Recall that the in-degree of a directed graph is the largest number of edges that enter any one vertex, and that the out-degree is the largest number of edges that leave any single vertex.)
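The two reductions from REACH used in parts (a) and (b) can be sketched in Python as below. This is a rough illustration, not a logspace implementation: it assumes the graph is a dictionary mapping every vertex to its list of successors, and the function names and the `("aux", x, k)` naming of new tree nodes are hypothetical. For part (a) it uses one particular binary tree, a chain that peels off one successor at a time, and it adds no new nodes when the out-degree is already at most two. The `reachable` helper is an ordinary breadth-first search, included only so that G and H can be compared.

```python
from collections import deque


def reachable(graph, s, t):
    """Plain BFS; used only to compare G with H, not part of either reduction."""
    seen, queue = {s}, deque([s])
    while queue:
        x = queue.popleft()
        if x == t:
            return True
        for y in graph[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


def to_out_degree_two(graph):
    """Part (a): replace the out-edges of each vertex by a binary tree."""
    h = {}
    for x, successors in graph.items():
        node, rest = x, list(successors)
        while len(rest) > 2:
            helper = ("aux", x, len(rest))   # fresh internal tree node
            h[node] = [rest[0], helper]      # one leaf, plus the rest of the tree
            node, rest = helper, rest[1:]
        h[node] = rest                       # last tree node: at most two leaves
    return h


def to_levelled(graph, s, t):
    """Part (b): n copies of the vertices, edges only from level i to level i+1."""
    n = len(graph)
    h = {(x, i): [] for x in graph for i in range(1, n + 1)}
    for i in range(1, n):
        for x in graph:
            h[(x, i)].append((x, i + 1))          # stay at x for one level
            for y in graph[x]:
                if y != x:
                    h[(x, i)].append((y, i + 1))  # copy of the edge (x, y)
    return h, (s, 1), (t, n)
```

In both cases each output edge depends only on one vertex of G, a small counter, and a lookup in the input, which is the reason the reductions fit in log space.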
Question 2 (20): A group is a set with a binary operation that is associative (for any x, y, and z, (x times y) times z = x times (y times z)), has an identity element (there exists e such that for any x, e times x = x times e = x) and has inverses (for any x there exists y such that x times y = e). The ITERATED-PRODUCT problem is as follows:
- Input: a multiplication table for a group (a function from G times G to G obeying the group properties), and a sequence of elements of the group (a string w in G^*). Note that if n is the input size, G has at most n elements and the length of w is at most n.
- Output: the product of the elements in the sequence.
Here is your problem:
- (a,15) Prove that the ITERATED-PRODUCT problem can be solved in log-space, that is, the function is in FL.
  First number the elements of G from 1 to n, or otherwise give them names of O(log n) bits if this hasn't been done already. Then the following algorithm solves ITERATED-PRODUCT:
```
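          // Left-to-right product of w; assumes w is nonempty.
          element = w[0];
          i = 1;
          while (i < w.length) {
             // table(a, b) looks up "a times b" in the read-only input.
             element = table(element, w[i]);
             i++;
          }
          return element;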
```
  This is logspace because we need only remember "element", the name of an element of G, and "i", a number that never exceeds n. The multiplication table and the array of elements "w" are part of the read-only input.
  If I want to get the right answer in the case where w.length is zero, I need to use the identity property to find and return the identity element in this case. If I have the identity, I can replace the first line with
```
          element = identity; i = 0;
```
- (b,5) Your argument for (a) should not have required both the associativity condition and the inverses condition. Prove that with the correct one of these conditions omitted, the problem is still in FL.
  My algorithm uses the associativity property because it computes "(((((w[0]*w[1])*w[2])*w[3])*w[4])*w[5])", for example, and without associativity there is no guarantee that this parenthesization of the product will give the same answer as other parenthesizations.
  But the algorithm makes no use of the inverse property. It uses the identity property only in the variant discussed above that allows w to be an empty sequence.
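As a small illustration of part (b), here is the same left-to-right loop in Python, run on a structure that is associative and has an identity but lacks inverses: multiplication mod 4 on {0, 1, 2, 3}, where 0 and 2 have no inverse. The table layout (`table[a][b]` holds "a times b") and the names `iterated_product` and `MOD4_TABLE` are assumptions made for this sketch.

```python
def iterated_product(table, w, identity):
    """Multiply the elements of w from left to right using the given table.

    Only the current partial product is remembered, as in the logspace
    argument above. An empty w yields the identity.
    """
    element = identity
    for g in w:
        element = table[element][g]
    return element


# Multiplication mod 4: associative, identity 1, but 0 and 2 have no inverse.
MOD4_TABLE = [[(a * b) % 4 for b in range(4)] for a in range(4)]
```

For example, `iterated_product(MOD4_TABLE, [3, 3, 2], 1)` computes ((1 times 3) times 3) times 2 and never needs an inverse.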
Question 3 (15): Two directed graphs G and H are isomorphic if there exists a one-to-one, onto function f from the vertices of G to the vertices of H such that for any two vertices x and y in G, there is an edge from x to y in G iff there is an edge from f(x) to f(y) in H. Prove that the language GRAPH-ISOMORPHISM = {(G,H): G and H are isomorphic} is in NP. (It is unknown whether GRAPH-ISOMORPHISM is NP-complete or whether it is in P, though it is thought at least not to be NP-complete.)
The basic idea is simple, to guess a mapping and determine whether it is an isomorphism. We need to define a data structure for the mapping, show that it can be guessed in polynomial time, and show that it can be determined in polynomial time whether the mapping is an isomorphism.
The mapping consists of a vertex in H for each vertex of G. If G and H each have n vertices, this is n log n bits to guess, which can be done in O(n log n) time, clearly polynomial.
Once we have a mapping f, we first check that it is one-to-one, for example by comparing f(x) and f(y) for every pair of distinct vertices x and y. If G and H have the same number of vertices, a one-to-one f is also onto; if they do not, we reject immediately. Then we must check for each pair of vertices x and y in G that (x,y) is an edge in G iff (f(x), f(y)) is an edge in H. We use two loops to check the n² pairs of vertices. For each pair, we must look up f(x), look up f(y), and determine whether (x,y) and (f(x), f(y)) are edges. These steps can each be done in polynomial time, so the total time for the n² pairs is still polynomial. We accept iff each of the pairs satisfies the condition that (x,y) and (f(x), f(y)) are either both edges or both non-edges.
By the definition, it is clear that (G,H) is in GRAPH-ISOMORPHISM iff it is possible for this procedure to guess a mapping that causes it to accept. Since the procedure is poly-time, we have shown that GRAPH-ISOMORPHISM is in NP.
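The deterministic checking phase of this procedure can be sketched in Python as follows. The sketch assumes that both graphs have vertices numbered 0 to n-1, that edges are given as pairs, and that the guessed mapping arrives as a list `f` with `f[x]` the image of x; the name `is_isomorphism` is hypothetical. It also checks that the mapping is one-to-one and onto, as the definition of isomorphism requires.

```python
def is_isomorphism(n, g_edges, h_edges, f):
    """Return True iff f is an isomorphism from G to H (vertices 0..n-1)."""
    # f must be a permutation of 0..n-1, i.e. one-to-one and onto.
    if len(f) != n or sorted(f) != list(range(n)):
        return False
    g_edges, h_edges = set(g_edges), set(h_edges)
    # Check all n^2 ordered pairs: edge in G iff image pair is an edge in H.
    for x in range(n):
        for y in range(n):
            if ((x, y) in g_edges) != ((f[x], f[y]) in h_edges):
                return False
    return True
```

The nondeterministic machine accepts (G,H) iff some guess of `f` makes this check return True.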
Question 4 (15): Consider the restriction of the HAMILTON-PATH problem where the given graph is guaranteed to be acyclic. Prove that the set {G: G is a directed acyclic graph with a Hamilton path} is in P. Here, unlike in Question 1 above, assume that the graph is merely given to you as an adjacency list or matrix -- it is known that in P you can check whether it is acyclic. (Hint: Look up "topological sort" on the Web or in an algorithms book.)
The topological sort procedure returns an ordering of the vertices so that if (x,y) is any edge in the graph, x comes before y in the ordering. Clearly if the graph has a directed cycle, no topological sort is possible and the topological sort algorithm will discover this. (In the depth-first search, if the search reaches a vertex that is still on the current search path, that is, one whose own search has not yet finished, then we know that there is a cycle. Reaching a vertex whose search has already finished does not indicate a cycle.)
If we have a topological sort of the vertices, the only way to have a Hamilton path is to visit the vertices in the order of the sort. This is because no edge can go backward in the ordering, and if you ever skip a vertex in the ordering you would have to go backward to reach it afterward and thus your path cannot reach all vertices.
So the algorithm is to find the topological sort, and accept iff for every vertex except the last, there is an edge to its successor in the topological sort. This takes the time for the sort (well known to be polynomial), plus the time to check n-1 edges in the graph. The total time is thus polynomial and the problem is in P.
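A minimal Python sketch of this algorithm follows. It assumes the graph is a dictionary mapping every vertex to its list of successors, and it computes the topological sort by repeatedly removing vertices of in-degree zero (Kahn's algorithm) rather than by the depth-first search mentioned above; either method gives a valid ordering. The function name is hypothetical.

```python
from collections import deque


def dag_has_hamilton_path(graph):
    """Return True iff graph is acyclic and has a Hamilton path."""
    indegree = {v: 0 for v in graph}
    for x in graph:
        for y in graph[x]:
            indegree[y] += 1

    # Kahn's algorithm: repeatedly output a vertex with no remaining in-edges.
    ready = deque(v for v in graph if indegree[v] == 0)
    order = []
    while ready:
        x = ready.popleft()
        order.append(x)
        for y in graph[x]:
            indegree[y] -= 1
            if indegree[y] == 0:
                ready.append(y)

    if len(order) != len(graph):
        return False  # some vertices were never freed, so there is a cycle

    # A Hamilton path must follow the sort, so check the n-1 consecutive pairs.
    return all(b in graph[a] for a, b in zip(order, order[1:]))
```

If the graph has a Hamilton path, its topological sort is unique, so it does not matter which valid ordering the sort happens to return.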
Question 5 (20): It was shown in lecture that the language 3-COLOR = {G: G is an undirected graph with a 3-coloring} is NP-complete. Consider the language 5-COLOR = {G: G is an undirected graph with a 5-coloring}.
- (a,10) Describe a log-space reduction from 3-COLOR to 5-COLOR.
  It is important to follow the definition of reduction carefully. We need a function f that inputs a graph G and outputs a graph H, such that H is 5-colorable iff G is 3-colorable. The easiest way to do this is to let H be G together with two new vertices, x and y. The edges of H are the edges of G together with new edges from each of x and y to each other vertex, including from x to y. This function is computable in log space: the output is a copy of G followed by the 2n+1 new edges, where n is the number of vertices of G, and these can be listed using a counter of O(log n) bits. If we can color G with colors 1, 2, and 3, then we can extend this to a coloring of H by making x's color 4 and y's color 5. If we have a 5-coloring of H, we can make a 3-coloring of G by removing x and y and all their edges. Since x and y are adjacent they have two different colors, and since each vertex of G was connected to both x and y, the H-coloring didn't use either x's color or y's color in coloring G, so it colored G using only the other three colors.
- (b,5) Complete the proof that 5-COLOR is NP-complete.
  Since we have reduced a known NP-complete language to 5-COLOR, the only thing left to do is to show that 5-COLOR is in NP. To do this we need only describe a poly-time nondeterministic procedure that can accept a graph G iff G is 5-colorable. This is easy -- our procedure assigns a color to each vertex and then checks each edge in G and accepts iff every edge connects vertices of different colors.
- (c,5) It follows from (b) that there is a log-space reduction from 5-COLOR to 3-COLOR. Describe how you would construct this reduction.
  The facts that 3-COLOR is NP-complete and 5-COLOR is in NP imply that this reduction exists, but I don't know of a good way to construct it directly. (That is, I don't know of a simple transformation on graphs that takes 5-colorable graphs to 3-colorable graphs and non-5-colorable graphs to non-3-colorable graphs.) So we construct it by going through the proofs. Because 5-COLOR is in NP, there is a poly-time NDTM whose language is 5-COLOR. By the proof of the Cook-Levin Theorem, for every size n there is a boolean CNF formula, on input variables representing a graph with n nodes, that is satisfiable iff the graph is in 5-COLOR. By the construction in lecture, we can build a graph that is 3-colorable iff this formula is satisfiable. Thus this eventual graph is 3-colorable iff the original graph was 5-colorable.
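The reduction from part (a) is simple enough to write out. The Python sketch below assumes an undirected graph given as a vertex list and a list of edge pairs, and assumes the names "x_new" and "y_new" do not already occur among the vertices; all function names are hypothetical. The brute-force `is_k_colorable` is exponential and is included only to check the reduction on tiny graphs.

```python
from itertools import product


def add_two_universal_vertices(vertices, edges):
    """Map G to H: add x and y, adjacent to each other and to every vertex of G."""
    x, y = "x_new", "y_new"
    new_vertices = list(vertices) + [x, y]
    new_edges = list(edges) + [(x, y)]
    for v in vertices:
        new_edges.append((x, v))
        new_edges.append((y, v))
    return new_vertices, new_edges


def is_k_colorable(vertices, edges, k):
    """Try every assignment of k colors (exponential time; small graphs only)."""
    for colors in product(range(k), repeat=len(vertices)):
        color = dict(zip(vertices, colors))
        if all(color[u] != color[v] for u, v in edges):
            return True
    return False
```

On a triangle the output is the complete graph on five vertices, which is 5-colorable; on the complete graph on four vertices the output is the complete graph on six vertices, which is not.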

Last modified 7 August 2003