(Revisions to Question 1 made on 4 May 2003.)
For each node, consider the subtree under it.
This has odd size. (Proof: by induction on the definition of no-only-child
binary trees. Such a tree is either a single node or a root node whose
children are each no-only-child binary trees. In the base case the size
is one, which is odd. In the inductive case it is one plus the sum of the
two subtree sizes, which is one plus two odd numbers, which is odd.)
We will search for a node whose subtree has size at least n/3 and at most
2n/3. We make a greedy search, starting at the root and taking the larger
child each time. The only way this can fail is if we have a subtree of size
a, with a > 2n/3, and jump to its larger child, with a subtree of size b,
where a ≤ 2b+1 and b < n/3. But b and n are both odd, so b < n/3 implies
n ≥ 3b+2, and then a ≤ 2b+1 < (2/3)(3b+2) ≤ 2n/3, contradicting a > 2n/3.
(This version reflects the use of the Lemma in the Theorem about CFLs. When the tree is divided along the edge, the subproblem containing the root must contain a node representing the removed subtree.)
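For concreteness, here is a minimal Python sketch of this greedy search (the Node class and function names are illustrative, not part of the problem statement; it assumes the tree has at least three nodes):

    # A no-only-child binary tree: a node is a leaf or has exactly two children.
    class Node:
        def __init__(self, left=None, right=None):
            self.left, self.right = left, right

    def size(t):
        # subtree sizes are always odd, by the induction above
        return 1 if t.left is None else 1 + size(t.left) + size(t.right)

    def middle_node(root):
        # greedy descent: move to the larger child until the current subtree
        # has size at most 2n/3; by the argument above it never drops below n/3
        n = size(root)
        v = root
        while 3 * size(v) > 2 * n:
            v = v.left if size(v.left) >= size(v.right) else v.right
        return v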
Now we must find a node with a subtree of size b, where b and n-b+1 are each at most 3n/4. We apply the same greedy strategy. It can only fail if we jump from a node with a subtree of size a to its larger child with a subtree of size b, where a is too big and b too small; since b is the larger child we have a ≤ 2b+1, and the worst case is a = 2b+1. So a > 3n/4 and n-b+1 > 3n/4. The latter implies b < n/4 + 1, which implies a ≤ 2b+1 < n/2 + 3. For a number to be greater than 3n/4 and less than n/2 + 3, clearly n must be small. Such an a exists for n=9 (namely a=7), but a check shows that none exists for n=10 or n=11, and for n ≥ 12 none can exist, since then 3n/4 ≥ n/2 + 3.
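The small cases can be checked mechanically; for instance, this throwaway Python loop enumerates the failure constraints above for n from 9 upward (illustrative only; it also checks even n, as the text does, even though tree sizes are odd):

    # Look for an odd a = 2b+1 (b an odd subtree size) with a > 3n/4 and
    # n - b + 1 > 3n/4, i.e. a failing jump as described above.
    for n in range(9, 40):
        bad = [(2 * b + 1, b) for b in range(1, n, 2)
               if 4 * (2 * b + 1) > 3 * n and 4 * (n - b + 1) > 3 * n]
        if bad:
            print(n, bad)    # prints only n = 9 with (a, b) = (7, 3)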
The same greedy strategy from Question 1 works directly. The root is above three of the target nodes, and any leaf is above at most one. As we move down the tree taking the branch with more target nodes under it, we cannot go from three directly to one or zero -- at some point we break the three targets into two on one side and one on the other, and at this point we have our node that is above exactly two. (If the node where this split occurred were a target node, we would violate the condition about no target being an ancestor of another.)
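The same idea in code (reusing the Node class from the first sketch; targets is a set of three nodes, none an ancestor of another -- names are illustrative):

    def count_targets(t, targets):
        if t is None:
            return 0
        return ((1 if t in targets else 0)
                + count_targets(t.left, targets)
                + count_targets(t.right, targets))

    def node_above_two(root, targets):
        v = root                                  # above all three targets
        while count_targets(v, targets) == 3:     # the split must eventually be 2/1
            left = count_targets(v.left, targets)
            v = v.left if left >= count_targets(v.right, targets) else v.right
        return v                                  # exactly two targets in v's subtree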
First we check the size of the entries that can arise, to make sure we can
operate on them much as we would on n-bit entries. An entry of the product is
the sum of n^(n-1) terms, each a product of n entries from the input
matrices. (If you don't believe this, prove it by induction on n.) Each of
these products of n n-bit entries is at most n^2 bits long. Adding up
n^(n-1) numbers of n^2 bits each can produce a number of at most
n^2 + log(n^(n-1)) = n^2 + (n-1)log n bits, which is polynomial in n.
(Any correct analysis showing an upper bound polynomial in n is good enough.)
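A quick numeric sanity check of this bound, using exact Python integers (n = 6 and the random matrices are arbitrary choices for illustration):

    import random

    n = 6
    mats = [[[random.getrandbits(n) for _ in range(n)] for _ in range(n)]
            for _ in range(n)]

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    P = mats[0]
    for M in mats[1:]:
        P = matmul(P, M)

    # largest entry bit-length vs. the bound n^2 + (n-1)*log n (rounded up)
    print(max(x.bit_length() for row in P for x in row),
          n * n + (n - 1) * n.bit_length())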
Now we compute the n-fold product by a binary tree, each node of which
multiplies two n by n matrices. The depth of the final circuit will thus
be log n times the depth needed to multiply two matrices. We claim that we
can multiply two matrices in ThC^0, and thus compute the entire
product in ThC^1.
If we multiply matrices A and B to get C, each entry of C is the sum of
n numbers, each the product of an A entry and a B entry. Multiplying each
of these pairs of entries is in ThC^0 by a result quoted in
lecture, and adding the n products together is an instance of the ITADD
problem, also known to be in ThC^0. Performing two ThC^0
operations in series is also ThC^0. The only other thing to check
is that the size of our circuit remains polynomial. But this is true because
there are only n-1 multiplications of two matrices, each of these has only
n^2 entries, and computing each entry involves only n integer
multiplications and one ITADD operation. A polynomial number of operations
each in ThC^0 must still be polynomial size.
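Here is a small Python sketch of the binary-tree arrangement (sequential code, of course, but the recursion depth of about log2(n) is exactly the source of the log n factor in the circuit depth):

    def matmul(A, B):
        n = len(A)
        return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    def iterated_product(mats):
        # multiply the two halves recursively, then combine the results
        if len(mats) == 1:
            return mats[0]
        mid = len(mats) // 2
        return matmul(iterated_product(mats[:mid]), iterated_product(mats[mid:]))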
We associate any sequence of n such matrices with a
levelled graph. The graph has n+1 levels of nodes, each with n nodes, and
the connections from level i-1 to level i are given by the matrix
A_i. That is, there is an edge from node x on level i-1 to node
y on level i iff the (x,y) entry of A_i is one.
To solve the boolean ITMATRIXMULT
problem in NL, we map the input matrices to
this levelled graph and ask the REACH question for node s on level 0 and node
t on level n. (Recall that the decision-problem version of boolean
ITMATRIXMULT also takes numbers s and t as input and returns the (s,t) entry
of the product.)
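To make the correspondence concrete, here is an illustrative Python sketch that builds the levelled graph from boolean matrices A_1..A_n and answers the REACH question by breadth-first search (an NL machine would not build the graph explicitly, of course; it just guesses the path one level at a time):

    from collections import deque

    def itmatrixmult_via_reach(mats, s, t):
        # node (i, x) is node x on level i; mats[i] connects level i to level i+1
        n = len(mats[0])
        edges = {(i, x): [(i + 1, y) for y in range(n) if mats[i][x][y] == 1]
                 for i in range(len(mats)) for x in range(n)}
        goal, frontier, seen = (len(mats), t), deque([(0, s)]), {(0, s)}
        while frontier:
            v = frontier.popleft()
            if v == goal:
                return 1          # the (s,t) entry of the boolean product is 1
            for w in edges.get(v, []):
                if w not in seen:
                    seen.add(w)
                    frontier.append(w)
        return 0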
To prove boolean ITMATRIXMULT to be NL-complete, we reduce LEVELLED-REACH
to it. We first add degree-0 nodes to the input graph so that it has n levels
of n nodes each, for some n. Then we simply construct the matrices from the
graph as above, choose s and t to match the nodes requested in the
LEVELLED-REACH problem, and submit the result to ITMATRIXMULT.
Here we show that we can compute the min-plus product of two n by n matrices
in AC^0. By the binary tree construction as in part (a), it follows
immediately that the ITMATRIXMULT problem is then in AC^1.
Computing an entry of the product means taking the minimum of n numbers,
each of them the sum of two input entries. Since we know we can add in
AC^0 (we defined an FO formula for addition much earlier in the course,
the carry-lookahead adder), all we need to do is show that we can
take the minimum of n numbers in AC^0.
Consider an input structure with input predicate B(i,j), meaning "bit i
of the j'th number is one". We want to express the predicate M(i), meaning
"bit i of the minimum is one", as a first-order formula using B. We can
successively express:

S(j,k), meaning "the j'th number is less than or equal to the k'th number"
(taking bit 0 to be the most significant bit, with m=n covering the case
where the two numbers are equal):
   ∃ m ∀ r [(r < m implies (B(r,j) iff B(r,k))) and
            ((m=n) or (not B(m,j) and B(m,k)))]

T(j), meaning "the j'th number is a minimum":
   ∀ k S(j,k)

and finally M(i) itself:
   ∃ j (T(j) and B(i,j))

Since the M predicate is in FO, there is an AC^0 circuit
calculating it from the matrix of entries for B.
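A direct Python transcription of these formulas, to see them in action (purely illustrative; B is an n-by-n bit matrix with B[i][j] = bit i of the j'th number, bit 0 most significant):

    def minimum_bits(B):
        n = len(B)

        def S(j, k):   # number j is less than or equal to number k
            return any(all(B[r][j] == B[r][k] for r in range(m)) and
                       (m == n or (B[m][j] == 0 and B[m][k] == 1))
                       for m in range(n + 1))

        def T(j):      # number j is a minimum
            return all(S(j, k) for k in range(n))

        def M(i):      # bit i of the minimum is one
            return any(T(j) and B[i][j] == 1 for j in range(n))

        return [1 if M(i) else 0 for i in range(n)]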
In the interests of getting this solution out quickly I'll omit this one, though I may get back to it.
First we show that SUCCINCT-REACH is in PSPACE, by showing it is in NPSPACE
and invoking Savitch's Theorem. With an NPSPACE machine we can apply the
blundering algorithm, maintaining a variable for a vertex, starting it at
s, guessing a successor, deterministically verifying that it is a successor
using the circuit, and continuing until we reach t or give up after
2^n tries.
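For intuition, here is what the deterministic search given by Savitch's Theorem looks like, written in Python against an edge(u, v) oracle that stands for evaluating the given circuit on two vertex names (the oracle and the encoding of vertices as n-bit integers are assumptions of this sketch):

    # Is there a path from u to v of length at most 2^k in the implicit
    # 2^n-vertex graph?  Each recursion level stores O(n) bits and the depth
    # is n, so the space used is O(n^2); the running time is exponential,
    # but that is fine for a PSPACE bound.
    def reach(u, v, k, n, edge):
        if k == 0:
            return u == v or edge(u, v)
        return any(reach(u, w, k - 1, n, edge) and reach(w, v, k - 1, n, edge)
                   for w in range(2 ** n))       # try every possible midpoint w

    def succinct_reach(s, t, n, edge):
        return reach(s, t, n, n, edge)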
To see that the problem is complete, consider any PSPACE problem and
consider its CompGraph for the given input.
This has a node for each configuration and an edge
from each configuration to its successor. The input is in the language iff
there is a path from the start configuration to the accepting configuration
in this graph. All we need to do is construct a circuit C so that the
SUCCINCT-REACH problem for C is exactly the REACH problem on this CompGraph.
But all this means is that the circuit must input two configuration names
and tell whether the second configuration is the successor of the first.
This means checking whether the tapes are the same except at the heads, and
whether the changes at the heads are exactly those given by the state table
of the Turing machine. A poly-size circuit can clearly do this -- in fact
an AC^0 circuit is good enough since the correctness of the
transition can be expressed as a first-order formula.
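As an illustration of the local check involved, here is a Python sketch for a toy one-tape machine, with a configuration written as a (state, head position, tape) triple and a transition table delta[(state, symbol)] = (new state, new symbol, move); the encoding is my own, not the circuit's actual input format:

    def is_successor(conf1, conf2, delta):
        (q1, h1, tape1), (q2, h2, tape2) = conf1, conf2
        if (q1, tape1[h1]) not in delta:
            return False                      # no move out of a halted configuration
        q, sym, move = delta[(q1, tape1[h1])]
        if q2 != q or h2 != h1 + (1 if move == 'R' else -1):
            return False
        # the tapes must agree everywhere except at the old head position,
        # where the newly written symbol must appear
        return (len(tape2) == len(tape1) and tape2[h1] == sym and
                all(tape2[i] == tape1[i] for i in range(len(tape1)) if i != h1))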
As [P] suggests, we use the hypothesis "NP contained in BPP" to get the
existence of a randomized algorithm that decides SAT most of the time but
is wrong with probability at most 2^(-n). (This uses the BPP amplification
result stated, but not proved, in lecture.) We now use this untrustworthy
SAT-tester to build a SAT-tester putting SAT in RP. Because RP is closed
downward under reductions, it will follow that RP contains, and is thus
equal to, NP.
We use the randomized SAT-tester to try to construct a satisfying
assignment for the input formula. This is similar to the homework problem
about 3-coloring a 3-colorable graph. We construct ever-longer partial
assignments to the variables (first assigning to x_1, then
to x_1 and x_2, etc.), using the SAT-tester each time
to confirm that our partial assignment can be extended to a full assignment.
(Given a formula and a partial assignment, it is easy to construct a single
formula that is satisfiable iff the partial assignment can be extended.)
If the SAT-tester gives us a correct answer each of the 2n or so times
we use it, we will construct a satisfying assignment. We can verify that
it is a satisfying assignment, and answer "yes" to the SAT question
on the formula in absolute confidence. If we don't have such an assignment,
we will answer "no".
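A possible concrete rendering of this procedure in Python (the CNF format and the parameter probably_sat, standing for the untrustworthy randomized tester, are choices made for this sketch):

    # Formulas are CNF: a list of clauses, each a list of nonzero ints in
    # DIMACS style (-3 means "not x_3").
    def restrict(cnf, lit):
        # set literal lit true: drop satisfied clauses, remove -lit elsewhere
        return [[l for l in c if l != -lit] for c in cnf if lit not in c]

    def rp_sat(cnf, probably_sat):
        variables = sorted({abs(l) for c in cnf for l in c})
        assignment, current = {}, cnf
        for x in variables:
            for lit in (x, -x):                # at most two tester calls per variable
                if probably_sat(restrict(current, lit)):
                    assignment[x] = (lit > 0)
                    current = restrict(current, lit)
                    break
            else:
                return "no"                    # tester claims no extension works
        # one-sided error: say "yes" only after actually checking the assignment
        ok = all(any((l > 0) == assignment[abs(l)] for l in c) for c in cnf)
        return "yes" if ok else "no"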
If the SAT-tester makes a mistake, this could lead us to a non-satisfying
assignment at the end, or a situation where we are told that our current
partial assignment cannot be extended. If this happens we will give up and
answer "no". All we need to show is that the probability of our making
an incorrect "no" answer is less than 1/2.
If the formula is satisfiable, we can make an incorrect "no" answer only
if the SAT-tester was wrong on one of our (at most) 2n calls to it. Let
E_i be the event that the i'th call is wrong. We know that the
probability of each E_i is at most 2^(-n). By the union
rule, the probability of the union of the E_i is at most the sum
of the individual probabilities, or 2n · 2^(-n), which is less than 1/2
for all n ≥ 5 (formulas with fewer variables can be decided by brute
force). Since this event includes all the cases where
we make an incorrect "no" answer, we have shown SAT to be in RP.