# CMPSCI 741: Computational Complexity

### Spring, 2004

This is the home page for CMPSCI 741, an advanced graduate course in computational complexity. My particular interest is going to be in circuit complexity, stressing its links to formal language theory and descriptive complexity. The exact content of the course will be developed in conjunction with the attendees.

The prerequisite for CMPSCI 741 is CMPSCI 601 or equivalent. CMPSCI 741 may be repeated for credit, as I expect to do somewhat different things from what Neil and I did last spring.

Students in CMPSCI 741 may well be interested in the lecture notes from the Summer 2000 Park City Math Institute undergraduate program in complexity theory, which I co-taught with Alexis Maciel. They may be found here. In particular, Advanced Lectures 1-5 are quite relevant to this course.

We decided during the first class to use a textbook, Introduction to Circuit Complexity -- A Uniform Approach by Heribert Vollmer (Springer-Verlag). Students should make their own arrangements to obtain this book.

#### Announcements/Blog (11 Apr):

• (11 Apr) Last week's lectures focused on the complexity class SAC1 and groupoid problems. We proved this class closed under complement, following Vollmer 4.3.2, and argued for a characterization of the class in terms of AuxPDA's (nondeterministic machines with log space and a stack that doesn't count for space).

With groupoids, we looked at two problems: G-ITMULT defined as {(G,w,t): word w can multiply to element t in groupoid G} and G-GEN defined as {(G,X,t): some word over X, a subset of G, can multiply to t in G}. G-ITMULT is complete for SAC1 and we showed it to be closely related to CFL parsing. G-GEN is complete for P. One reason G-GEN is a harder problem is that the word over X might be of exponential length in the size of G.
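The reason G-GEN stays in P despite the exponentially long witness word is that we can track reachable elements rather than words. A minimal Python sketch of this closure algorithm (the names and table format are my own, not from lecture):

```python
def g_gen(table, X, t):
    """Decide G-GEN: can some word over X multiply (under some
    parenthesization) to t in the groupoid G given by table?
    table[a][b] is the product a*b; elements are 0..|G|-1.
    We close X under the operation, so the outer loop runs at most
    |G| times -- polynomial time, even though a witness word may be
    exponentially long."""
    reachable = set(X)
    changed = True
    while changed:
        changed = False
        for x in list(reachable):
            for y in list(reachable):
                p = table[x][y]
                if p not in reachable:
                    reachable.add(p)
                    changed = True
    return t in reachable
```

For example, with the table of addition mod 3 and X = {1}, every element is reachable.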

We finished last Wednesday by discussing special cases of G-GEN and G-ITMULT for groups and monoids. This week I'll continue along these lines, for example showing that we can create a power table for a monoid in FO[log log n] and with this solve G-GEN and G-ITMULT for some groups.

• (24 Mar) It's time to focus in on topics for final presentations, as we discussed Monday. Audrey is planning to do something on the Wagner-McKenzie work on circuits over sets of natural numbers. Louis is going to do something about derandomization around logspace, working from the Saks survey paper. That leaves Richard and Alex to be heard from...

• (24 Mar) I'm away in Dagstuhl next week, though I'll be on email. The next class will be a week from Monday, when I plan to talk about the class SAC1, including some results on it from Vollmer chapter 4 and some other stuff about groupoids. I mentioned the nondeterministic groupoid multiplication problem today -- for a groupoid G (an arbitrary binary operation, not necessarily associative), the NMULTG problem is to input a string w in G* and an element t, and determine whether it is possible by parenthesizing w to make it multiply to t.

You might want to try proving that this problem is in SAC1. It's somewhat similar to parsing a fixed context-free language. If you get that, look at the version of the problem where the table for G is part of the input. (Reference: Bedard-Lemieux-McKenzie)
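If you want to check your intuition first, here is the dynamic-programming idea behind the problem in executable form -- a CKY-style table over substrings, which is exactly the similarity to CFL parsing mentioned above (a sketch with my own naming, for the version where the table for G is part of the input):

```python
def nmult(table, w, t):
    """Does some parenthesization of the word w multiply to t in the
    groupoid given by table (an arbitrary binary operation)?
    reach[i][j] holds every element obtainable from w[i:j], filled in
    CKY style by splitting at each k -- the same shape as parsing a
    context-free language."""
    n = len(w)
    reach = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(n):
        reach[i][i + 1].add(w[i])
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for x in reach[i][k]:
                    for y in reach[k][j]:
                        reach[i][j].add(table[x][y])
    return t in reach[0][n]
```

Subtraction mod 3 makes a handy non-associative test case: (2-1)-1 = 0 while 2-(1-1) = 2.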

• (24 Mar) In this week's lectures we talked mostly about circuit lower bounds. On Monday we covered the lower bounds from Vollmer 3.1 and 3.2, and today I mostly talked about the exponential lower bound for programs over the group S3 computing the AND function. (This is in my 1989 JCSS paper and in Barrington-Straubing-Therien 1990.) We proved that S3 programs map to depth-2 Mod-3-Mod-2 circuits, and to sums of linear characters over Z3. (A linear character is (-1) raised to a Z2 linear form.)
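To make the character sums concrete, here is a toy brute force (my own illustration, not the lower-bound proof itself): over two variables, it finds the minimum number of linear characters, counted with nonzero Z3 coefficients, needed to sum to the AND function.

```python
from itertools import product

def char_vec(a, n):
    # the linear character chi_a(x) = (-1)^(a.x mod 2), tabulated over {0,1}^n
    return tuple((-1) ** (sum(ai * xi for ai, xi in zip(a, x)) % 2)
                 for x in product((0, 1), repeat=n))

def min_weight(target, n):
    """Minimum weight (number of nonzero Z3 coefficients) of a sum of
    linear characters that equals target mod 3 at every input.
    Brute force over all coefficient vectors in Z3^(2^n) -- fine for n = 2."""
    chars = [char_vec(a, n) for a in product((0, 1), repeat=n)]
    best = None
    for coeffs in product((0, 1, 2), repeat=len(chars)):
        vals = tuple(sum(c * ch[i] for c, ch in zip(coeffs, chars)) % 3
                     for i in range(2 ** n))
        if vals == target:
            w = sum(1 for c in coeffs if c != 0)
            best = w if best is None else min(best, w)
    return best

AND2 = (0, 0, 0, 1)   # AND on two bits, inputs ordered 00, 01, 10, 11
```

Since the ±1 character matrix is a Hadamard matrix and hence invertible mod 3, the representation is unique; the brute force reports weight 4 = 2^2 for AND on two bits.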

At the end I talked about a few ways to extend this result. In [BST90] it is shown how to prove lower bounds for the weight (number of nonzero terms) of a sum of linear characters over any finite field. This gives rise to lower bounds on the length of programs over certain solvable groups such as A4, but not S4 or T3. (The latter is the monoid of all functions from a three-element set to itself.)

It would be nice to extend the linear character lower bounds to rings as well as fields, but without the ability to convert the representation into zero-one values, as we can over a field, this doesn't seem possible. I've done some work on polynomials over Z6 -- see Barrington-Beigel-Rudich 1994 and Tardos-Barrington 1998.

I also mentioned work, not yet successful, towards a lower bound on the number of quadratic characters over Z3 to add to the AND function.

• (18 Mar) In the two lectures last week we gave a fairly complete proof of Ruzzo's circuit/alternation theorem, somewhat following Chapter 2 of Vollmer. If i≥1, then ACi is the set of languages of ATM's with O(log n) space and O(logi n) alternations, and NCi is the same thing with "time" in place of "alternations".

In both cases the circuits can be defined to be FO-uniform, but in the case of NC1 only we need "UE uniformity", where the uniformity predicates in FO include {(s,t,p): s and t are gate numbers, p is in {L,R}≤log n, and taking path p from s goes to t}.

FO-uniform AC0 is the set of languages of ATM's with O(log n) time and O(1) alternations, also known as the log-time hierarchy LH. It is also equal to FO itself, where FO has either the BIT predicate or both + and *.

How much of this have we proved? All of the i≥1 part, and that LH equals FO-uniform AC0. That FO equals LH follows from the fact that DLOGTIME is contained in FO -- we haven't proved this but it's not hard given the argument that LOGBCOUNT is in FO, which we did do -- and that each of the atomic predicates of FO (with BIT) is in DLOGTIME. For this last, remember that the arguments to an atomic predicate are only O(log n) bits long and thus the DLOGTIME machine has time to look at all of them. Both classes are pretty clearly closed under FO quantification.

Next week I'd like to do chapter 3, which given that we've already done Smolensky's Theorem shouldn't take too long. Note that I will be out of town and off of email for tomorrow (Fri 19 Mar) through the weekend, and I will be away the week of 29 March through 2 April at Dagstuhl. I should thus have a preliminary conversation with each of you next week about a final project.

• (8 Mar) Last Wednesday, 4 March, we cleaned up the mess from Monday and started in on Vollmer's chapter 2. We skipped his technical argument about simulating TM's of time O(t) with oblivious TM's of time O(t log t), noting that you don't need this to get circuits of size O(t^4). Today we'll start on the alternation-circuit theorem, which we did in 601 last year but which we'll now do more carefully.

• (2 Mar) My apologies for making something of a mess of Monday's lecture! I wanted to prove two results: that you can count up to poly-log many ones in an n-bit string in AC0, and Lupanov's theorem that any boolean function on n variables has a circuit of size 2n/n times (1 + o(1)).

For the first result I got hung up on the primes for the hashing having O(log n) bits. There is a correct argument in PCMI advanced lecture 7:

• The language POLYLOGBCOUNT (actually, each language LOGkBCOUNT) is in AC0 by repeated application of LOGBCOUNT and LOGITADD, each of which is in AC0. To add up logk+1 n bits, you divide the string into log n pieces, use LOGkBCOUNT to count each piece, and add up these log n sums with LOGITADD.

• We need to find a prime p that perfectly hashes the t locations of the ones, where t is O(logj n). There are (t choose 2) = O(log2j n) distances between pairs of positions, and each of these is at most n and can have at most log n prime divisors. So the number of primes that will not work is at most O(log2j+1 n), and there must be a good prime that is O(log2j+2 n). We hash to the string whose length is the least prime that works, and then use LOG2j+2BCOUNT to count the ones in the new string.
• Remember that I'd like you to write an FO formula (with BIT) for LOGBCOUNT.
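The perfect-hashing step in the second bullet can be illustrated numerically (a sketch with my own naming; the real content, of course, is that a circuit can do this in constant depth):

```python
def good_prime(positions):
    """Least prime p that hashes the given positions injectively mod p,
    i.e., p divides none of their pairwise differences.  The counting
    argument above guarantees a polylog-size prime exists when the
    number of positions is polylog."""
    def is_prime(m):
        return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))
    p = 2
    while True:
        if is_prime(p) and len({q % p for q in positions}) == len(positions):
            return p
        p += 1

def hash_string(bits):
    # map an n-bit string with t ones down to a prime-length string
    # with the same number of ones
    ones = [i for i, b in enumerate(bits) if b]
    p = good_prime(ones)
    new = [0] * p
    for i in ones:
        new[i % p] = 1   # the hash is injective on the ones, so the count survives
    return new
```

The hashed string has prime length polylog in n, so the remaining count is a small LOG-style BCOUNT instance.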

For the other problem, I got one part of Vollmer's argument backwards. We want 2n-m functions on m variables, not 2m functions on n-m variables. 2n-m seems like a lot, but remember that the total number of functions on m variables is 22m, which is much bigger than 2n-m if m is sqrt(n).
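Spelled out (my arithmetic, with the same choice of m): taking $m = \sqrt{n}$,

```latex
2^{n-m} \;\le\; 2^{n} \;=\; 2^{2^{\log n}} \;\ll\; 2^{2^{\sqrt{n}}} \;=\; 2^{2^{m}},
```

since $\sqrt{n}$ eventually dwarfs $\log n$ -- so there are far more functions on $m$ variables than the $2^{n-m}$ subfunctions we need.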

I showed in lecture that no EXACTLY-k function is in AC0 unless k(n) is either polylog or n - polylog for each n. This is because a depth d poly-size circuit for such a function would lead to a subexponential depth d circuit for PARITY, which we know doesn't exist.

I mentioned that the symmetric functions in AC0 have been characterized. If f is a symmetric function, and for infinitely many n there is a k, not polylog or n - polylog, such that f on inputs of k ones differs from f on k+1 ones, then f is not in AC0. What I don't have, that I'd really like, is a direct proof that a symmetric function of this type in AC0 would force the existence of a superpolylog n where majority (or parity) on n variables was in AC0. This might be a good problem for one of you to work on.

Tomorrow, after cleaning up the mess I left you with Monday, we'll start in on Chapter 2 of Vollmer, relating circuits to TM's and maybe starting on a review of the proof of the circuit/alternation theorem.

• (26 Feb) This week we completed the reductions among TC0 complete problems, showing that ITADD, MULT, MAJ, BCOUNT, UBCOUNT, and SORTING are all FO-uniformly AC0 reducible to each other, and that ITMULT is P-uniformly reducible to the others. We still don't have a rigorous definition of FO-uniform reductions, of course. I suggested as an exercise that you write an FO definition of the problem LOGBCOUNT, the restriction of BCOUNT where all but the last log n bits are 0. Here the BIT predicate is more directly useful than + and *.

For the P-uniform reduction I followed Vollmer 1.4 (which mostly follows BCH), showing that with power tables for polynomially many short primes (clearly computable in L, actually computable in FO by Hesse) we can (a) do ITMULT with input and output in Chinese Remainder notation (CRR), (b) translate an n-bit number from binary to CRR, and (c) translate from CRR to binary. For (c), though, we also require the binary for the product M of the primes.
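At the level of ordinary arithmetic (ignoring the circuit-uniformity issues that are the real point), steps (a)-(c) look like this -- a sketch with my own function names:

```python
from math import prod

def to_crr(x, primes):
    # (b) binary -> CRR: the residue of x modulo each short prime
    return [x % p for p in primes]

def crr_itmult(vectors, primes):
    # (a) iterated multiplication, done independently mod each prime
    out = [1] * len(primes)
    for v in vectors:
        out = [(a * b) % p for a, b, p in zip(out, v, primes)]
    return out

def from_crr(residues, primes):
    # (c) CRR -> binary by Chinese remaindering; note this uses the
    # product M of the primes, which the circuit version needs as input
    M = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        Mi = M // p
        x += r * Mi * pow(Mi, -1, p)   # pow(Mi, -1, p) = inverse of Mi mod p
    return x % M
```

The answer is correct as long as the true product stays below M, which is why polynomially many short primes suffice for ITMULT on n-bit inputs.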

As a digression Wednesday, I used Kolmogorov complexity to prove a weak form of the Prime Number Theorem. Let σ(n) be the number of primes less than n. The PNT says that σ(n) = n/(ln n)[1 + o(1)] -- we proved σ(n) = Ω(n/log2 n) and then σ(n) = Ω(n/((log n)(log log n)2)). This argument, with some variations, is in Advanced Lecture 7 or thereabouts in the PCMI notes.
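For a quick numerical sanity check of these bounds (a sieve, using the post's σ for the prime-counting function):

```python
from math import log

def sigma(n):
    # number of primes less than n, by the sieve of Eratosthenes
    sieve = [i >= 2 for i in range(n)]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n, p):
                sieve[m] = False
    return sum(sieve)

# sigma(n) against the PNT estimate n/ln(n) and the much weaker n/(log2 n)^2 bound
for n in (10 ** 3, 10 ** 4, 10 ** 5):
    print(n, sigma(n), round(n / log(n)), round(n / log(n, 2) ** 2))
```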

Next week we'll look at Vollmer 1.5 (the circuit size needed for arbitrary boolean functions is Θ(2n/n) rather than the obvious upper bound of n2n) and characterize the symmetric functions in AC0.

• (20 Feb) In last Wednesday's class I presented some of the various reductions among TC0 problems from Vollmer 1.3 and 1.4. I realize I wasn't particularly clear about reducing ITADD to LOGITADD and BCOUNT. Suppose I have an n by n array ai,j representing the n numbers ai = sum of ai,j2j. I first use BCOUNT on each column to get a number sk = sum of sk,j2j. Then I assemble an array bi,j of log n numbers bi, each of n + log n bits. I do this by copying sk,j into bj,k+j for each k and j.
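Here is that bookkeeping as executable arithmetic (my own variable names), checking that the log n assembled numbers really carry the same sum:

```python
def itadd_reduce(a, nbits):
    """Reduce ITADD on numbers a (each < 2**nbits) to column counts
    plus a short iterated addition: returns about log n numbers b_j
    with sum(b) == sum(a).  Bit j of the column count s_k is placed
    at position k + j of b_j."""
    s = [sum((x >> k) & 1 for x in a) for k in range(nbits)]  # BCOUNT per column k
    jmax = max(s, default=0).bit_length()                     # counts have ~log n bits
    b = [0] * jmax
    for k in range(nbits):
        for j in range(jmax):
            if (s[k] >> j) & 1:
                b[j] |= 1 << (k + j)
    return b
```

Summing the returned numbers is then a LOGITADD instance, since there are only about log n of them.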

Next week we'll finish the TC0 reductions, including the non-uniform TC0 circuits for ITMULT, and then do the result in Vollmer 1.5 about the asymptotically optimal-size circuits for arbitrary boolean functions.

• (11 Feb) In this week's two classes I presented Smolensky's Theorem, together with the original probabilistic construction (the lower bound for Ramsey's Theorem by Erdos) on Monday as background. At least two of you now have the Vollmer book and I'd like to turn to following that more closely. You can read his version of Smolensky's Theorem (including the extension that AND-OR-MODp cannot do MODr, where p is any prime and r is not a power of p, whereas we only did p=3 and r=2 in class) in Chapter 3. Next Wednesday I want to present the constant-depth reductions in sections 1.3 and 1.4, such as those involving MAJORITY.

• (5 Feb) In the third class yesterday I presented Barrington's Theorem, that given a fan-in two circuit of depth d, and an input in {0,1}n, we can construct an instance of ITPROD(S5) of length 4d that has the same answer as the circuit. This shows, pending our exact definitions, that ITPROD(S5) is complete for NC1. (Recall that S5 is the group of all permutations of a five-element set.)
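The heart of the construction is the commutator trick for an AND gate, which is easy to check by machine (a sketch; representing permutations of {0,...,4} as tuples, and the search for a suitable second 5-cycle, are my own code):

```python
from itertools import permutations

def compose(p, q):
    # apply q first, then p: (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(5))

def inverse(p):
    inv = [0] * 5
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def is_5cycle(p):
    # a permutation of 5 points is a 5-cycle iff the orbit of 0 has size 5
    orbit, i = set(), 0
    while i not in orbit:
        orbit.add(i)
        i = p[i]
    return len(orbit) == 5

e = (0, 1, 2, 3, 4)           # identity
a = (1, 2, 3, 4, 0)           # the 5-cycle (0 1 2 3 4)
# any 5-cycle b whose commutator with a is again a 5-cycle will do
b = next(p for p in permutations(range(5)) if is_5cycle(p) and
         is_5cycle(compose(compose(a, p), compose(inverse(a), inverse(p)))))

def and_gadget(x, y):
    """alpha beta alpha^-1 beta^-1, with alpha = a iff x and beta = b iff y:
    the identity unless both bits are 1, and a 5-cycle when they are."""
    alpha = a if x else e
    beta = b if y else e
    return compose(compose(alpha, beta),
                   compose(inverse(alpha), inverse(beta)))
```

Each gate quadruples the program length in this way, which is where the 4d bound comes from.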

Next time we'll prove the Furst-Saxe-Sipser-Ajtai theorem that the parity language is not in AC0, even non-uniformly. Actually we'll prove the stronger Smolensky's Theorem that the parity language requires exponential size for constant-depth circuits with AND, OR, and MOD-3 gates. This result can be found in lecture A10 of the PCMI notes.

I left you with two exercises:

• For each d ≥ 2, find the smallest depth-d AND/OR/NOT circuit you can for the parity language.
• Find a lower bound on the size of depth-2 circuits for this language, matching your upper bound up to constants.

• (3 Feb) In the second class we mostly talked about the problems ITPROD(M), for a fixed finite monoid M, and ITPROD-MONOID, where the monoid is part of the input. We observed that ITPROD(M) is in DSPACE(1) and in NC1, and that ITPROD-MONOID is in DSPACE(log n) and in AC1. We showed that any regular language reduces to an ITPROD(M) problem for some M, and began the argument that there is an M such that ITPROD(M) is complete for NC1.

• (28 Jan) The four people who attended today are the same as those registered (once Alex registers). If anyone else is interested in joining the class please email me soon.

We talked about the models of computation we'll be using, particularly circuits (classes P, NC1, and AC0) and first-order logic (classes FO, FO+LFP). We talked about circuits that are trees (fan-out one), proving that AC0 and NC1 are the same with trees and circuits, and showing that poly-size trees (non-uniformly) are equivalent to NC1. Finally we looked at circuits for regular languages. I asserted that any regular language is in NC1. I defined the related problem ITMULT(M) for a finite monoid M, and asserted both that every regular language reduces to ITMULT(M) for some M, and that ITMULT(M) is in NC1. Next time we'll prove these things.
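As a preview of those proofs, here is the shape of the reduction for one regular language -- strings over {a,b} with an even number of a's, whose relevant monoid is just Z2 -- together with a balanced-tree evaluation mirroring the log-depth circuit (a sketch with my own names):

```python
def itmult_balanced(elems, op, identity):
    # evaluate an iterated monoid product as a balanced binary tree,
    # the shape of the NC1 circuit for ITMULT(M)
    if not elems:
        return identity
    if len(elems) == 1:
        return elems[0]
    mid = len(elems) // 2
    return op(itmult_balanced(elems[:mid], op, identity),
              itmult_balanced(elems[mid:], op, identity))

# the reduction: map each letter to its effect in the monoid (Z2 under +)
LETTER = {'a': 1, 'b': 0}

def even_as(w):
    product = itmult_balanced([LETTER[c] for c in w],
                              lambda x, y: (x + y) % 2, 0)
    return product == 0   # accept iff the product is the identity
```

The fold has depth log n in the length of the input, which is the NC1 upper bound asserted above.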