Edits in orange made 28 October 2005.

Edits in green made 30 October 2005. (Minor clarifications only.)

Edit in purple made 31 October 2005 (also minor).

Edits in pink made 4 November 2005 (more significant).

There are five questions for
100 total points. Most are based on
lectures 12-15, and thus on Chapter 6 of the Adler notes. Question 1
is based on lectures 8-11 (Chapter 5). Many of these problems, like much
of Adler's Chapter 6, are taken from *Randomized Algorithms* by
Motwani and Raghavan.

Students are responsible for understanding and following the academic honesty policies indicated on the course main page.

Problem 3.1 (20): Kleene's Theorem says that a language A ⊆ Σ^{*} is the language of an NFA or DFA iff it is denoted by a **regular expression**. When I present this in CMPSCI 250, I normally prove that NFA's can be simulated by regular expressions using the **state elimination** method, which is better for hand calculation. Here you'll describe and analyze another method, using dynamic programming.

Let N be an NFA with n states, called {1,...,n}. Given any states s and t and any number i with 0 ≤ i ≤ n, we define the language L(s,t,i) to be the set of strings that could be read by N, starting in state s, ending in state t, and using no **intermediate state** numbered greater than i (recall the definition of intermediate state from the Floyd-Warshall algorithm).

Show that using dynamic programming, we can calculate regular expressions for each language L(s,t,i) using a number of regular-expression operations that is polynomial in n. State and justify a bound on how long the regular expressions might be, in terms of n. (Originally said "time polynomial in n".)

(A regular expression is a letter or ∅, the union or concatenation of two regular expressions, or the star of a regular expression. The letters and ∅ are defined to have length 1. If R has length r and S has length s, then R+S and RS are each defined to have length r+s+1, and R^{*} is defined to have length r+1.)

Problem 3.2 (30): Consider a full ternary tree of height h, whose 3^{h} leaves are each labeled with 0 or 1. We want to evaluate this tree in the following way -- each internal node is to be given a label that is the **majority element** of the labels of its three children.

- (a,15) Prove that given any deterministic algorithm that correctly evaluates the root node of the tree, there exists an assignment of 0's and 1's to the leaves that requires the algorithm to look at all n = 3^{h} leaves in the worst case. (Hint: Describe how, as an **adversary** of the algorithm, you could give consistent answers to the algorithm's queries about the leaves that leave the outcome in doubt until the last query.)
- (b,15) Here is a randomized algorithm for evaluating the root node of the tree. Given any internal node, pick two of its three children uniformly at random and evaluate them recursively. Then, only if these nodes disagree, evaluate the third child. Prove that given any assignment to the leaves, the expected number of leaves queried by this algorithm is less than n^{0.9}.
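If you want to experiment with the randomized algorithm of Problem 3.2(b) before proving anything, it can be sketched as below. This is only an illustration, not part of the required proof. It assumes the leaves are stored left to right in a Python list of length 3^{h}, and the names `majority`, `evaluate_all`, and `evaluate_random` are hypothetical.

```python
import random

def majority(a, b, c):
    """Return the majority element of three 0/1 labels."""
    return 1 if a + b + c >= 2 else 0

def evaluate_all(leaves):
    """Evaluate the root by reading every leaf (reference answer)."""
    level = list(leaves)
    while len(level) > 1:
        # Collapse each consecutive group of three labels into its majority.
        level = [majority(*level[i:i + 3]) for i in range(0, len(level), 3)]
    return level[0]

def evaluate_random(leaves, lo, hi, rng):
    """Return (label, leaves_queried) for the subtree over leaves[lo:hi]."""
    if hi - lo == 1:
        return leaves[lo], 1          # a leaf costs exactly one query
    third = (hi - lo) // 3
    children = [(lo + j * third, lo + (j + 1) * third) for j in range(3)]
    rng.shuffle(children)             # first two entries form a uniform random pair
    v1, q1 = evaluate_random(leaves, *children[0], rng)
    v2, q2 = evaluate_random(leaves, *children[1], rng)
    if v1 == v2:
        return v1, q1 + q2            # two agreeing children already form a majority
    # The first two disagree, so the third child decides the majority.
    v3, q3 = evaluate_random(leaves, *children[2], rng)
    return v3, q1 + q2 + q3
```

Calling `evaluate_random(leaves, 0, len(leaves), random.Random())` many times on a fixed list and averaging the second component gives an empirical estimate of the expected number of queries, which you can compare against n^{0.9}.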

Problem 3.3 (20): If n is any positive integer, φ(n) is the number of elements of the group **Z**_{n}^{*}, which is the product of the numbers p^{e}-p^{e-1} for every maximal prime-power factor p^{e} of n. If n is prime, φ(n) = n-1, and if n = pq with p and q distinct primes, φ(n) = (p-1)(q-1). (The definition of φ(n) originally posted was incorrect in the case where n is not the product of *distinct* primes.)

- (a,5) Show that if we are given both n and φ(n) and if there is any prime p such that p^{2} divides n (that is, if n is not the product of distinct primes), then we can find at least two nontrivial factors in deterministic polynomial (in log n) time.
- (b,5) Show that if n = pq with p and q prime, and you are given both n and φ(n), we can factor n completely in deterministic polynomial time.
- (c,10) Suppose that you are given n and φ(n), that φ(n) ≠ n-1, and that the algorithms of parts (a) and (b) fail, so you know that n is the product of at least three distinct primes. Let φ(n) = 2^{r}s with s odd. Describe a polynomial-time randomized algorithm to factor n. (Hint: Pick a random a and look at the numbers a^{s}, a^{2s}, a^{4s},..., a^{φ(n)}. Show that there is a good chance that one of these numbers will help you to find a nontrivial factor of n. You will probably find it useful to quote a specific fact from Lecture #14.)
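To generate small test inputs (pairs n, φ(n)) for Problem 3.3, the definition of φ(n) above can be sketched directly in code. The function names here are hypothetical, and the factoring step uses plain trial division, which takes time exponential in log n. It is therefore useful only for sanity checks on small n and is not an answer to any part of the problem.

```python
from math import gcd

def prime_power_factors(n):
    """Return {p: e} for the maximal prime powers p^e dividing n (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever remains is a single prime
        factors[n] = factors.get(n, 0) + 1
    return factors

def phi(n):
    """Product of p^e - p^(e-1) over the maximal prime-power factors of n."""
    result = 1
    for p, e in prime_power_factors(n).items():
        result *= p**e - p**(e - 1)
    return result

def phi_by_counting(n):
    """Size of Z_n^*, counted directly: residues in 1..n coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)
```

The two functions should agree on every positive n, which is a quick check that the product formula matches the group-theoretic definition, including the case where n is not a product of distinct primes (for example n = 9 or n = 12).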

Problem 3.4 (20): Here are two applications of Chernoff bounds -- you may use the results quoted in the Adler text without proof. In each part ε is a fixed positive real number, to be treated as a constant with respect to n.

- (a,10) Find a number c such that you can prove that at most ε2^{n} binary strings of length n have more than (n/2) + c√n ones. Express your c in terms of ε and n. (Originally said "fewer than ε2^{n}".)
- (b,10) Let k be any positive integer and suppose that you have a Monte Carlo algorithm that decides whether a string x of length n is in a language A, with success probability at least (1/2) + n^{-k}. Find a polynomially-bounded function f(n), in terms of ε, n, and k, such that the probability that the majority of f(n) trials of this Monte Carlo algorithm is correct is at least 1 - ε.
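A candidate answer to either part of Problem 3.4 can be checked numerically for small n. The sketch below computes the two quantities exactly from binomial coefficients, so it says nothing about how to choose c or f(n). The function names are hypothetical, and the second function assumes independent trials that each succeed with probability exactly p.

```python
from math import comb, sqrt

def fraction_with_many_ones(n, c):
    """Fraction of the 2^n binary strings of length n with more than n/2 + c*sqrt(n) ones."""
    threshold = n / 2 + c * sqrt(n)
    count = sum(comb(n, k) for k in range(n + 1) if k > threshold)
    return count / 2**n

def majority_success_probability(p, m):
    """Probability that more than half of m independent trials succeed,
    when each trial succeeds with probability p."""
    return sum(comb(m, k) * p**k * (1 - p)**(m - k)
               for k in range(m + 1) if 2 * k > m)
```

For part (a), compare `fraction_with_many_ones(n, c)` with ε for your proposed c. For part (b), compare `majority_success_probability(0.5 + n**(-k), f_n)` with 1 - ε, where `f_n` is the value of your proposed f(n).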

Problem 3.5 (10): (from CLRS) We have n oil wells in a square area, each with an x-coordinate and a y-coordinate. We are going to build an east-west pipeline across the region, and connect each well to the pipeline by a north-south pipe. How can we choose a y-coordinate for the east-west pipeline that will minimize the total length of the north-south pipes? Describe an algorithm to do so and analyze its running time. Argue that no other algorithm can be asymptotically faster than yours.
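The quantity to be minimized in Problem 3.5 can be written down as a short function, which may help in testing a proposed algorithm on small inputs. This is a sketch under the assumption that the wells are given as a list of (x, y) pairs. The names `total_spur_length` and `best_of` are hypothetical, and the brute-force comparison is only a checking aid, not the algorithm the problem asks for.

```python
def total_spur_length(wells, pipeline_y):
    """Total length of the north-south pipes when the east-west pipeline is at pipeline_y.

    Only the y-coordinates matter: each well at (x, y) needs a pipe of length |y - pipeline_y|.
    """
    return sum(abs(y - pipeline_y) for _, y in wells)

def best_of(wells, candidates):
    """Return the candidate y-coordinate with the smallest total pipe length."""
    return min(candidates, key=lambda y: total_spur_length(wells, y))
```

For example, with wells at (0,1), (5,4), and (2,10), a pipeline at y = 4 needs 3 + 0 + 6 = 9 units of north-south pipe, while one at y = 0 needs 1 + 4 + 10 = 15.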

Last modified 4 November 2005