# Solutions to Practice Final Exam for CMPSCI 611, Fall 2005

### Directions:

• Answer the problems on the exam pages.
• There are seven problems for 100 total points. Probable scale is A=90, B=60.
• Problems 1-4 are true/false with justification. Five points for the correct boolean answer, up to five for justification. The truth of the answer should not depend on unproved assumptions such as P ≠ NP, except as explicitly specified in the statement.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.

```
Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 15 points
Q6: 25 points
Q7: 20 points
```

Question text in black, answers in blue.

Correction in orange made 20 Dec 2005.

• Question 1 (10): (True or False with Justification) If any integer programming optimization problem can be converted in polynomial time to an equivalent linear programming problem, then P = NP.

TRUE. Integer programming is NP-hard, because we can take the NP-complete VERTEX-COVER problem and create an integer program whose minimum value is the size of the smallest vertex cover. (Make a variable xv for each vertex v, restricted to be 0 or 1, make a constraint xu + xv ≥ 1 for each edge (u,v), and minimize x1 + ... + xn.)
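As a sanity check of this reduction (not part of the exam solution), the integer program can be evaluated by brute-force enumeration of 0/1 assignments on a tiny graph; this is exponential and purely illustrative, and the function name is ours:

```python
from itertools import product

def min_vertex_cover_size(n, edges):
    """Minimum value of the integer program: one 0/1 variable per
    vertex, constraint x_u + x_v >= 1 per edge, minimize the sum."""
    best = n
    for x in product((0, 1), repeat=n):                  # all 0/1 assignments
        if all(x[u] + x[v] >= 1 for (u, v) in edges):    # every edge covered
            best = min(best, sum(x))                     # objective value
    return best

# A 4-cycle: the smallest vertex cover has size 2.
print(min_vertex_cover_size(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # -> 2
```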

If we could reduce integer programming to an equivalent linear programming problem in polynomial time, we could solve the latter problem in time polynomial in its size by one of the P algorithms for LP that we didn't present in lecture. The size of the linear program must be only polynomial in the size of the integer program because the poly-time conversion algorithm only has time to output a linear program of polynomial size.

• Question 2 (10): (True or False with Justification) There is a polynomial-time randomized algorithm for the CLIQUE decision problem that is always correct on (G,k) if there is no clique of size k and has a positive probability of being correct if there is a clique of size k. (Language slightly clarified from original posting.)

TRUE. The algorithm chooses a uniformly random set of k vertices and says "yes" iff these vertices form a clique. This is clearly polynomial time as there are only (k choose 2) edges to check. If there is no clique of size k, the set chosen by the algorithm will certainly not be a clique, and the algorithm will always correctly say "no". If there is a clique of size k, there is a small (at least 1/(n choose k)) chance that the algorithm will choose exactly those vertices and correctly say "yes". So the algorithm meets the given specifications, though its success probability in the worst case is so small as to make it practically useless.
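A sketch of this one-sided algorithm, assuming the graph is given as an adjacency matrix (the representation and names are ours):

```python
import random

def random_clique_check(adj, k):
    """Say "yes" iff a uniformly random k-subset of vertices is a
    clique. "Yes" answers are always correct; "no" answers may err."""
    n = len(adj)
    picks = random.sample(range(n), k)          # uniform k-subset, no repeats
    return all(adj[u][v]                        # every pair must be an edge
               for i, u in enumerate(picks) for v in picks[i + 1:])

K4 = [[i != j for j in range(4)] for i in range(4)]   # complete graph on 4 vertices
print(random_clique_check(K4, 3))               # -> True (every triple is a clique)
```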

• Question 3 (10): (True or False with Justification) Suppose that A is a polynomial-time randomized algorithm for Problem X, whose "yes" answers are always correct, and that on any member of X, A answers "yes" with probability at least 1/n². Then there is a polynomial-time randomized decision procedure B for X that is correct with probability at least 3/4 on any input, and B may be built from A without any unproven assumptions.

TRUE. B will run A independently k times, where k will be computed below, and answer "yes" iff at least one of the A-answers is "yes". If the correct answer is "no", then each A run will answer "no" and B will answer "no" -- in this case B is correct with probability 1. If the correct answer is "yes", the probability that each run of A answers "no" is at most 1 - 1/n². The probability that k consecutive runs of A all answer "no" is at most (1 - 1/n²)^k. This is the only situation in which B is incorrect.

So it suffices to pick k so that (1 - 1/n²)^k is at most 1/4. Remember that (1 - 1/n²)^(n²) is very close to 1/e. So setting k to be 2n², we get a probability of about 1/e², which is less than 1/4 because e is about 2.71828, greater than 2.
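The choice k = 2n² can be sanity-checked numerically; here n = 50 is an arbitrary sample value:

```python
import math

n = 50
k = 2 * n * n                                # number of independent runs of A
p_fail = (1 - 1 / n**2) ** k                 # chance all k runs say "no" on a "yes" input
print(p_fail, 1 / math.e**2)                 # roughly 0.135 either way, below 1/4
```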

• Question 4 (10): (True or False with Justification) If P ≠ NP, then the general optimization problem TRAVELING-SALESPERSON has a poly-time approximation algorithm with approximation factor 1.5.

FALSE. It is true that the metric TSP has such an approximation. But for any constant k, we can prove that if a poly-time k-approximation to general TSP exists, then P = NP. Fix k. We can reduce the NP-complete HAMILTON-CIRCUIT problem to general TSP by a function that maps an undirected graph G on n vertices to a weighted complete graph H with the same vertex set -- we map edges of G to edges of weight 1 in H, and non-edges of G to edges of weight (k+1)n in H. Then a Hamilton circuit of G maps to a Hamilton circuit in H of weight n, and any Hamilton circuit of H that does not come from a Hamilton circuit of G uses at least one non-edge and so has weight at least (n - 1) + (k+1)n, which is greater than kn. Thus a k-approximation to the general TSP problem on H would decide whether G has a Hamilton circuit and thus decide HAMILTON-CIRCUIT. If this can be done in polynomial time, then P = NP.
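A sketch of the reduction's weight assignment (graph representation and names are ours, for illustration):

```python
def hamilton_to_tsp(n, edges, k):
    """Build the weights of the complete graph H: edges of G get
    weight 1, non-edges get the prohibitive weight (k+1)*n."""
    E = {frozenset(e) for e in edges}
    return {(u, v): 1 if frozenset((u, v)) in E else (k + 1) * n
            for u in range(n) for v in range(u + 1, n)}

# A 4-cycle has a Hamilton circuit, so the optimal tour of H has weight n = 4;
# any tour using a non-edge of G costs at least (n-1) + (k+1)n = 15 > kn = 8.
w = hamilton_to_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)], k=2)
print(w[(0, 1)], w[(0, 2)])   # -> 1 12
```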

• Question 5 (15): We argued in lecture that the simplex algorithm takes only polynomial time to move from one vertex to a neighboring vertex with a higher value of the objective function. But we did not consider how many neighboring vertices there might be for any given vertex. If there were very many, there could be a problem as the algorithm might have to look at them all. State and justify a polynomial upper bound on the number of neighbors a vertex might have.

Suppose we have a linear program in standard form, defined by the constraints Ax = b and x ≥ 0, where A is an m by n matrix and the rows of A are linearly independent. We think of the feasible region as a polytope in (n-m)-dimensional space, where each vertex is the intersection of n-m of the constraint planes. If we are at a vertex, n-m of the n entries of x are zero. We can move to a neighboring vertex by choosing one of these n-m zero entries and increasing it, while adjusting the other variables to maintain Ax = b, until some other entry becomes zero. There are at most n-m ways to do this, and thus a vertex has at most n-m neighbors on the polytope.

• Question 6 (25): In this problem we consider the special case of BIN-PACKING where the bin size is 1 and each item has weight greater than 1/3. Remember that the BIN-PACKING problem is to input a set of items with weights and determine a packing of them into the minimum possible number of bins.

• (a,5) Give a 3-approximation algorithm for this special case, and argue that the bound holds for your algorithm.

Put every item in a separate bin. So if there are n items, the algorithm's score is n. The total size of the items is more than n/3, so the optimal algorithm must use at least n/3 bins. (In fact since no more than two of these items will fit in a bin, the optimal algorithm must use at least n/2 bins and this algorithm is a 2-approximation as well as a 3-approximation.)

• (b,10) Show how you can reduce this special case to the MAXIMUM-MATCHING problem, where the input is an undirected graph and the output is a matching with a maximum possible number of edges. Indicate how you will deal with items in the BIN-PACKING problem that have size ≥ 2/3.

An item of size ≥ 2/3 must go in its own bin, so put each of these items in its own bin and consider the n other items. Make a graph G where each vertex represents an item and there is an edge between any two items whose total size is at most 1. A matching of e edges in this graph represents a valid bin packing where the two items represented by the endpoints of each edge go in a single bin, and the other n - 2e items each go in a single bin. This packing uses n - e bins (plus those for the big items already dealt with), and any valid packing must correspond to a matching in this graph. So the MAXIMUM-MATCHING solution, a matching with the largest possible value of e, gives us a bin packing with the smallest possible value of n - e and thus the smallest possible number of bins.
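The bins-equals-n-minus-e correspondence can be illustrated with a brute-force maximum matching (exponential time, illustration only; names are ours -- a real solver would use a polynomial-time matching algorithm):

```python
from itertools import combinations

def min_bins_via_matching(sizes):
    """Exact bin count for items with 1/3 < size <= 2/3: find the
    largest set of vertex-disjoint compatible pairs, return n - e."""
    n = len(sizes)
    pairs = [(i, j) for i, j in combinations(range(n), 2)
             if sizes[i] + sizes[j] <= 1]          # edges of the graph G
    for e in range(len(pairs), 0, -1):             # try matchings, largest first
        for sub in combinations(pairs, e):
            used = [v for p in sub for v in p]
            if len(used) == len(set(used)):        # vertex-disjoint => matching
                return n - e                       # e paired bins + singles
    return n                                       # no compatible pair at all

print(min_bins_via_matching([0.4, 0.5, 0.6, 0.55]))   # -> 3
```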

• (c,10) Give an O(n log n) time algorithm that solves this special case exactly.

Sort the items from largest to smallest. Repeatedly look at the largest and smallest remaining items: if they fit together in a bin, put them there; otherwise put the largest item in a bin by itself. Continue until all items are placed in bins. This takes O(n log n) time to sort plus O(n) time to place all the items.

We claim that this packing is optimal. We only put an item in a bin by itself when there is no matching item available, and when we check a largest item L with a smallest item S, we have already matched all items smaller than S with other items. So if we put L in a bin by itself, there is no way to match L except with an item that has already been matched with an equal or larger item.
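A sketch of this largest-with-smallest scheme, with illustrative names (two indices walk inward from the ends of the sorted list):

```python
def pack_bins(sizes):
    """Exact packing when every item has size in (1/3, 1]: pair the
    current largest with the current smallest whenever they fit."""
    items = sorted(sizes, reverse=True)         # O(n log n)
    hi, lo = 0, len(items) - 1                  # largest / smallest remaining
    bins = []
    while hi <= lo:
        if hi < lo and items[hi] + items[lo] <= 1:
            bins.append([items[hi], items[lo]]) # largest + smallest share a bin
            lo -= 1
        else:
            bins.append([items[hi]])            # largest goes alone
        hi += 1
    return bins

print(len(pack_bins([0.4, 0.5, 0.6, 0.55])))    # -> 3
```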

• Question 7 (20): A set X of vertices in an undirected graph G is called a dominating set if every vertex of G is either in X or at distance one from a vertex in X. The DOMINATING-SET problem is the decision problem for the language {(G,k): G is an undirected graph that has a dominating set of size k}. Prove that DOMINATING-SET is NP-complete. (Hint: Reduce from VERTEX-COVER.)

We reduce VERTEX-COVER to DOMINATING-SET. Given an undirected graph G and a number k, we build a graph H so that G has a vertex cover of size at most k iff H has a dominating set of size at most k. We will do this by showing that G has a vertex cover of size exactly k iff H has a dominating set of size exactly k, so that the range of sizes of the two types of sets is exactly the same. (Then the reduction function will map (G,k) to (H,k).)

H will have a vertex for each vertex of G, except for any isolated vertices (vertices of degree 0), and a new vertex ve for each edge e of G. Each edge e = (u,v) of G is replaced by a triangle of edges in H: (u,v), (u,ve), and (v,ve).

We must prove that G has a vertex cover of size k iff H has a dominating set of size k. If X is a vertex cover of G, then X is also a dominating set of H because every edge of G has a vertex in X incident to it, and so every triangle of vertices in H has at least one member of X in it, and so every vertex of H is either in X or adjacent to a vertex in X. So if G has a vertex cover of size k, then H has a dominating set of size k. Note that this argument assumes that every vertex of H is in one of the triangles, which is why we have to avoid replicating any isolated nodes in H.

It is a little more complicated to show that if H has a dominating set of size k, then G has a vertex cover of size k. Let Y be a dominating set of k nodes of H. Note that if any vertex in Y is an edge-vertex ve rather than a vertex of G, we can replace it by one of the G-vertices for the edge e's endpoints and it will still be a dominating set. (This is because ve only dominated three vertices, those in its triangle, and either of the other two vertices in this triangle also dominate these three vertices.) Once we have replaced all edge-vertices in Y by G-vertices, the new Y forms a vertex cover of G, because every edge of G must have at least one of its endpoints in Y for Y to be a dominating set of H.
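A sketch of the construction, checked by brute force on small graphs (representations and names are ours; the exponential dominating-set check is illustrative only):

```python
from itertools import combinations

def build_h(n, edges):
    """Construct H: keep the non-isolated vertices of G, and add a
    triangle vertex ('e', u, v) for each edge (u, v) of G."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1; deg[v] += 1
    verts = [v for v in range(n) if deg[v] > 0]
    h_edges = set()
    for u, v in edges:
        t = ('e', u, v)                          # the new vertex v_e
        verts.append(t)
        h_edges |= {frozenset((u, v)), frozenset((u, t)), frozenset((v, t))}
    return verts, h_edges

def has_dominating_set(verts, edges, k):
    """Brute force: does some k-subset dominate every vertex?"""
    adj = {v: {v} for v in verts}                # each vertex dominates itself
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v); adj[v].add(u)
    return any(set().union(*(adj[v] for v in s)) >= set(verts)
               for s in combinations(verts, k))

# Path 0-1-2: vertex cover {1} of size 1, and H has a dominating set of size 1.
verts, h_edges = build_h(3, [(0, 1), (1, 2)])
print(has_dominating_set(verts, h_edges, 1))     # -> True
```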