Practice Exam Solutions for First Midterm

Directions:

• Answer the problems on the exam pages.
• There are six problems for 100 total points. Probable scale is A=93, C=69.
• No books, notes, calculators, or collaboration.
• The actual exam will have a time limit of 120 minutes, though it is not intended that you will need all the time.
• Questions 1 and 2 are "true/false with explanation" -- you get five points for a correct boolean answer, and up to five additional points for a convincing justification.

Questions are in black, answers in blue.

```
Q1: 10 points
Q2: 10 points
Q3: 15 points
Q4: 15 points
Q5: 20 points
Q6: 30 points
Total: 100 points
```

• Question 1 (10): True or false with justification: Suppose that in an instance of the original Stable Marriage problem with n couples (so that every man ranks every woman and vice versa), there is a man M who is last on each woman's list and a woman W who is last on every man's list. If the Gale-Shapley algorithm is run on this instance, then M and W will be paired with each other.

TRUE. This follows from Question 2 being true, because we know that the GS algorithm always gives a stable matching. To prove it directly, note that no man will propose to W until he has been rejected by all n-1 other women. Let X be the first man to propose to W. At that moment, all the other n-1 women must be engaged, each to a man she prefers to X, and these fiancés are n-1 distinct men, none of whom is X -- that is, every man except X. If X were not M, then M would be engaged to one of these women, and she would prefer M to X, which is impossible since M is last on her list. So X is M. M proposes to W, she accepts (no one has proposed to her before), and then everyone is engaged and the algorithm stops.
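The proposal rounds can be sketched with a standard men-propose Gale-Shapley implementation. The three-couple instance below is hypothetical, chosen so that man "M" is last on every woman's list and woman "W" is last on every man's list:

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Men propose in preference order; each woman holds her best offer so far.
    Returns a dict mapping each man to his wife."""
    free = deque(men_prefs)                     # men not yet engaged
    next_choice = {m: 0 for m in men_prefs}     # index of next woman to try
    # rank[w][m] = position of m on w's list (lower = more preferred)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    fiance = {}                                 # woman -> current partner
    while free:
        m = free[0]
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in fiance:                     # w is free: she accepts
            fiance[w] = m
            free.popleft()
        elif rank[w][m] < rank[w][fiance[w]]:   # w trades up to m
            free.popleft()
            free.append(fiance[w])
            fiance[w] = m
        # otherwise w rejects m, who tries his next choice
    return {m: w for w, m in fiance.items()}

# Hypothetical instance with 3 couples: "M" last on every woman's list,
# "W" last on every man's list; the other rankings are arbitrary.
men = {"A": ["x", "y", "W"], "B": ["y", "x", "W"], "M": ["x", "y", "W"]}
women = {"x": ["A", "B", "M"], "y": ["B", "A", "M"], "W": ["A", "B", "M"]}
```

Running `gale_shapley(men, women)` pairs "M" with "W", as the argument predicts: M proposes to W only after x and y have rejected him.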

• Question 2 (10): True or false with justification: Suppose we have an instance of the original Stable Marriage problem with M and W as above. In any stable solution to the instance, M and W will be paired with each other.

TRUE. Suppose instead that M is married to some W' ≠ W and W is married to some M' ≠ M. Since W is last on every man's list, M' prefers W' to his assigned spouse W; since M is last on every woman's list, W' prefers M' to her assigned spouse M. So (M', W') is an instability, and the matching is not stable. Hence in any stable matching, M and W are married to each other.

• Question 3 (15): Suppose we are to receive a set of n jobs, each with a deadline, and we need to schedule them on a single machine. Whenever a job finishes on the machine, we must begin the available job with the earliest deadline. But we may receive the jobs in any order and at any time, and we may not begin a job before we receive it. Indicate an algorithm to schedule the jobs (providing the correct available job whenever a new job is needed) using O(n log n) total computation time for the scheduling. Argue carefully that your algorithm meets this time bound.

When we receive a job, we place it in a priority queue keyed on its deadline. When the machine becomes available, we extract the job with the earliest deadline from the queue. We thus have n inserts and n extract-min operations, and the size of the queue never exceeds n. We know that with a heap, we can implement a priority queue so that the insert and extract-min operations each take O(log n) time. Thus we have O(n) operations taking O(log n) time each, for O(n log n) total time. There may also be O(1) overhead to handle each job, adding another O(n) term, which is absorbed when added to O(n log n).
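A minimal sketch of this scheduler using Python's `heapq` (the class and job names are hypothetical):

```python
import heapq

class Scheduler:
    """Earliest-deadline-first dispatcher backed by a binary heap."""
    def __init__(self):
        self._heap = []                          # (deadline, job) pairs

    def receive(self, job, deadline):
        # O(log n) insert, keyed on deadline
        heapq.heappush(self._heap, (deadline, job))

    def next_job(self):
        # O(log n) extract-min; None if no job is available
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[1]

s = Scheduler()
s.receive("report", 5)
s.receive("backup", 2)
s.receive("email", 9)
```

Calling `s.next_job()` repeatedly now returns "backup", "report", "email" in deadline order, regardless of arrival order.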

• Question 4 (15): Suppose that we have n jobs to do, and that for each number i, the time needed to do job i is Θ(i²). Prove that the time needed to do all n jobs (assuming that there is no idle time) is Θ(n³).

(Recall the definition of the Θ symbol. If f and g are functions, "f = Θ(g)" means that both f = O(g) and f = Ω(g) are true. Thus "T(i) = Θ(i²)" means that there are two constants c and d, with c > d > 0, such that for sufficiently large i (i greater than some i0), T(i) satisfies di² ≤ T(i) ≤ ci². When you show that the total time is Θ(n³), you may assume that n is sufficiently large relative to i0.)

We first show that the total time is O(n³). Since we have n jobs, and each is O(i²) and thus O(n²), the total time is O(n³) by the multiplication rule for big-O.

To see that the total time is Ω(n³), note that each of the last n/2 jobs takes time at least d(n/2)², which is Ω(n²). Since there are n/2 = Ω(n) such jobs, the total time is Ω(n³) by the multiplication rule for big-Ω.
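Both bounds agree with the exact closed form Σ i² = n(n+1)(2n+1)/6 for i from 1 to n, a cubic in n. A quick check, assuming for concreteness that job i takes exactly i² steps:

```python
def sum_of_squares(n):
    """Total work when job i takes exactly i**2 steps, i = 1..n."""
    return sum(i * i for i in range(1, n + 1))

def closed_form(n):
    # n(n+1)(2n+1)/6 is a degree-3 polynomial, hence Theta(n**3)
    return n * (n + 1) * (2 * n + 1) // 6
```

For example, `sum_of_squares(10)` and `closed_form(10)` both give 385.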

• Question 5 (20): The graph Kn is defined to be an undirected graph with n vertices and all possible edges. That is, the vertices are named {1,...,n} and for any numbers i and j with i ≠ j, there is an edge between vertex i and vertex j. (Remember that an undirected graph may not have loops or parallel edges.)

Carefully describe the result of a breadth-first search and a depth-first search of Kn. For each search, describe the tree and indicate where there are non-tree edges. (There are choices made during the course of each search, but it turns out that these choices do not affect the shape of either tree.)

It may help to draw the result of BFS and DFS on the graph K4 or K5 before attempting to describe the general case.

For the BFS, the algorithm looks at all the neighbors of the root node before going to the second level. Since every node in the graph is a neighbor of every other node, every node other than the root is put at level 1. Thus the tree has one node at level 0 and n-1 nodes at level 1. The non-tree edges are exactly those edges between different nodes on level 1 -- there are (n-1 choose 2) of these.

For the DFS, the search of the root node (call it 1) will find another node 2, and then the recursive call on node 2 will find node 3, and so forth. None of the recursive calls can terminate while any undiscovered nodes remain. So the tree edges form a single path of n-1 edges, with one node on each level and thus a single leaf. There are back edges, (n-1 choose 2) of them, as each node has a back edge to each of its ancestors except its parent.
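Both claims can be checked with a short sketch, assuming an adjacency-list representation of Kn (helper names hypothetical): BFS from any root yields a tree of depth 1, while DFS yields a path of depth n-1.

```python
from collections import deque

def complete_graph(n):
    """Adjacency lists for K_n on vertices 1..n."""
    return {v: [u for u in range(1, n + 1) if u != v]
            for v in range(1, n + 1)}

def bfs_depth(adj, root):
    """Depth of the BFS tree from root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in depth:
                depth[u] = depth[v] + 1
                queue.append(u)
    return max(depth.values())

def dfs_depth(adj, root):
    """Depth of the DFS tree from root."""
    depth = {root: 0}
    def visit(v):
        for u in adj[v]:
            if u not in depth:
                depth[u] = depth[v] + 1
                visit(u)
    visit(root)
    return max(depth.values())
```

On K6, for instance, `bfs_depth` returns 1 (all five non-root vertices at level 1) and `dfs_depth` returns 5 (a single path).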

• Question 6 (30):

• (a,10) Suppose I am given a labeled undirected graph, in the form of an adjacency list. This graph has a vertex for each of the n towns in Massachusetts, and there is an edge from vertex A to vertex B if you can drive from town A to town B directly, without passing through another town. Each edge is labeled by its length in miles.

We need to place a sign in each town indicating the distance from it to Boston by the shortest route along the edges. Describe an algorithm to find this distance for each town, and give a big-O bound on this algorithm's running time in terms of n.

We carry out the Dijkstra algorithm for the single-source shortest path problem. (You are not required to know the name of this algorithm if you describe it in enough detail to show that you understand it.) We maintain a set of nodes S for which we know the shortest distance. Initially S = {Boston} and the distance from Boston to Boston is 0. We have a priority queue where each element is a town with the best distance found so far (the length of the shortest path that stays in S until its last edge). We repeatedly take the front element F from the queue, add town F to S (noting its distance to Boston for the sign), and then process the edges out of F. Processing an edge (F,G) means looking at the path formed by appending that edge to the path from Boston to F and computing the total distance. If we have no entry yet in the priority queue for G, we make one with the new distance. If there is such an entry, we compare the distance on it with the new distance and do a change-key operation if the new distance is smaller.

We need to carry out n extract-min operations on the queue, one for each element, and up to m edge processings, where m is the number of edges. Each of these operations takes O(log n) time, so our total time is O((n+m) log n). Since the question asks for a bound in terms of n alone, note that m = O(n²), giving O(n² log n). Note that we need to keep a pointer into the priority queue for each town, so that we can retrieve the queue element to do the change-key if needed. This pointer can also tell us whether the endpoint of the new edge is already in S, or not yet in the queue.
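A sketch of the algorithm using Python's `heapq`. As a simplification it uses lazy deletion (pushing duplicate queue entries and skipping stale ones on extraction) in place of the change-key operation described above; this keeps the same asymptotic bound. The road map and mileages are hypothetical:

```python
import heapq

def sign_distances(adj, source):
    """Dijkstra's algorithm from `source`.
    adj maps each town to a list of (neighbor, miles) pairs.
    Returns a dict giving each town's sign distance to the source."""
    dist = {}                       # towns in S, with final distances
    heap = [(0, source)]
    while heap:
        d, town = heapq.heappop(heap)
        if town in dist:            # stale entry; skip (lazy change-key)
            continue
        dist[town] = d              # add town to S; d goes on its sign
        for nbr, miles in adj[town]:
            if nbr not in dist:
                heapq.heappush(heap, (d + miles, nbr))
    return dist

# Hypothetical road map (mileages invented for illustration).
roads = {
    "Boston":      [("Worcester", 45), ("Lowell", 30)],
    "Worcester":   [("Boston", 45), ("Springfield", 50), ("Lowell", 40)],
    "Lowell":      [("Boston", 30), ("Worcester", 40)],
    "Springfield": [("Worcester", 50)],
}
```

Here `sign_distances(roads, "Boston")` gives Springfield a sign distance of 95 via Worcester, since 45 + 50 beats any other route.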

• (b,20) Suppose I have the results of part (a) in a table, and each edge of my graph contains a pointer into this table. Thus, given town A, for every town B such that there is an edge between A and B, I can find the distance from A to B and the distance from B to Boston, each in O(1) time. Describe a greedy algorithm to determine a route from A to Boston that is optimal -- it is no longer than any other route in the graph. (To describe the route means to list all the towns to be visited, in the correct order.)

Prove that your algorithm's route is optimal (that is, that the route to Boston it finds is no longer than the route found by any other algorithm). You should probably use mathematical induction.

Suppose we are finding the route to Boston from A. We look at all the edges (A,B), and for each we compute f(B), which is the sum of the distance s(B) from B to Boston (on B's sign) and the length of the edge from A to B. We choose the B that minimizes this sum, let (A,B) be the first edge on our route, recursively find the best route from B to Boston, and append it to (A,B).

We prove that this algorithm finds an optimal route using the "greedy algorithm stays ahead" method. Consider any route from town A to Boston. For each town B on that route, we compute the distance traveled from A to B plus the distance s(B) from B to Boston given on B's sign. This quantity starts out as 0 plus s(A). It can never be smaller than s(A), because it is the length of an actual path from A to Boston, and s(A) is the length of the shortest such path. The greedy algorithm, however, keeps this quantity equal to s(A) all the way from A to Boston. This is because at every town B, the distance on the sign was set during Dijkstra's algorithm to be the length of some edge (B,C) plus the distance s(C) from C to Boston. So when the greedy algorithm examines the edges out of B, it will see that edge (B,C) achieves the sum s(B). When it takes the minimum, then, it chooses either this edge or another with the same property. By induction on the towns visited, the greedy route reaches Boston with total length exactly s(A), which is optimal.
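Assuming the sign table from part (a) is available, the greedy descent can be sketched as follows (written iteratively rather than recursively; the road map, sign table, and function name are hypothetical, with the sign distances consistent with the mileages shown):

```python
def route_to_boston(adj, signs, start):
    """adj maps town -> list of (neighbor, miles);
    signs maps town -> its sign distance to Boston (from part (a)).
    Greedily step to the neighbor B minimizing edge length + signs[B]."""
    route = [start]
    town = start
    while town != "Boston":
        # greedy choice: neighbor B minimizing len(town,B) + s(B)
        town = min(adj[town], key=lambda edge: edge[1] + signs[edge[0]])[0]
        route.append(town)
    return route

# Hypothetical road map and its (consistent) sign table.
roads = {
    "Boston":      [("Worcester", 45), ("Lowell", 30)],
    "Worcester":   [("Boston", 45), ("Springfield", 50), ("Lowell", 40)],
    "Lowell":      [("Boston", 30), ("Worcester", 40)],
    "Springfield": [("Worcester", 50)],
}
signs = {"Boston": 0, "Lowell": 30, "Worcester": 45, "Springfield": 95}
```

From Springfield, for example, the only neighbor is Worcester (50 + 45 = 95 = s(Springfield)), and from Worcester the edge to Boston gives 45 + 0, beating Lowell's 40 + 30, so the route is Springfield, Worcester, Boston.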